October 12, 2010 Features

Obtaining Objective Data in Clinical Settings

Basic Techniques for the Clinician

Identifying sensitive, objective indicators of a patient's function and change in function is important when providing speech-language treatment. Determining the subsystem(s) contributing to the patient's speech impairments provides important information to direct intervention. Often, a perceptual characteristic (such as reduced loudness) can result from problems in more than one subsystem. Ideally, outcome measures will be sensitive to early indicators of change (i.e., leading indicators) so that clinical decisions can be shaped continuously. Objective data regarding changes in patient function provide evidence about intervention effectiveness and should be collected at the start and end of treatment and at regular intervals throughout.

Third-party payers are increasingly requesting outcome measures to demonstrate the value of the services provided (for more information, search "value-based purchasing" at ASHA's website). Soon, it will be customary to provide third-party payers with data obtained from outcome measures. It will not be sufficient to indicate simply that the goals were met—for reimbursement, you'll need to show data. A large number of measures can be obtained, so the important question is what type of data will be most useful. This discussion focuses on objective outcome measures for persons with voice and/or speech impairments.

Voice Recordings 

Choosing a good microphone is crucial to producing reliable and valid measurements. There are two basic types—condenser and dynamic. Condenser microphones are very sensitive, but expensive and fragile. Dynamic microphones are sturdy, less expensive, and sensitive enough to obtain good recordings in a clinical setting. Choose a microphone that has a flat frequency response, with little variation in response across the speech frequencies from 500 Hz to 5,000 Hz.

Microphones can be unidirectional (sensitive to sounds coming toward the microphone) or omnidirectional (sensitive to sounds coming from all directions). Unidirectional microphones respond differently depending on the distance between the microphone and the sound source (the patient's mouth). Omnidirectional microphones are more susceptible to room noise, but respond equally well at all distances from the patient.

The microphone can be plugged directly into a computer and recordings can be made with acoustic software, such as Praat (Boersma & Weenink, 2010) and TF32 (Milenkovic, 2003), which are available as free downloads. Choose a quiet setting for recording with an ambient room noise level of less than 50 dB.

Measuring Sound Pressure Level 

Speech sound pressure level (SPL) is important because it is controlled by and reflects the function of the respiratory, laryngeal, and supralaryngeal subsystems. Measuring SPL requires a microphone and sound level meter that is set on "C-weighting." The microphone gain should be set to respond to the expected decibel level of the patient's voice, and the gain level must be consistent across testing sessions. An easy way to obtain SPL measurements is to jot down readings from the sound level meter as the patient speaks, taking samples at various points in time within utterances (beginning, middle, and end).

During recording sessions, the patient should be asked to talk at a comfortable volume and pitch, and the distance from the microphone to the patient (mouth-to-microphone distance) must be the same across all testing sessions. A microphone that is closer to the patient's mouth will produce a better recording because less noise will be recorded. When the microphone is very close to the patient's mouth, it is important to maintain that exact distance because a small change in distance will have a big effect on SPL. A head-mounted microphone or a small camera stand can be used to maintain mouth-to-microphone distance. It is best to use the same distance as your normative sample. Normative data from connected speech are available in the literature (see Table 1 [PDF]; Huber, 2007, 2008; Sadagopan & Huber, 2007).
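The sensitivity of SPL to small changes at close mouth-to-microphone distances can be illustrated with a short calculation. The sketch below assumes idealized free-field conditions, in which level falls by about 6 dB for each doubling of distance; a real clinic room will not behave this cleanly, and the function name and example distances are illustrative only.

```python
import math


def spl_change_db(reference_distance_cm, new_distance_cm):
    """Approximate SPL change (dB) when the microphone moves.

    Assumes idealized free-field (inverse-square) behavior, which a
    reverberant clinic room only approximates.
    """
    return 20.0 * math.log10(reference_distance_cm / new_distance_cm)


# The same 2 cm slip at a close placement and at a distant one
# (hypothetical distances chosen for illustration).
close_shift = spl_change_db(4.0, 6.0)
far_shift = spl_change_db(30.0, 32.0)

print(f"4 cm -> 6 cm:   {close_shift:.2f} dB")
print(f"30 cm -> 32 cm: {far_shift:.2f} dB")
```

Under these assumptions, the 2 cm slip costs several decibels at the close placement but well under 1 dB at the distant one, which is why a head-mounted microphone or fixed stand matters most when recording close to the mouth.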

Measures of SPL are important for clinicians to quantify the perception of soft and loud speech. For example, patients with Parkinson's disease often are characterized as having voices that sound weak or quiet. Alternatively, some patients with vocal nodules or patients who are prone to vocal abuse have voices that can be characterized as loud (as well as hoarse or breathy). In both cases, SPL measures quantify the clinician's perceptions and provide evidence of treatment-related behavioral change.

Assessing the Respiratory System 

The respiratory system is often a target for treatment in patients who demonstrate low or high overall loudness, inconsistent loudness, difficulty altering volume, reduced stress patterning, short utterances, and unnatural pause locations. Respiratory patterns are often overlooked in clinical exams but can be easily assessed without equipment.

As the patient speaks, watch for sudden inspirations or expirations, exaggerated respiratory movement, excessive shoulder movement, and insufficient air. Note whether the patient inhales before speaking (preparatory inhalation). Some patients do not breathe before speaking, resulting in more effort and much lower lung volumes for speech. Scales for non-instrumental assessment of the diaphragm, rib cage, and abdomen using speech and non-speech tasks are available in the literature (Hixon & Hoit, 1998, 1999, 2000).

Respiratory phrasing is important as an indicator of breath support for speech. Phrasing can be assessed easily by having the patient read a passage. As the patient reads, watch for chest wall movements associated with breathing and mark those breaths on the passage. The length of each breath group (words spoken on one breath) can be measured by counting the number of words or syllables produced on each breath. Reduced breath group length can be an indicator of respiratory muscle weakness. If the passage is recorded, the duration of each breath group can be measured using recording software; the rate of speech can be determined by dividing the number of syllables by the duration.

Some individuals with Parkinson's disease are described as using a fast rate of speech. Speech rate measures can provide objective data indicating whether the rate is actually increased or whether this perception is related to another issue, such as an articulatory impairment. Measures of speech rate also provide objective data about the effectiveness of rate treatment. Normative data for breath group length and speech rate from connected speech are available in the literature (see Table 1 [PDF]; Hoit & Hixon, 1987; Hoit, Hixon, Altman, & Morgan, 1989; Huber, 2007, 2008; Sadagopan & Huber, 2007).
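The breath group and speech rate calculations are simple enough to do by hand or in a spreadsheet. As a minimal sketch, assuming each marked breath group has been reduced to a syllable count and a duration in seconds, they could be tallied as follows (the function name and sample values are hypothetical, not normative data):

```python
def breath_group_summary(groups):
    """Summarize breath groups from a recorded reading passage.

    groups: list of (syllable_count, duration_seconds), one entry per
    breath group marked on the passage.
    """
    if not groups:
        raise ValueError("at least one breath group is required")
    total_syllables = sum(syllables for syllables, _ in groups)
    total_duration = sum(duration for _, duration in groups)
    return {
        "mean_syllables_per_breath": total_syllables / len(groups),
        "speech_rate_syll_per_sec": total_syllables / total_duration,
    }


# Hypothetical measurements for three breath groups.
groups = [(12, 3.0), (9, 2.5), (15, 3.5)]
summary = breath_group_summary(groups)
print(summary)
```

Because only the breath group durations are summed, pauses for inhalation are excluded from the rate; a clinician who wants pauses included would divide by the total passage duration instead.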

It is also important to note the syntactic appropriateness of breath locations. Breaths at non-syntactic locations may reduce speech intelligibility and naturalness (Shah, Baum, & Dwivedi, 2006). Healthy adults tend to breathe at major and secondary syntactic boundaries (see Table 1 [PDF]). Individuals with dysarthria (for example, due to Parkinson's disease or traumatic brain injury) tend to breathe at more minor syntactic boundaries and at locations unrelated to syntax (Hammen & Yorkston, 1994; Huber et al., 2008). Mark the locations of breaths during a reading passage and analyze their syntactic location.

Assessing the Laryngeal System 

Well-planned stimuli and acoustic measures provide a valid and noninvasive measure of the laryngeal system. Control of loudness, pitch, and vocal quality is a major issue for people with laryngeal disorders. Perceptual judgments are important indicators of change, but acoustic measures provide objective substantiation of the perceptual characteristics. Measures of SPL, fundamental frequency (F0), fundamental frequency variation (F0SD), and noise in the voice source (signal-to-noise ratio or SNR) are useful for corroborating perceptual measures and describing laryngeal function.

One primary speech task for laryngeal system assessment is a sustained vowel, which eliminates the articulatory component from the measurements and focuses on the laryngeal system. Record the vowel production, instructing the client to produce the vowel at a comfortable pitch and loudness for about six seconds. Measures of SPL, F0, F0SD, and SNR can be made from the middle two seconds of vowel production using computer software. F0 reflects the pitch of the voice and should be age- and sex-appropriate (Huber, Stathopoulos, Curione, Ash, & Johnson, 1999). F0SD can reflect tremor, irregular vocal fold vibration, or additive noise in the voice signal (Linville & Fisher, 1985). SNR reflects the amount of additive noise in the voice source, which relates to perceptual changes in voice quality (Ferrand, 2002). Because SNR values depend on the analysis program, compare them with the normative references provided by the program used to make the measurement.
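Acoustic software typically reports F0 and F0SD directly, but the underlying arithmetic is straightforward. The sketch below assumes the F0 contour of the sustained vowel has already been exported as a list of time and frequency pairs; it keeps only the middle two seconds and reports the mean and standard deviation. The function name and the synthetic track are illustrative only.

```python
import statistics


def middle_window_f0(f0_track, window_s=2.0):
    """Mean F0 and F0SD over the middle of a sustained vowel.

    f0_track: list of (time_s, f0_hz) for voiced frames, in time order.
    window_s: length of the analysis window centered on the vowel.
    """
    start, end = f0_track[0][0], f0_track[-1][0]
    midpoint = (start + end) / 2.0
    lo, hi = midpoint - window_s / 2.0, midpoint + window_s / 2.0
    values = [f0 for t, f0 in f0_track if lo <= t <= hi]
    if len(values) < 2:
        raise ValueError("not enough F0 samples inside the window")
    return statistics.mean(values), statistics.stdev(values)


# Synthetic six-second track sampled every 0.5 s (not patient data):
# F0 alternates between 198 Hz and 202 Hz.
track = [(i * 0.5, 200 + (2 if i % 2 else -2)) for i in range(13)]
mean_f0, f0_sd = middle_window_f0(track)
print(f"F0 = {mean_f0:.1f} Hz, F0SD = {f0_sd:.2f} Hz")
```

Excluding the onset and offset of the vowel in this way keeps voicing start-up and decay from inflating the variability measure.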

These measures provide good indicators about intervention effectiveness. For example, an adult female with bilateral vocal nodules could have a low F0, high F0SD, and low SNR pre-treatment. Depending on the patient's voice patterns, SPL may be high (if she tends to abuse her voice) or low (if she has large nodules that impede vocal fold closure). After successful treatment, these values would be expected to approach normative values (F0 would increase, F0SD would decrease, and SNR would increase).

Assessing the Velopharyngeal System 

Inadequate functioning of the velopharyngeal (VP) mechanism (soft palate closing against posterior pharyngeal wall) can lead to hypernasality. Speech samples for both perceptual and objective measures should be gathered with a variety of stimuli (Peterson-Falzone, Trost-Cardamone, Karnell, & Hardin-Jones, 2006): single words as from the Iowa Pressure Articulation Test (Morris, Spriestersbach, & Darley, 1961), repeated or read sentences (Kummer & Lee, 1996; Sell, Harding, & Grunwell, 1999), paragraph reading such as the "Zoo Passage" (Fletcher, 1972), counting, and conversation.

Perceptual evaluation continues to be the "gold standard" for assessment of hypernasality. Traditionally, judgments of hypernasality have been made using an Equal Appearing Interval (EAI) scale (Morris, Shelton, & McWilliams, 1973). An example of that type of scale is a four-point EAI scale for hypernasality: 1-Normal, 2-Mild, 3-Moderate, 4-Severe. Whitehill and colleagues have argued that nasality judgments are best rated using continuous scales such as Direct Magnitude Estimation (DME) or a visual analog scale (VAS; Cannito, Buder, & Chorna, 2005; Whitehill, Cheng, & Jones, 2007; Whitehill, Lee, & Chun, 2002). Whitehill et al. (2007) note that VAS produces measures similar to the more complex DME scales but is much easier to implement and interpret. VAS uses a continuous line with endpoints of 0 (normal nasality) and 100 (severely hypernasal). The listener can mark any point along the line, and the nasality value is calculated from the mark's distance from 0 (Cannito et al., 1997). For example, if perceptual judgment suggested hypernasality (e.g., more than 20 on a VAS), then objective measures would be recommended.
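Scoring a paper VAS amounts to measuring the mark and scaling it to the 0 to 100 range. The following sketch assumes the line and the mark are both measured in millimeters; the function name is hypothetical, and the cut-off of 20 is simply the illustrative value used in the example above, not an established clinical threshold.

```python
def vas_score(mark_mm, line_length_mm=100.0):
    """Convert the position of a listener's mark to a 0-100 VAS score.

    mark_mm: distance of the mark from the 0 (normal nasality) endpoint.
    line_length_mm: printed length of the line (assumed to be measured).
    """
    if not 0 <= mark_mm <= line_length_mm:
        raise ValueError("mark must fall on the line")
    return 100.0 * mark_mm / line_length_mm


# A mark 97.5 mm along a line printed 150 mm long (hypothetical).
score = vas_score(97.5, 150.0)
# Illustrative cut-off taken from the example in the text.
objective_measures_recommended = score > 20
print(score, objective_measures_recommended)
```

Scaling by the measured line length guards against printers or photocopiers that do not reproduce the line at exactly 100 mm.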

Spectrograms are images of acoustic measures, including frequency, time, and amplitude characteristics of the waveform. Key characteristics of nasality in oral signals include (see Figure 1 [PDF]):

  • Addition of nasal resonance at approximately 250–300 Hz
  • Reduction in amplitude of formant frequency transitions leading into vowels from consonants, particularly for the first two resonances
  • Absence of burst release for stop consonants
  • Absence of high-frequency energy

Spectrographic measures can be useful in discriminating nasal from non-nasal productions and for measuring progress after physical management. A disadvantage of spectrographic measurements is that no single number represents the degree of nasality. However, comparison of a patient's spectrographic measures pre- and post-management can provide an objective indicator of intervention success.

Computer programs such as KayPENTAX nasometer can provide objective measures of nasality. The nasometer gathers signals from two microphones, one from the mouth and one from the nose, separated by a sound barrier. The calibrated microphone headset is placed on the face and talkers are cued to produce the "Zoo Passage" and other oral speech signals. The program calculates "nasalance," a ratio of the nasal to the oral + nasal microphone levels, multiplied by 100. Scores near 100% nasalance correspond to nasal sounds; scores near 0% correspond to non-nasal productions. The "normal" value for oral productions has been shown to vary with age, sex, and dialect (e.g., Hardin, Van Demark, Morris, & Payne, 1992). To determine "normal," users should gather their own set of typical speakers and calculate values for the speech samples to be used.
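The nasalance ratio described above can be written out in a few lines. This is only a sketch of the formula as stated in the text, not the Nasometer's own implementation; it assumes the two microphone levels are expressed in the same linear units (not decibels), and the function name and sample values are hypothetical.

```python
def nasalance_percent(nasal_level, oral_level):
    """Nasalance = nasal / (oral + nasal) * 100.

    Both levels are assumed to be linear acoustic levels in the same
    units, averaged over the speech sample.
    """
    total = nasal_level + oral_level
    if total <= 0:
        raise ValueError("combined level must be positive")
    return 100.0 * nasal_level / total


# Hypothetical levels for an oral passage and a nasal-loaded sentence.
oral_passage = nasalance_percent(nasal_level=30.0, oral_level=70.0)
nasal_sentence = nasalance_percent(nasal_level=60.0, oral_level=40.0)
print(oral_passage, nasal_sentence)
```

A speaker whose scores on oral and nasal stimuli are similar, rather than well separated as in this example, would raise concern about VP function.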

Nasoendoscopy and videofluoroscopy permit views of VP function during speech. Nasoendoscopy, performed with a small-diameter tube inserted into the nasal cavity above the VP port, shows the degree and pattern of VP closure during speech and lateral pharyngeal wall contributions to closure. Videofluoroscopy, a radiation technique, typically shows three views while the patient speaks: above the VP mechanism, below the VP mechanism, and from the side (Skolnick, 1973).

Assessing Articulation 

Forrest and Weismer (2009) provide an excellent summary of acoustic measures of segmental articulation. Vowel formant frequencies allow for inferences concerning the vocal tract shape underlying production of a vowel. The first (F1) and second (F2) formants are regarded as the most important for vowel identification. F1 is inversely related to tongue height and F2 is directly related to tongue advancement. Thus, a low-back vowel like /a/ is associated with a relatively high F1 and low F2, and a high-front vowel like /i/ is associated with a relatively low F1 and high F2. Formant frequency measures can be obtained throughout vowels. When measures are obtained at one point in time, such as vowel midpoint, formant values represent a snapshot of the underlying vocal tract shape. These measures are appropriate only for monophthongs and may be plotted in F1-F2 space. If formant frequency values for at least three vowels are available, vowel space area can be calculated (Turner, Tjaden, & Weismer, 1995). A vowel space area that is reduced in size suggests reduced vowel distinctiveness. Formant frequency values and vowel space area measures may be compared to published, normative data. Dynamic or time-varying formant frequency measures also may be obtained for monophthongs and diphthongs (Neel, 2008), but normative data are not widely available.
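Vowel space area is the area of the polygon formed by the vowels in F1-F2 space, which can be computed with the standard shoelace formula. The sketch below assumes midpoint formant values in hertz, listed in order around the perimeter of the vowel space; the formant values are invented for illustration and are not normative.

```python
def vowel_space_area(vowels):
    """Area (Hz^2) of the polygon whose corners are (F1, F2) points.

    vowels: list of (f1_hz, f2_hz) tuples listed in order around the
    perimeter of the vowel space (at least three vowels).
    """
    n = len(vowels)
    if n < 3:
        raise ValueError("at least three vowels are required")
    twice_area = 0.0
    for i in range(n):
        f1_a, f2_a = vowels[i]
        f1_b, f2_b = vowels[(i + 1) % n]  # wrap around to the first vowel
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical corner vowels: /i/, /ae/, /a/, /u/.
corner_vowels = [(300, 2300), (700, 1800), (750, 1100), (350, 900)]
area = vowel_space_area(corner_vowels)
print(f"Vowel space area: {area:.0f} Hz^2")
```

Computing the same area before and after treatment, with the same vowels and the same measurement point, gives a single number for tracking change in vowel distinctiveness.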

Spectral measures can provide information regarding articulatory configurations for stops and fricatives. Spectral moments analysis is one approach to characterizing the spectra of stop bursts and fricative noise and treats consonant spectra as statistical distributions. Although four moments (mean, variance, skewness, and kurtosis) can be calculated from energy distributions using computer software, articulatory correlates of the first and third moments are best understood. The mean (first moment) indexes spectral center of gravity. The third moment (skewness) is a measure of symmetry. More anterior consonant constrictions are associated with higher first and lower third moment coefficients. First moment difference measures for pairs of fricatives or stops may be calculated for use as an acoustic index of consonant distinctiveness (Tjaden & Wilding, 2004). Published normative data for spectral moments are available (Forrest, Weismer, Milenkovic, & Dougall, 1988; Jongman, Wayland, & Wong, 2000; Maniwa, Jongman, & Wade, 2009; Nittrouer, Studdert-Kennedy, & McGowan, 1989). Because absolute spectral values for vowels and consonants are influenced by myriad factors (e.g., sex, age, phonetic context, dialect, and speaking rate), care should be taken in selecting comparison data.
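To make the idea of treating a spectrum as a statistical distribution concrete, the sketch below normalizes a set of spectral magnitudes so they sum to 1 and computes the four moments, then takes a first moment difference for a fricative pair. It assumes the spectrum has already been exported as frequency and magnitude values; whether those magnitudes are amplitude or power, and the use of excess kurtosis (kurtosis minus 3), are conventions assumed here that vary across programs. The spectra are invented for illustration.

```python
import math


def spectral_moments(freqs_hz, magnitudes):
    """Return (mean, variance, skewness, excess kurtosis) of a spectrum.

    The magnitudes are normalized to sum to 1 and treated as the
    probabilities of a distribution over frequency.
    """
    total = sum(magnitudes)
    if total <= 0:
        raise ValueError("spectrum must contain energy")
    probs = [m / total for m in magnitudes]
    mean = sum(f * p for f, p in zip(freqs_hz, probs))
    variance = sum((f - mean) ** 2 * p for f, p in zip(freqs_hz, probs))
    sd = math.sqrt(variance)
    skewness = sum((f - mean) ** 3 * p for f, p in zip(freqs_hz, probs)) / sd ** 3
    kurtosis = sum((f - mean) ** 4 * p for f, p in zip(freqs_hz, probs)) / sd ** 4 - 3.0
    return mean, variance, skewness, kurtosis


# Hypothetical coarse spectra for /s/ and /sh/ (four frequency bins).
freqs = [2000, 4000, 6000, 8000]
s_moments = spectral_moments(freqs, [1, 2, 4, 3])
sh_moments = spectral_moments(freqs, [2, 5, 2, 1])

# First moment difference as an index of consonant distinctiveness.
first_moment_difference = s_moments[0] - sh_moments[0]
print(f"M1 /s/ = {s_moments[0]:.0f} Hz, M1 /sh/ = {sh_moments[0]:.0f} Hz")
print(f"First moment difference = {first_moment_difference:.0f} Hz")
```

In this made-up example the more anterior fricative has the higher first moment and the lower skewness, matching the pattern described in the text.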

Acoustic measures of segmental articulation may be used to document treatment-related changes in speech production. For example, clear or hyperarticulate speech is used in the treatment of dysarthria, with the aim of maximizing intelligibility. Based on studies of neurologically normal speech, successful implementation of hyperarticulate speech by an individual with dysarthria would be accompanied by an expanded vowel space area as well as enhanced spectral distinctiveness of consonants, as indexed by first moment difference measures.

Although it takes time to learn to use objective measures in clinical practice, they can be used quickly and easily to obtain evidence of treatment change. Objective measurements can improve treatment outcomes by providing data to guide the selection of treatment targets, monitor progress, and provide credible information about treatment effectiveness that can support third-party reimbursement. 

Jessica E. Huber, PhD, CCC-SLP, is an associate professor in the Department of Speech, Language, and Hearing Sciences at Purdue University. Her research focuses on assessing and treating the speech, balance, and cognitive difficulties associated with Parkinson's disease and determining underlying causes of the difficulties. Contact her at jhuber@purdue.edu.

Elaine Stathopoulos, PhD, CCC-SLP, is a professor in the Department of Communicative Disorders and Sciences at the University at Buffalo-State University of New York. Her research focuses on the function of the respiratory and laryngeal systems during speech, emphasizing voice changes across the life span. Contact her at stathop@buffalo.edu.

Joan Sussman, PhD, CCC-SLP, is associate professor and chair of the Department of Communicative Disorders and Sciences at the University at Buffalo-State University of New York. She studies the acoustic characteristics of speech and their perception by adults and children. Contact her at jsussman@buffalo.edu.

Kris Tjaden, PhD, CCC-SLP, is a professor in the Department of Communicative Disorders and Sciences at the University at Buffalo-State University of New York. Her research focuses on the acoustic and perceptual basis of dysarthria. Contact her at tjaden@buffalo.edu.

cite as: Huber, J. E., Stathopoulos, E., Sussman, J., & Tjaden, K. (2010, October 12). Obtaining Objective Data in Clinical Settings: Basic Techniques for the Clinician. The ASHA Leader.

Case Study: Objective Measures for Nasality and Voice

FB, a 58-year-old male, was seen at a university laboratory for an acoustic evaluation of nasality and voice. The client's medical history included a cleft of the soft palate as well as open heart surgery as a child, suggesting the client might have velocardiofacial syndrome (VCFS), a dominantly inherited disorder. FB was referred to a genetic counselor for further testing. Perceptual judgments of hypernasality and voice characteristics were made. Based on a visual analog scale (VAS), hypernasality was scored a 65 out of 100 for nonnasal sentences. Voice was characterized as breathy and hoarse, and the presence of pitch breaks was noted.

The velopharyngeal subsystem was evaluated using the Nasometer. The scores (see Table 1 [PDF]) reflected severe hypernasality during nonnasal speech production tasks. This assessment was based on the high scores in the nonnasal production tasks and the similarity of scores in the nasal and the nonnasal production tasks. The laryngeal subsystem was examined acoustically with the KayPENTAX Computerized Speech Lab using the Multi-Dimensional Voice Program. Although FB's fundamental frequency was appropriate, measures of jitter, shimmer, and harmonics-to-noise ratio were not within normal limits (see Table 2 [PDF]).

Overall, results suggested abnormal velopharyngeal and laryngeal function. Repair of the velopharyngeal (VP) inadequacy might resolve other vocal problems because FB appeared to use extra effort while speaking to generate sufficient oral pressure, resulting in laryngeal hyperfunction. An adequate VP mechanism might allow FB to reduce respiratory/laryngeal effort and allow his larynx to return to normal function. Physiological measures of the respiratory and phonatory mechanisms were not obtained but should be considered in future assessments, as should vocal hygiene treatment. FB was referred to a regional craniofacial center for physical management of the VP problem with either prosthetic or surgical methods and will be reevaluated one and six months after treatment to evaluate its effectiveness.

—Joan E. Sussman, Jessica E. Huber, Elaine T. Stathopoulos, Kris Tjaden


Boersma, P., & Weenink, D. (2010). Praat (Version 5.1.35). Amsterdam, The Netherlands: Phonetic Sciences, University of Amsterdam.

Cannito, M., Buder, E., & Chorna, L. (2005). Spectral amplitude measures of adductor spasmodic dysphonic speech. Journal of Voice, 19, 391–410.

Cannito, M., Burch, A. R., Watts, C., Rappold, P. W., Hood, S. B., & Sherrard, K. (1997). Disfluency in spasmodic dysphonia: A multivariate analysis. Journal of Speech, Language, and Hearing Research, 40, 627–641.

Ferrand, C. (2002). Harmonics-to-noise ratio: An index of vocal aging. Journal of Voice, 16, 480–487.

Fletcher, S. (1972). Contingencies for bioelectric modification of nasality. Journal of Speech and Hearing Disorders, 37, 329–346.

Forrest, K., & Weismer, G. (2009). Acoustic analysis of motor speech disorders. In M. R. McNeil (Ed.), Clinical Management of Sensorimotor Speech Disorders (pp. 46–63). New York, NY: Thieme.

Hammen, V. L., & Yorkston, K. M. (1994). Respiratory patterning and variability in dysarthric speech. Journal of Medical Speech-Language Pathology, 2(4), 253–261.

Hardin, M. A., Van Demark, D. R., Morris, H. L., & Payne, M. M. (1992). Correspondence between nasalance scores and listener judgments of hypernasality and hyponasality. Cleft Palate-Craniofacial Journal, 29(4), 346–351.

Hixon, T. J., & Hoit, J. D. (1998). Physical examination of the diaphragm by the speech-language pathologist. American Journal of Speech-Language Pathology, 7(4), 37–45.

Hixon, T. J., & Hoit, J. D. (1999). Physical examination of the abdominal wall by the speech-language pathologist. American Journal of Speech-Language Pathology, 8, 335–346.

Hixon, T. J., & Hoit, J. D. (2000). Physical examination of the rib cage wall by the speech-language pathologist. American Journal of Speech-Language Pathology, 9, 179–196.

Hoit, J. D., & Hixon, T. J. (1987). Age and speech breathing. Journal of Speech and Hearing Research, 30, 351–366.

Hoit, J. D., Hixon, T. J., Altman, M. E., & Morgan, W. J. (1989). Speech breathing in women. Journal of Speech and Hearing Research, 32, 353–365. 

Huber, J. E. (2007). Effects of cues to increase sound pressure level on respiratory kinematic patterns during connected speech. Journal of Speech, Language, and Hearing Research, 50, 621–634.

Huber, J. E. (2008). Effects of utterance length and vocal loudness on speech breathing in older adults. Respiratory Physiology and Neurobiology, 164, 323–330. 

Huber, J. E., Darling, M., & Francis, E. J. (2008). Influence of punctuation and syntax on breath patterns in reading. Biennial Conference on Motor Speech, Monterey, CA.

Huber, J. E., Stathopoulos, E. T., Curione, G. M., Ash, T. A., & Johnson, K. (1999). Formants of children, women, and men: The effects of vocal intensity variation. The Journal of the Acoustical Society of America, 106(3, Pt. 1), 1532–1542.

Kent, R., & Read, C. (2002). The Acoustic Analysis of Speech. Valley Stream, NY: Singular/Cengage Learning.

Kummer, A., & Lee, L. (1996). Evaluation and treatment of resonance disorders. Language, Speech, and Hearing Services in Schools, 27, 271–281.

Linville, S., & Fisher, H. (1985). Acoustic characteristics of perceived versus actual vocal age in controlled phonation by adult females. The Journal of the Acoustical Society of America, 78, 40–48.

Milenkovic, P. (2003). Time-Frequency Analysis (TF32). Madison, WI.

Morr, K. E., Warren, D. W., Dalston, R. M., & Smith, L. R. (1989). Screening of velopharyngeal inadequacy by differential pressure measurements. Cleft Palate Journal, 26, 42–45.

Morris, H., Shelton, R., & McWilliams, B. J. (1973). Assessment of Speech. In Speech, Language, and Psychosocial Aspects of Cleft Lip and Palate: The State of the Art (No. 9 ed.).

Morris, H., Spriestersbach, D., & Darley, F. (1961). An articulation test for assessing competency of velopharyngeal closure. Journal of Speech and Hearing Research, 4, 48–55.

Neel, A. T. (2008). Vowel space characteristics and vowel identification accuracy. Journal of Speech, Language, and Hearing Research, 51, 574–585.

Peterson-Falzone, S., Trost-Cardamone, J., Karnell, M., & Hardin-Jones, M. (2006). Treating Cleft Palate Speech. St. Louis, MO: Mosby.

Sadagopan, N., & Huber, J. E. (2007). Effects of loudness cues on respiration in individuals with Parkinson's disease. Movement Disorders, 22, 651–659. 

Sell, D., Harding, A., & Grunwell, P. (1999). GOS.SP.ASS'98: An assessment for speech disorders associated with cleft palate and/or velopharyngeal function (revised). International Journal of Language & Communication Disorders, 34, 17–33.

Shah, A., Baum, S., & Dwivedi, V. (2006). Neural substrates of linguistic prosody: Evidence from syntactic disambiguation in the productions of brain-damaged patients. Brain and Language, 96, 78–89.

Skolnick, M. L. (1973). The sphincteric pattern of velopharyngeal closure. Cleft Palate Journal, 10, 283–305.

Tjaden, K., & Wilding, D. (2004). Rate and loudness manipulations in dysarthria: Acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research, 47, 766–783.

Turner, G. S., Tjaden, K., & Weismer, G. (1995). The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 38, 1001–1013.

Warren, D. W., Duany, L. F., & Fischer, W. D. (1969). Nasal pathway resistance in normal and cleft lip and palate subjects. Cleft Palate Journal, 6, 134–140.

Warren, D. W., & Dubois, A. (1964). A pressure-flow technique for measuring velopharyngeal orifice area during continuous speech. Cleft Palate Journal, 1, 52–71.

Whitehill, T., Cheng, D.-H., & Jones, D. (2007). Rating hypernasality: Direct magnitude estimation (DME) versus visual analogue scaling (VAS). Annual Convention of the American Speech-Language-Hearing Association, Boston, MA.

Whitehill, T., Lee, A., & Chun, J. (2002). Direct magnitude estimation and interval scaling of hypernasality. Journal of Speech, Language, and Hearing Research, 45, 80–88.

Winkworth, A. L., Davis, P. J., Ellis, E., & Adams, R. D. (1994). Variability and consistency in speech breathing during reading: Lung volumes, speech intensity, and linguistic factors. Journal of Speech and Hearing Research, 37, 535–556.

