January 21, 2003 Feature

Basic Research in Speech Science—Speech-Language Pathology

A patient struggles to produce the word "pen" but instead produces something that sounds like "Ben." What are the exact processing deficits that account for this situation? Is there something wrong with the patient's initial sound selection (e.g., mentally swapping "b" for "p")? Or is it the case that this patient can adequately plan speech sounds, but later distorts them during speech output? Are deficits in articulatory strength, timing, or movement involved? Could this talker's speech problems be linked to underlying perceptual errors? What patterns of brain activity might correspond to such deficits? What exactly should treatment focus on, and why?

Such pressing issues are, of course, at the center of the discipline of speech-language pathology. However, key theories of speech-language pathology could not be formulated without a more basic understanding of non-impaired speech processing. Normative speech processes are studied in the field of speech science, a broad discipline that encompasses several related specialties focusing on the way speech is produced and perceived. Although breakthrough discoveries in speech science research are being made every day, the flow of information from laboratory to clinic can sometimes be rather slow. Let me briefly provide some information that will help address this shortcoming.

Speech Production and Speech Perception

The field of speech science is often divided into the specialties of speech production and speech perception. Speech production is concerned with the way in which our thought and language are converted into speech. A number of theories seek to explain exactly how such amazing behavior is accomplished. Most theories share the view that there is a pre-articulatory (or planning) stage and a motor realization (or vocal tract control) stage, and that these stages are accessed in a precisely timed, hierarchical manner during speaking.

Speech production scientists ask the following types of questions: What are the units of speech? How are these various units and levels coordinated during talking? How does speech develop in children? To what extent do speaking processes share commonalities with other types of oral motor behavior? What role does auditory feedback play during our talking? How "flexible" is the speech process, and how do talkers respond if normal articulatory processes are perturbed? How do talking rate and volume affect articulatory processes? Are speaking processes different in relaxed vs. formal speech situations? What are the brain bases for speech motor control?

Speech perception researchers puzzle out how listeners can decode meaningful units of speech (phonemes) from a rapidly changing and highly encoded speech signal—a communication ability that is miles beyond the reach of most computers.

Speech perception researchers ask questions such as: What are the exact acoustic "cues" that listeners attend to in order to identify speech sounds? How do listeners recognize individual talkers or know that very different physical signals coming from different talkers may in fact indicate the same linguistic meaning? How does speech perception develop? How do listeners adjust to volume or rate changes in the speech they are hearing? How does speech perception degrade in noise? To what extent is speech perception a typical auditory process, and to what extent might speech be considered a "special" behavior that involves unique modes of processing? How do acoustic cues for speech interact with visual cues picked up from watching the face? How flexible is the speech perception process, and how is speech hearing reorganized when listeners are provided with "electronic hearing" through cochlear prostheses? How is speech perception represented in the brain?

Current Research

It is an exciting time for clinicians to tune in to speech science, because a number of recent technological breakthroughs are having a great impact on the field. There are important projects taking place all over the world. Let me briefly describe three research projects that scientists, clinicians, and students are working on here at the University of Texas at Dallas Callier Center for Communication Disorders (UTD).

Visual Biofeedback

One project explores whether visual biofeedback of tongue movement can help patients with brain damage recover speech. Following brain injury such as stroke, individuals are frequently left with a debilitating loss of language and speech known as "Broca's aphasia" or "apraxia of speech." Most of these patients also present with buccofacial (or oral) apraxia, a disorder that affects the ability to make nonspeech oral motor gestures on command. Although these associated aphasic and apraxic deficits cause some of the most visible and long-lasting symptoms for adult neurogenic patients, the underlying bases of these deficits and the best means of their treatment have not yet been fully sorted out.

For many years, our knowledge of articulatory movement was either limited to the external articulators (lips and jaw) or indirectly inferred from the acoustic signal. However, new technologies are making it increasingly possible to directly measure the movement of the lips, jaw, tongue, and velum using relatively noninvasive techniques. One such technology we have been working with in our laboratory is electromagnetic midsagittal articulography (EMA). In this technique, the subject wears a lightweight helmet that sets up alternating electromagnetic fields of low strength (about that of a handheld hairdryer) around the head. Tiny sensors are glued to the subject's articulators, and these are connected to a computer by means of fine wires that are led from the corners of the mouth. As the sensors move through the electromagnetic fields, the computer tracks them, yielding a two-dimensional image of articulatory movement measured in the midsagittal plane.

We have used EMA primarily as a research tool, investigating such issues as how gestural overlap (or coarticulation) differs in the speech of aphasic and healthy control talkers, and whether the ability to flexibly reorganize the motor system during perturbed speaking conditions ("bite block speech") differs in children, adults, and brain-damaged individuals.

Most recently, we have designed programs that show the patient images of the tongue during speech in order to provide visual augmented knowledge of performance for difficult-to-produce sounds. In this technique, the subject views a monitor that shows an outline of the palate and the current tongue position. The sounds to be treated are repeated by the subject until acceptable exemplars are produced. Using a mouse-drawing tool, investigators then mark a circle on the video monitor around the region containing the spatial endpoints for each stimulus. The patient next begins a biofeedback program that sets up different speaking tasks and shows the subject exactly where the tongue is and whether the tongue reaches the right spatial location. Targets to be hit light up green, and targets that were successfully hit change to red. When correct hits are made, a pleasing tone sounds and a small balloon image rises on the computer screen.
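The hit test at the heart of such a biofeedback display can be illustrated with a minimal sketch. It assumes a circular target region marked in the midsagittal plane and a stream of two-dimensional sensor positions; the names `Target`, `is_hit`, and `first_hit_index`, the millimeter units, and all numeric values are hypothetical and are not taken from the actual laboratory software.

```python
import math
from dataclasses import dataclass


@dataclass
class Target:
    """Circular target region in the midsagittal plane (units assumed to be mm)."""
    x: float
    y: float
    radius: float


def is_hit(target: Target, sensor_x: float, sensor_y: float) -> bool:
    """Return True if the sensor position lies inside or on the target circle."""
    distance = math.hypot(sensor_x - target.x, sensor_y - target.y)
    return distance <= target.radius


def first_hit_index(target: Target, trajectory):
    """Return the index of the first sample inside the target, or None if it is never reached."""
    for i, (x, y) in enumerate(trajectory):
        if is_hit(target, x, y):
            return i
    return None


# Hypothetical target circle and tongue-sensor trajectory (illustrative values only).
target = Target(x=10.0, y=5.0, radius=2.0)
trajectory = [(0.0, 0.0), (5.0, 2.0), (9.0, 4.0), (10.5, 5.5)]

hit_at = first_hit_index(target, trajectory)
if hit_at is not None:
    print(f"Target reached at sample {hit_at}")  # a real display would change the target color here
else:
    print("Target not reached")
```

In a real system the trajectory would arrive sample by sample from the tracking hardware, and a successful hit would trigger the color change, tone, and balloon animation described above.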

Although this work is still in a very early stage, the preliminary findings have been quite promising. Using this biofeedback method, we have treated a small number of individuals with aphasia and apraxia of speech, and we have observed lasting results for some sounds that were otherwise poorly treated by traditional methods. The data also have allowed us to gain some fresh perspectives into some time-honored puzzles about aphasia and apraxia (Is verbal apraxia best considered a special case of a broader class of disorders called apraxia, or do these two disorders dissociate? How does the lesion site relate to aphasic and apraxic symptoms?).

We are currently testing whether this biofeedback technique also can be used with buccofacial apraxia (in which talkers cannot complete facial motions such as "blow out a match" on command). Also, to bring these biofeedback techniques closer to the clinical setting, we are exploring other new tongue-tracking devices that may be cheaper and easier to use.

As research in biofeedback continues, scientists will learn more about the functional and neurological bases of recovery from brain damage. For example, does biofeedback operate because of general relaxation principles (allowing existing faculties to operate more smoothly) or because alternative neural mechanisms are recruited? Or both? Functional neural imaging studies may help elucidate these issues.

Electronic Hearing

A second research project conducted at the Callier Center examines how "electronic hearing" in the form of cochlear implant devices affects speech production. Cochlear implant technologies have had a profound effect on our field and are offering new means of communication to thousands of individuals throughout the world. However, the exact effects that these various devices have on speech need to be further elucidated.

Cochlear implants not only afford the user the opportunity to hear others speak, but they also permit monitoring of one's own speech. A UTD research scientist, Sneha Bharadwaj, recently completed a study that examines the role of self-hearing during speech. Adults and children with cochlear implants produced speech samples under two conditions: with the implant device turned on, and with it switched off immediately before the repetition of each word. Subjects' productions were analyzed acoustically and also were presented to normal-hearing listeners to determine speech quality.

The results confirmed previous findings that auditory feedback is used to control suprasegmental information over relatively large speech units, such as syllables and words. However, rather surprisingly, auditory feedback also was found to affect segmental-level productions of consonants and vowels in the short term. These data suggest that talkers closely monitor their own speech over very brief time spans, matching their output against an "internal model" of what they intended to say. The results also suggest auditory feedback affects various speech sounds differently, perhaps relating to the extent to which these sounds involve tactile or proprioceptive information. Although this work is still in an early stage, the clinical implications are that certain sounds produced by talkers with cochlear implants will be more amenable than others to remediation based on standard, auditory-based techniques.

Altered Brain Responses

In another series of experiments conducted at UTD, Emily Tobey and colleagues have begun identifying the brain structures altered in hearing impairment and deafness and potentially restored by cochlear implants. In conjunction with the University of Texas Southwestern Nuclear Medical Center (directed by Michael Devous), this team has been tracking regional cerebral blood flow (rCBF) in the auditory region of the brain using Single Photon Emission Computed Tomography. rCBF is directly related to the activity of the neurons as they process information. A first series of studies found that rCBF responses to speech signals in the auditory cortices (and in the auditory association areas where hearing and speech get related) are blunted in cochlear implant users relative to normal-hearing control subjects, and the severity of blunting is greatest in cochlear implant patients with poor speech perception.

With this information as background, the researchers are now testing whether pharmacological stimulation paired with auditory rehabilitation will increase neuronal response to electrical stimulation from a cochlear implant and further enhance the speech perception performance of individuals with cochlear implantation, particularly those individuals who receive minimal benefit from the devices. The idea is that unsuccessful cochlear implant users fail to perceive properly arriving signals due to metabolic limitations of the cortex, and that under pharmacological stimulation this situation might be reversed.

Such an intervention has proven successful in restoring speech in post-stroke individuals with aphasia and in restoring motor function in those who have experienced stroke. The researchers also learned that subjects with hearing loss who might be eligible for cochlear implants often show greatly different rCBF responses to stimuli delivered in the left ear relative to the right ear. These "pre-implant" studies are helping the team determine which ear is the best to implant based on differing responsiveness of the brain to signals from one ear or the other.

Decoding Speech

Let us next examine some exciting research in the area of speech perception. Peter Assmann and colleagues are investigating the exact role that different types of acoustic information play when listeners decode speech. During vowel perception, two distinct types of information are known to play a role: formant frequencies (properties of the vocal tract filter that are heard as changes in vowel quality) and fundamental frequency (F0; a property of the speech source that is heard as pitch). However, the exact way in which these types of information interact during the perception process is not completely clear.

In a series of experiments, Assmann and Jack Scott (a UTD graduate student in audiology) studied the relationship between F0 and formant frequency shifts in vowel perception. They used a high-quality speech synthesizer to process a set of vowels spoken by three adult male speakers of American English. Identification accuracy dropped by about 30% when the formant frequencies were scaled upwards by a factor of 2.0 and, in a separate condition, by about 50% when F0 was raised by two octaves.
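The arithmetic behind these two manipulations can be sketched briefly. Scaling formants by a factor of 2.0 doubles each formant frequency, while raising F0 by two octaves multiplies it by 2² = 4. The function names and the starting values below (an F0 of 120 Hz and three formant frequencies) are illustrative assumptions, not figures from the study, and the sketch does not reproduce the synthesizer itself.

```python
def scale_formants(formants_hz, factor):
    """Multiply every formant frequency by the same scale factor."""
    return [f * factor for f in formants_hz]


def shift_f0_octaves(f0_hz, octaves):
    """Raise (or lower) F0 by a number of octaves; each octave doubles the frequency."""
    return f0_hz * 2 ** octaves


# Illustrative values for an adult male vowel (assumed, not taken from the study).
f0 = 120.0
formants = [700.0, 1200.0, 2600.0]

scaled_formants = scale_formants(formants, 2.0)  # formants scaled upward by a factor of 2.0
shifted_f0 = shift_f0_octaves(f0, 2)             # F0 raised by two octaves (a factor of 4)

print(scaled_formants)  # [1400.0, 2400.0, 5200.0]
print(shifted_f0)       # 480.0
```

With these assumed starting values, both results land well above the typical adult male range, which gives a sense of how large the manipulations were.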

However, when formant frequencies and F0 were both increased at the same time, identification accuracy showed a marked improvement, compared to conditions where each cue was manipulated separately. The data suggest that listeners have internalized knowledge of the relationship between F0 and formant frequencies in natural speech and that this plays a key role in vowel perception. These findings have important implications for our understanding of how listeners with cochlear implants perceive speech, because these people face a similar challenge of piecing together F0 and formant frequency information that may be "mismatched" due to the manner in which their processors extract speech cues for perception.

In summary, many discoveries in speech science are shaping the field of speech-language pathology. I have listed a few projects taking place in our university, and there are, of course, many other exciting areas of research and development around the world. Here are a few breaking examples:

• New signal processing breakthroughs are allowing rapid, online estimates of vowel formant frequencies. These developments will facilitate computer-based tools that may be useful for foreign accent reduction and the treatment of disordered speech.

• Speech scientists are learning more about the biological bases of more "masculine" and "feminine" sounding speech and the possible relation between talkers' speech characteristics and their sexual orientation. These data will help us understand the age-old question of how much of our behavior is "nature" and how much is "nurture" and may also be useful in clinical work with transsexual individuals.

• In studies of stuttering, speech scientists have reported that frequency-shifted delayed auditory feedback can produce rapid improvement in the speech of certain individuals with severe fluency problems.

Clinicians are urged to stay tuned so that state-of-the-art knowledge can be incorporated into effective clinical practice.

William F. Katz is an associate professor at the University of Texas at Dallas, Callier Center for Communication Disorders. He teaches and conducts research in the areas of speech science, phonetics, and adult neurogenic disorders. His work is described on his home page (www.utdallas.edu/~wkatz), and he may be contacted by e-mail at wkatz@utdallas.edu.

Cite as: Katz, W. F. (2003, January 21). Basic Research in Speech Science—Speech-Language Pathology. The ASHA Leader.

Web Sites

http://asa.aip.org/  (The Acoustical Society of America)

http://www.articulograph.de/  (Carstens Medizinelektronik EMA system)

www.indiana.edu/~acoustic/spsites.html  ("Speech Web Sites" from Indiana University)

www.haskins.yale.edu/research.html (Research projects at Haskins Laboratories, Connecticut)

("Phonetics Resources" from the University of Washington)


References

Assmann, P. F., Nearey, T. M., & Scott, J. M. (2002, in press). Modeling the perception of frequency-shifted vowels. Proceedings of the 7th International Conference on Spoken Language Processing.

Bharadwaj, S. (2002). Role of auditory feedback in speech production by cochlear implant users: Acoustic and perceptual analyses. (Doctoral dissertation). The University of Texas at Dallas.

Katz, W., Bharadwaj, S., & Carstens, B. (1999). Electromagnetic articulography treatment for an adult with Broca’s aphasia and apraxia of speech. Journal of Speech, Language, and Hearing Research, 42, 1355–1366.

Roland, P. S., Tobey, E. A., & Devous, M. D. (2001). Preoperative functional assessment of auditory cortex in adult cochlear implant users. Laryngoscope, 111(1), 77–83.

