July 12, 2005 Feature

Speechreading and Aging

How Growing Old Affects Face-to-Face Speech Perception

It is evident that America, as well as the rest of the world, is getting older. Age-related hearing loss is currently the third most prevalent chronic disability faced by older adults and is consistently ranked as one of the top five factors affecting overall quality of life for persons over 65. What is perhaps less well known is that hearing loss in the slightly younger cohort, made up of people between the ages of 46 and 64, is, in absolute numbers, even more prevalent than for older adults. For example, estimates from the National Health Interview Survey suggest more baby boomers are suffering from moderate and severe hearing loss (10 million) than there are Americans with hearing impairments above the age of 65 (9 million).

The bottom line, says Martha Storandt, director of the Aging and Development Program at Washington University in St. Louis, is that "the next 20 years will see an unprecedented growth in the number of people demanding effective treatments for age-related speech perception deficits, and the age of those requiring audiological services will continue to decrease."

Watch, Listen, and Learn

Although sensory aids, especially the most recent generation of digital hearing aids, can provide substantial improvements in speech perception, they are often less effective in noisy or reverberant environments, exactly those situations where older adults have the most difficulty understanding spoken language. Fortunately, many of these difficult listening situations involve face-to-face conversations where listeners can both see and hear the talker and thus have the opportunity to use visual speech information (i.e., lipreading; we use the term speechreading to denote using both the auditory and visual signals to recognize speech, Tye-Murray, 2004) to compensate for reduced audibility.

Moreover, most age-related hearing loss is acquired gradually and therefore provides the opportunity for older adults with hearing impairments to learn to use visual speech information. In fact, the benefits of auditory-visual, as compared with auditory-only speech perception, have been documented at least since 1954, when Sumby and Pollack demonstrated that the addition of visual speech information could significantly improve speech perception performance and that the importance of visual speech information increased as the listening situation became more difficult.

Age and Auditory-Visual Speech Perception

Despite the well-documented importance of visual speech information as a means of compensating for reduced audibility, there is relatively little information about how aging might affect the ability to benefit from both seeing and hearing a talker, compared with listening alone. Over the past five years our laboratories at the Central Institute for the Deaf at Washington University School of Medicine and the Department of Psychology at Washington University in St. Louis have been engaged in a program of research designed to investigate age differences in the ability to benefit from visual speech information and to examine the factors contributing to any such differences.

Our approach is based on a general framework in which the ability to benefit from visual speech information requires at least two independent processes. The first stage in auditory-visual speech perception is the encoding of unimodal (auditory-only and visual-only) speech signals. This initial encoding stage is then followed by the second component in which auditory and visual speech signals are integrated or combined to form a unified perception. The advantage of this framework is that it serves to identify whether any observed age-related changes are a consequence of encoding, integration, or a combination of both.

We measure performance with three levels of stimuli (consonants, words, and sentences) because aging may affect the ability to recognize each stimulus type in different ways. In a typical testing scenario, the research participant sits before a computer monitor in a sound-treated booth. In a visual-only or an auditory-visual condition, the head and shoulders of the test talker appear on the monitor and the talker speaks a stimulus. Depending upon the test, the participant either repeats the test stimulus verbatim (words and sentences) or touches one of several response options on the monitor touchscreen (consonants). The three tests that we typically administer are the Iowa Consonant Confusion Test, the Children's Audiovisual Enhancement Test, and the Iowa Sentence Test. These three tests are described in the sidebar above.

In our initial studies, we measured auditory-only and auditory-visual speech perception in groups of older (over age 65 years) and younger (18-25 years) adults with clinically normal hearing for frequencies below 4 kHz. In these investigations, listeners were asked to identify consonants, words, and sentences presented in six-talker background babble, with the level of babble adjusted individually for each participant and each stimulus type so as to produce approximately 50% identification performance in the auditory-only condition. The rationale for this approach was to equate baseline performance in the auditory-only condition and observe what happens to older and younger adults when a visual signal is added to the partially masked auditory stimulus. We evaluated the benefit of having both the auditory and visual signals by computing a measure known as visual enhancement (Sumby & Pollack, 1954), using the following formula:

VE = (AV - A) / (1 - A)

In this computation, the proportion correct in an auditory-visual condition (AV) minus the proportion correct in an auditory-only condition (A) is divided by the room for improvement over auditory-only performance (1 - A). As a simple example, suppose an individual obtained 50% correct in an A condition and 80% correct in an AV condition. In this case, the visual enhancement measure would be (0.80 - 0.50) / (1 - 0.50), or 60%. Another way of stating this is that the individual obtained 60% of the maximum possible benefit by combining auditory and visual information, compared with the A information alone.
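The computation can be sketched in a few lines of Python. This is only an illustration of the formula, not the software used in our laboratories; the function name `visual_enhancement` and the decision to treat a perfect auditory-only score as undefined are assumptions made for the example.

```python
def visual_enhancement(av: float, a: float) -> float:
    """Return VE = (AV - A) / (1 - A), with scores given as proportions correct."""
    if not (0.0 <= a <= 1.0 and 0.0 <= av <= 1.0):
        raise ValueError("scores must be proportions between 0 and 1")
    if a == 1.0:
        # Assumption for this sketch: a perfect auditory-only score leaves
        # no room for improvement, so VE is treated as undefined.
        raise ValueError("VE is undefined when auditory-only performance is 1.0")
    return (av - a) / (1.0 - a)


# The example from the text: 50% correct auditory-only, 80% auditory-visual.
ve = visual_enhancement(av=0.80, a=0.50)
print(f"VE = {ve:.2f}")
```

Because the denominator is the room for improvement, the measure can be compared across individuals whose auditory-only baselines differ, which is why baseline performance was equated near 50% in the first place.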

The results from our studies examining age differences in visual enhancement were quite clear; for all three types of stimuli (consonants, words, and sentences) older adults did not benefit as much as younger adults from the addition of visual speech information. Figure 1 presents the average performance for the three kinds of stimuli (i.e., consonants, words, and sentences) for a group of 50 young adults and a group of 50 older adults. Averages shown on the top side of the graph are for auditory-visual conditions. Averages on the bottom side of the graph are for visual enhancement. As can be seen, older adults exhibited poorer auditory-visual performance and, consequently, less visual enhancement than younger adults.

Age and Lipreading

In our next set of investigations, we further set out to examine why older adults might exhibit less benefit than younger adults from the addition of visual speech information. Based on our general framework, we thought one reason for the observed age-related deficits in visual enhancement (i.e., the improvement obtained from both seeing and hearing the talker as opposed to only hearing the talker) is that older adults may not read lips as well as younger adults. If aging impairs the ability to perform the initial encoding of visual speech information, it would be unsurprising to find that older adults do not benefit as much as younger adults from the addition of a visual speech signal.

To investigate this possibility, we examined perception of consonants, words, and sentences, but this time using visual-only presentations, where participants could see the head and neck of our talkers without accompanying auditory information. Figure 2 displays visual-only performance for young and older adults as a function of stimulus type. Although both younger and older adults exhibited considerable variability in lipreading performance, older adults were poorer lipreaders than their younger counterparts, and this was true for consonants, words, and sentences. These findings suggest that at least one factor contributing to our earlier finding of age-related deficits in the ability to benefit from auditory-visual, compared with auditory-only, presentations is that older adults are simply less able to extract visual speech information.

Age and Integration

Our next task was to determine whether, in addition to impaired lipreading, older adults also have difficulty integrating information across the auditory and visual modalities. Unlike measures of lipreading, however, it is not possible to measure integration abilities directly.

One indirect measure of integration, available for consonants only and used in a number of other studies (Braida, 1991; Grant, 2002), uses an individual's auditory-only and visual-only scores, along with the pattern of consonant confusions, to predict their optimal performance in an auditory-visual condition. This predicted optimal performance is then compared to the individual's actual auditory-visual score to determine how close the individual is to being an optimal integrator. Someone who is an optimal integrator would have obtained auditory-visual performance equivalent to their predicted optimal performance.

As almost no one ever achieves optimal performance, the extent to which individuals score below their predicted maximum reflects their relative integration ability. A particularly appealing feature of this approach is that it provides a way of comparing integration abilities, independent of encoding, across a wide range of auditory-only and visual-only scores.

Suppose, for example, participant A completes the Iowa Consonant Confusion Test and scores 40% in the auditory-only condition, 20% in the visual-only condition, and 80% in the auditory-visual condition (recall that measures of integration require consonant confusion matrices for both the auditory-only and visual-only conditions to predict optimal auditory-visual performance, and are therefore not applicable to either word or sentence material). Using the auditory-only and visual-only performance measures, we then determine that optimal performance for this individual in the auditory-visual condition would be 95% consonants correct. If we then take the ratio of obtained auditory-visual performance (80%) to optimal auditory-visual performance (95%), we arrive at an integration score of approximately .84.

Now suppose a second individual completes the consonant confusion test with scores of 20%, 5%, and 45% in the auditory-only, visual-only, and auditory-visual conditions, respectively. Furthermore, based on the pattern of confusions in the consonant test, we predict that optimal performance in the auditory-visual condition for this individual is approximately 53%. Note that the integration measure for this individual (45%/53%, or approximately .85) is nearly identical to that of our first participant despite rather dramatic differences in overall performance across the different conditions.

Finally, suppose we conducted our consonant test on a third individual who obtained scores of 60%, 50%, and 70% in the auditory-only, visual-only, and auditory-visual conditions, respectively, with predicted optimal auditory-visual performance of 95%. In this case, the integration ratio (70/95) of approximately .74 is less than either of the two previous individuals, despite considerably better performance in both the auditory-only and visual-only conditions. Based on this pattern of performance, we would conclude that this last individual is not as good an integrator as the other two despite better abilities to encode the auditory-only and visual-only speech information.
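The three hypothetical participants above can be compared with a short Python sketch. It computes only the final ratio; the predicted optimal auditory-visual scores are taken as given, because deriving them from the auditory-only and visual-only confusion matrices requires the modeling approach cited earlier, which is not reproduced here. The function name `integration_ratio` is an assumption made for the example.

```python
def integration_ratio(obtained_av: float, predicted_optimal_av: float) -> float:
    """Return obtained auditory-visual score divided by predicted optimal score."""
    if predicted_optimal_av <= 0.0:
        raise ValueError("predicted optimal performance must be positive")
    return obtained_av / predicted_optimal_av


# (obtained AV, predicted optimal AV) for the three examples in the text.
# The predicted optimal values are inputs here, not computed by this sketch.
participants = {
    "first": (0.80, 0.95),
    "second": (0.45, 0.53),
    "third": (0.70, 0.95),
}

for label, (obtained, optimal) in participants.items():
    print(f"{label}: {integration_ratio(obtained, optimal):.2f}")
```

Dividing by each individual's own predicted optimum is what makes the measure independent of encoding: the third participant has the highest unimodal scores but the lowest ratio.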

In applying this approach to the data that we have collected from normal-hearing older and younger adults, we have observed slight, albeit significant, differences in integration scores for older and younger adults, with integration ratios of 81/92 (or .88) and 85/91 (or .93) respectively. Thus, in addition to impaired lipreading abilities, older adults also have a reduced ability to integrate auditory and visual cues to consonant identification.

Next Steps

Based on the findings reported above, our laboratories are working on several new developments aimed at advancing the understanding of auditory-visual speech perception and at using that understanding to improve speech perception in older adults. In terms of theoretical advances, it is critical to develop measures of integration that can be applied to stimuli other than consonants. The importance of measuring integration, apart from encoding, for words, sentences, and discourse-length material is highlighted by our recent findings that visual enhancement for consonants is not correlated with visual enhancement for words or sentences.

In one sense, this result is not surprising, as words and sentences afford individuals the ability to use lexical, semantic, and syntactic knowledge that is not available in the perception of consonants. What is of particular clinical significance, however, is that the absence of correlations between visual enhancement for consonants and more naturalistic stimuli suggests that the mechanisms underlying enhancement may be very different for words, sentences, and other materials, and may show a pattern of age changes very different from what we have observed with consonants.

Taken together, these considerations have led us to develop a measure of integration (which we term integration enhancement) that is derived from basic probability formulas and that can be applied to more naturalistic stimuli, such as words and sentences. Although we are continuing to conduct studies validating this new measure, what is particularly exciting from a clinical perspective is that we have, for the first time, been able to obtain measures of integration for highly naturalistic stimuli and our preliminary studies suggest that for these types of materials, age differences in integration are either reduced or absent. Thus, from a clinical perspective, the emerging picture is that for more natural stimuli, the main reason older adults may not benefit as much as younger counterparts from the addition of visual speech information is they have difficulty with lipreading, rather than deficits in integration.

Lipreading impairments are certainly more amenable to rehabilitation than are deficits in integration. Consequently, we are encouraged that training, practice, and other forms of aural rehabilitation offer the potential for improving older adults' ability to benefit from the addition of visual speech information when trying to understand more natural speech materials.

A Clearly Visible Future

The current generation of baby boomers has grown up in an era of unprecedented biomedical advances. This group of individuals has become accustomed to successful treatment of both acute and chronic medical difficulties. As they reach the age where presbycusis and other age-related hearing impairments make speech perception difficult, they will expect successful treatment.

Our initial studies of the use of visual speech information by older adults strongly support including tests that assess both lipreading and integration abilities in audiological evaluations. At the very least, such testing can provide more realistic expectations about the likely benefits of sensory aids and a better understanding of listening environments where such aids are likely to be effective. Ultimately, we anticipate the findings from our studies will be used to develop individually based aural rehabilitation strategies that can maximize the speech communication abilities of the ever-increasing numbers of older adults.

This work was supported by NIH NIA RO1 AG018029.

Nancy Tye-Murray is a senior research professor at Washington University School of Medicine and author of the textbook Foundations of Aural Rehabilitation: Children, Adults, and Their Family Members. Contact her at murrayn@ent.wustl.edu.

Mitchell Sommers is an associate professor at Washington University. Contact him at msommers@artsci.wustl.edu.

Brent Spehar is a research audiologist at Washington University School of Medicine. Contact him at speharb@ent.wustl.edu.

cite as: Tye-Murray, N., Sommers, M., & Spehar, B. (2005, July 12). Speechreading and Aging: How Growing Old Affects Face-to-Face Speech Perception. The ASHA Leader.

  
