November 16, 2004 Feature

Tracking Eye Movements to Study Cognition and Communication

Eye-movement measures represent a unique way to study how people process visual information in real time as they are engaged in a specific task. Eye tracking has been used successfully in a rich collection of studies across numerous disciplines to address visual behaviors in a vast array of cognitive tasks. Examples include reading, piloting of airplanes, chess playing, viewing of art, car driving, newspaper layout, Web page design, several aspects of problem-solving, and use of devices such as copy and fax machines. Patterns of eye movements may help us answer questions about visual processes involved in attention, the time course of processing visual information, and how the brain controls the eyes to select, extract, and use visual information in communication (for a review, see Rayner, 1998).

When applied to research with individuals with disabilities, eye-movement indices have particular appeal in that they offer an alternative response mode and may reduce reliance on memory and comprehension abilities. Additionally, individuals with neurological disorders such as stroke and brain injury, including those with motor problems affecting the head, neck, trunk, and limbs, often retain eye-movement control.

Various Eye Movements

There are several types of eye movement. The eye movements to which we refer here are saccadic eye movements (or saccades). These are quick rotations of the eye that allow us to fixate images with the fovea, which is the area of the eye with the greatest visual acuity. Given that our interest in most eye-movement research related to cognition and language actually involves study of where the eyes fixate between movements, the term "eye fixation" is often used interchangeably with "eye movement," although the two terms are not synonymous.

Here we discuss the use of eye movements to study information processing, not their use for augmentative communication and computer and environmental control, although the latter has tremendous benefits for many persons with speech motor control difficulties.

The Eye-Mind Assumption

Much of the research using eye tracking to study cognition and language is based on the important assumption that there is a relationship between where we fix our gaze and what we are thinking about. Still, we know that it is possible to look at one thing while thinking about something else, and that we are able to process information about things we see through peripheral vision. For this reason, it is important that researchers take great care in the design of their studies to ensure the validity of the eye-mind assumption. The means of doing this depend on the specific tasks to be administered.

Applications of Eye Movements

Although numerous studies have addressed basic aspects of language processing using eye tracking (e.g., psycholinguistic studies of lexical ambiguities and varying levels of syntactic complexity), few authors to date (excluding those addressing reading and reading disorders) have used eye tracking to study clinical issues in the field of communication sciences and disorders. This is likely to change in the near future as the cost of eye-tracking instrumentation continues to decrease, the quality and precision of instrumentation continue to improve, and research entailing eye-tracking measures is increasingly visible in our research journals and professional conferences. The present authors have been involved in eye-movement research for many years.

In the Neurolinguistics Laboratory at Ohio University, directed by Brooke Hallowell, ongoing work is dedicated to developing test materials and protocols geared toward assessing language comprehension in individuals with aphasia and related neurogenic language disorders. In one series of studies, individuals view controlled image displays as they read text or listen to auditory verbal messages. Eye-movement indices are used to capture comprehension responses, which are validated through clinical assessments in individuals with known comprehension difficulties. The potential for direct clinical applications for individuals who are difficult to assess is promising. New approaches that use eye movements to supplement or replace linguistic or gestural response modes that may interrupt comprehension or tax memory are being evaluated to assess reading and auditory comprehension.

In another series of studies, specific aspects of visual and verbal stimulus design for assessing reading and auditory comprehension are studied. Additional projects involve eye movements to study auditory processing and psycholinguistic priming applications.

In the Visual Processes in Speech Perception laboratory at the University of Illinois, directed by Charissa Lansing, ongoing work is dedicated to the study of visual cues in spoken language understanding. In one series of studies, individuals with and without hearing loss view full-motion video sequences of talkers producing sentences with natural facial expressions and intonations. The video sequences are controlled to limit regions of observable motion on the face, complexity of the sentence stimuli, or the intensity level of the talker's voice and background noise. Eye monitoring is used to study the location, duration, and sequence of eye movements in speech understanding. Eye-movement patterns are studied in relation to performance accuracy, hearing status, and speechreading proficiency.

Increased scientific knowledge about how people extract useful visual information for speech understanding tasks contributes to new models of speech perception that incorporate the role of vision, as well as to the enhancement of speech recognition algorithms that incorporate visual cues produced by a talker. Information about the selection and use of visual cues in speech understanding may also help us better understand the role of visual attention in speech understanding, which will help guide future approaches to aural rehabilitation.

In another series of studies, eye monitoring is used to investigate visual attention as adults perceive speech in combination with facial expressions, finger spelling, or sign language. Performance accuracy and eye-fixation patterns are studied to identify the source or location of critical information in a variety of language-understanding tasks.

Defining Dependent Measures

Those observing eye tracking for the first time are often amazed by the advanced technology that so keenly monitors how people look at displays, images, and real-world scenes. Watching a cursor that moves in correspondence to a person's eye position, superimposed on a computerized display identical to the one the participant is viewing, in real time, is a remarkable experience. Although we share the excitement about the apparent magic of eye tracking, we caution that it is essential to be well grounded in how it is that one can actually analyze and interpret eye-tracking data.

There are three essential steps in analyzing eye-movement data: defining what constitutes a "fixation," determining the locations of fixations and how they correspond to specific regions of interest in a visual display, and deriving dependent measures from fixation data.

Defining fixations

Raw eye-tracking data are generally collected at sampling rates of 30 to 1,000 samples per second, depending on the type of technology used. Each sample is assigned an x, y coordinate corresponding to the horizontal and vertical dimensions of the display being viewed. Raw eye-position data corresponding to particular x, y coordinates do not necessarily tell us anything about what a person is actually seeing while looking at the display. This is because we can only "see" images when our eyes remain stably fixated on a specific point long enough to process the corresponding visual information.

Eye-tracking samples are collected at rates that are quicker than the duration required for a single fixation. Thus, it is essential to first analyze raw eye-position data by defining how stable the eye must be and for how long it must be stable to qualify as a fixation. Published eye-tracking studies addressing information processing employ highly variable definitions of fixations, with a minimal duration of a stable eye position ranging from 40 to 250 msec (with some slight tolerance in the vertical and horizontal dimensions).
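To make the idea of a fixation definition concrete, the following is a minimal sketch of one common family of approaches, a dispersion-threshold procedure. It is not the method used in either of our laboratories. The function name `detect_fixations`, the 30-pixel spatial tolerance, and the 100 msec minimum duration are assumptions chosen only for illustration; as noted above, appropriate values should be derived empirically for the task at hand.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]            # (time in msec, x, y)
Fixation = Tuple[float, float, float, float]   # (start msec, duration msec, mean x, mean y)


def _dispersion(window: List[Sample]) -> float:
    """Horizontal extent plus vertical extent of the samples in the window."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: List[Sample],
    max_dispersion: float = 30.0,    # assumed tolerance, in display units
    min_duration_ms: float = 100.0,  # assumed; within the 40-250 msec range cited
) -> List[Fixation]:
    """Group consecutive stable samples into fixations.

    Duration is measured from the first to the last sample of a window.
    """
    fixations: List[Fixation] = []
    start, n = 0, len(samples)
    while start < n:
        end = start
        # Extend the window for as long as the eye stays within the tolerance.
        while end + 1 < n and _dispersion(samples[start:end + 2]) <= max_dispersion:
            end += 1
        duration = samples[end][0] - samples[start][0]
        if duration >= min_duration_ms:
            window = samples[start:end + 1]
            mean_x = sum(s[1] for s in window) / len(window)
            mean_y = sum(s[2] for s in window) / len(window)
            fixations.append((samples[start][0], duration, mean_x, mean_y))
            start = end + 1          # continue after the fixation
        else:
            start += 1               # too short to qualify; slide forward one sample
    return fixations
```

Changing either threshold changes which samples count as fixations, which is one reason results from studies that use different definitions are difficult to compare.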

It is important to keep in mind that not all published eye-tracking studies represent appropriate means of defining fixations. In fact, this has been noted as a particular weakness in several recent publications in the psycholinguistics literature. A researcher's definition of what it takes to constitute a "fixation" should be derived empirically through carefully controlled studies in which similar tasks have been used.

Correspondence between fixations and regions of interest

Before we can analyze how fixation data pertain to most of our research questions, we must first determine a correspondence between each fixation and an area of interest within a display. We do this by assigning x, y coordinates to each specific region of interest in the display. What we consider to be a region of interest depends upon the nature of our research question. For example, we may be interested to know whether (or how often or for how long) a viewer looks at the head, trunk, arms, hands, legs, or feet of a particular person in a photograph. Or we may want to know whether or how often the viewer looks at the person as compared to a tree or a car in the same photograph.

In the first case we would define separate regions of interest for each of the person's body sections, and examine fixation data according to those. In the second case we would define one more general region of interest for the person, the entire area including the image of the person. Any fixations having coordinates within a region of interest are assigned to that predefined item within the display for subsequent analysis.
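The assignment of fixations to regions of interest can be sketched as follows, assuming for simplicity that each region is a rectangle in display coordinates. The region names and coordinates are hypothetical values for the photograph example, and `assign_region` is an illustrative helper rather than part of any particular eye-tracking package.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)

# Hypothetical regions of interest for the photograph example.
REGIONS: Dict[str, Rect] = {
    "person": (100, 50, 250, 400),
    "tree": (300, 20, 450, 400),
    "car": (500, 250, 760, 400),
}


def assign_region(x: float, y: float, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the first region containing the point, or None."""
    for name, (left, top, right, bottom) in regions.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None  # the fixation fell outside every region of interest
```

For the finer-grained question about body sections, the same function could be used with a different dictionary containing one rectangle per section. Regions with irregular outlines would require a different containment test.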

Deriving dependent measures

There is an unlimited number of potential dependent measures that can be used to characterize eye-movement data. Researchers must be especially careful in selecting the most appropriate dependent measures in light of the experimental design and the specific hypotheses to be tested. Many research studies incorporate dependent measures involving numbers of fixations and/or totals of fixation durations corresponding to viewing of specific regions of interest. These may be derived from data collected for a specified period of viewing (for example, 10 seconds of picture viewing) or for the time it takes to accomplish a certain task (for example, how long it takes to identify a specific item in the display).

Other studies involve similar frequency or duration measures that are standardized across trials or that involve corrections for the variability in size of areas of interest within a display. Still others involve probabilistic measures, taking into account sequences of fixations. The selection of appropriate statistical analyses to be administered once the dependent measures are derived is another important consideration.
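As an illustration of the simplest of these measures, the sketch below computes the number of fixations, the total fixation duration, and the proportion of total fixation time for each region of interest. The input format (a list of region name and duration pairs) and the function name `summarize_fixations` are assumptions made for this example; the proportion is only one of many possible ways to standardize across trials.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_fixations(
    fixations: Iterable[Tuple[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Summarize (region name, duration in msec) pairs by region."""
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for region, duration_ms in fixations:
        counts[region] += 1
        totals[region] += duration_ms

    grand_total = sum(totals.values())
    summary: Dict[str, Dict[str, float]] = {}
    for region in counts:
        summary[region] = {
            "fixation_count": counts[region],
            "total_duration_ms": totals[region],
            # Share of all fixation time in the trial spent on this region.
            "proportion_of_time": (
                totals[region] / grand_total if grand_total else 0.0
            ),
        }
    return summary
```

A correction for region size could be added by dividing each proportion by the share of the display area that the region occupies, although the appropriate correction depends on the research question.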

Choosing an Eye-Tracking System

There are many different eye-monitoring technologies. Choosing the best system for a research application requires investigation. Before considering the purchase, we recommend a visit to research labs that use eye-monitoring technology to see different systems in action and get answers to your questions from individuals who have had experience with various systems and can evaluate manufacturer support. If possible, participate in experiments that monitor eye behaviors to get some real firsthand experience. Direct experience with the technologies should provide a framework to help you identify the strengths and limitations across different systems. It is important to keep in mind tradeoffs among the various considerations. For example, gaining greater freedom of movement for the participant may be at the expense of a loss in measurement accuracy.

In our laboratories we use video-based eye-tracking systems that bathe the eye in infrared light and track reflections off landmarks of the eye, recorded by video camera(s) fitted with filters. Image-processing software is used to identify and map eye position to the display. The lens, cornea, and other parts of the eye absorb a small amount of energy from the infrared light, but it is less than 1% of the Maximum Permissible Exposure Level as certified by the American National Standards Institute (ANSI Z 136.1-1973). This is about as much energy as you get on a bright sunny day. More invasive methods, such as scleral coil systems and electrooculography (EOG), are used in some laboratories.

There are numerous types of video-based systems. Here we review considerations for three basic types of system configurations: fixed-head systems that restrain the perceiver using a head or chin rest and a bite bar; head-mounted systems that correct for head movement and may allow for general movement; and remote eye trackers for which hardware does not come in contact with the eye or head.


A basic consideration is that the system be capable of providing the resolution and precision of information necessary to answer the research questions posed. For example, information about which picture was selected among a set of widely spaced pictures would require less spatial accuracy in horizontal and vertical directions than information about the eye behavior directed toward lines and angles of objects within a picture or letters within a printed word. Spatial accuracy is limited in systems that are currently available and users must be cautioned that, although eye-position data may be expressed in x, y pixel coordinates relative to the display or visual scene space, systems are not capable of producing this pixel level of accuracy and humans cannot voluntarily direct their eyes to such precise locations.
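One way to see why pixel-level accuracy is unrealistic is to convert a system's angular accuracy into pixels for a given viewing setup. The sketch below uses the standard geometric relation, size = 2 x distance x tan(angle / 2). The function name and the example values (0.5 degrees of visual angle, a 600 mm viewing distance, and a display 340 mm and 1,024 pixels wide) are assumptions for illustration only and do not describe any particular system.

```python
import math


def visual_angle_to_pixels(
    angle_deg: float,
    viewing_distance_mm: float,
    screen_width_mm: float,
    screen_width_px: int,
) -> float:
    """Convert a visual angle to an on-screen extent in pixels.

    Assumes the angle is centered on the line of sight and that the
    display is flat and perpendicular to it.
    """
    size_mm = 2.0 * viewing_distance_mm * math.tan(math.radians(angle_deg) / 2.0)
    pixels_per_mm = screen_width_px / screen_width_mm
    return size_mm * pixels_per_mm


# Assumed example setup: 0.5 degrees at 600 mm on a 340 mm, 1,024-pixel-wide display.
uncertainty_px = visual_angle_to_pixels(0.5, 600.0, 340.0, 1024)
```

Under these assumed values, half a degree of visual angle spans well over a dozen pixels, so regions of interest separated by only a few pixels could not be distinguished reliably.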

Temporal accuracy should also be considered when selecting an eye-tracking system. For example, if questions must be answered about very fast processes, such as eye-motor planning or the time course of visual processes, then a system with high temporal accuracy would be required to permit rapid sampling of eye behaviors and more sophisticated experimental techniques that enable stimulus display changes contingent on eye-gaze behaviors. Additionally, the temporal and spatial characteristics of the display monitor, specifically the refresh rate, flatness, and dimensions should be appropriate for the eye-monitoring system and experimental questions.

Participant characteristics

It is important to consider the requirements of the system in relation to the needs of participants. For example, some systems require a fixed head position to separate eye movements from head movements for high spatial accuracy. Such systems would be appropriate for young, healthy adults who are highly cooperative and would tolerate restraints to restrict head and chin movement and use a bite bar to help fix the head. These systems, however, may not be tolerated by adults with physical or cognitive impairments, some older adults, or very active young children, for whom remote eye-tracking systems may be more appropriate.

Good head control is another consideration, however, and if participants are unable to tolerate a fixed-head system, then a head-mounted system (or a remote system that corrects for head movement) may be required. If participants must wear helmets or other headgear, this may limit the use of head-worn hardware. The use of eyeglasses also must be considered; for some systems, reflections from eyeglasses interfere with performance accuracy. In fact, data collection with some individuals may be difficult on any system if individuals have problems coordinating the movements of their eyes, blink excessively, or produce irregular eye movements that interfere with data collection.

Response mode

It is also important to consider the type of response that will be required of participants for data collection. If participants are required to speak or sign then the system must allow for some limited movement but be capable of tracking head position. If a keyboarded response or other activity, such as writing, for which the participant must look away from the display, is required, then the camera set-up must be considered to ensure that it does not obstruct the visual field or interfere with eye monitoring. Responses that require limited head and body movement, such as button presses or eye movements toward a display region to select an object, letter, or icon, could be used with any system.

For research that requires simultaneous collection of other physiologic data, the requirements of external hardware must be considered. For example, head-worn systems that minimize the amount of metal are recommended for use in functional magnetic resonance imaging (fMRI) protocols, and remote systems are recommended for use in event-related potential (ERP) protocols in which surface electrodes are attached to head-worn helmets.

Freedom of movement

It is important to consider the degree of movement that is required and typical for the experimental task. If operations involve manipulation of objects, walking, talking, or signing in virtual-reality environments, or daily movements in real-world interactions, then systems that allow for full movement are necessary; still, some tracking of head movement must be considered to retain accuracy. Systems that are unobtrusive, such as remote systems, may be preferable in some natural settings, but with less physical control, the experimenter sacrifices spatial measurement accuracy. If the experimental task can be executed with little or no movement and participants are alert and cooperative, then it may be preferable to explore systems used with chin rests or other restraints to limit head movement.

Comfort/set-up time/portability

Systems differ in the amount of time required to position and adjust the hardware. If the system requires the use of a bite bar this will add time to the set-up. Some researchers select chairs that can be adjusted to change the height of the display relative to the participant or that use head rests to achieve comfortable positions. If portability is required it is a good idea to consider a system that could be installed on a cart that may be moved to different lab areas. Some systems operate best under special lighting conditions and the luminance levels must be considered. Typically, incandescent light (generated by standard light bulbs) contains some infrared components and may degrade performance accuracy.


The cost of the system is not a good indicator of whether it will be appropriate to a researcher's needs. For example, the most expensive systems may offer very high temporal and spatial accuracy, but require a fixed head and use of a bite bar, which may not be desirable for a specific application. Systems range in cost from approximately $10,000 to $100,000. Customized systems for special environments or dedicated software will add to the cost.


It is highly unlikely that a system purchased from an eye-tracking manufacturer will include software needed to conduct the type of study the buyer may have in mind. Be skeptical of advertisements for "turn-key" eye-tracking systems. Consider the level of support from the manufacturer, as well as the resources available to you for programming. There are many different algorithms for determining eye position and movement, and for calculating dependent measures. Some systems supply these and allow modifications to a standard software package. Some manufacturers or independent support groups provide extensive software libraries specific to a system. Some systems are compatible with other data acquisition and analysis programs. Others are specific to particular platforms. If other data streams, such as speech acoustics, button presses, or electrophysiological measures, are to be analyzed in relation to the eye-monitoring data then they must be synchronized and require customized programming and hardware, adding to the complexity of the system.
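As a small illustration of the kind of custom programming that synchronization can involve, the sketch below matches an event from another data stream, such as a button press, to the nearest eye-position sample by timestamp. It assumes that both streams have already been time-stamped against a common clock, which in practice is often the difficult part and may require dedicated hardware. The function name `nearest_sample_index` is hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_sample_index(sample_times_ms: Sequence[float], event_time_ms: float) -> int:
    """Return the index of the eye sample closest in time to the event.

    sample_times_ms must be sorted in ascending order. When the event
    falls exactly midway between two samples, the earlier one is chosen.
    """
    if not sample_times_ms:
        raise ValueError("no eye samples to match against")
    i = bisect_left(sample_times_ms, event_time_ms)
    if i == 0:
        return 0                              # event precedes the first sample
    if i == len(sample_times_ms):
        return len(sample_times_ms) - 1       # event follows the last sample
    before = sample_times_ms[i - 1]
    after = sample_times_ms[i]
    return i if (after - event_time_ms) < (event_time_ms - before) else i - 1
```

With samples every 20 msec, for example, an event can be placed no more precisely than within 10 msec of an eye-position sample, which is another reason to weigh sampling rate against the timing demands of the research question.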

Monitoring of eye movements provides an online record of visual behaviors in cognitive tasks, such as listening to and understanding speech, writing, sign, or nonverbal communication. If exploited through careful experimental design, eye-movement patterns may provide valuable information about visual attention and the time course of visual processes as they correspond to language and cognition. For individuals with motor and cognitive impairments, eye movements provide a non-linguistic response mode that can be used to assess cognition and comprehension.

Cite as: Hallowell, B., & Lansing, C. R. (2004, November 16). Tracking Eye Movements to Study Cognition and Communication. The ASHA Leader.

Useful links for Information on Eye-Monitoring Technology, Research, Manufacturers, and Applications - a comprehensive collection of Internet resources to link to people and labs, manufacturers, archives of discussion issues, papers, conferences, and events, designed and maintained by Dr. Michael Liu

