Vocal Tract Visualization and Imaging

Vocal tract visualization and imaging is the collection of procedures for performing a detailed visual examination of the vocal tract and laryngeal and velopharyngeal structures and gross function, including vocal fold vibration. These procedures enable a speech-language pathologist (SLP) to further assess and plan treatment strategies for

voice,
deglutition, and
resonance disorders.

These procedures use either a constant or a stroboscopic light source for indirect laryngoscopy, rigid fiberoptic oral endoscopy (RFOE), or flexible fiberoptic nasendoscopy (FFN). Images and/or videos can be captured with any of these techniques and stored on digital media. Physicians are the only professionals qualified and licensed to render medical diagnoses related to the identification of laryngeal pathology as it affects voice. When imaging is used for medical diagnostic purposes, it should be viewed and interpreted by an otolaryngologist with training in the procedure. SLPs trained in stroboscopy view and interpret imaging for SLP diagnosis (e.g., dysphagia) and to establish or modify treatment plans. Videofluoroscopy, ultrasound, and video images can also be used to view all or part of the vocal tract and oral structures; however, those modalities are not the focus of this page.

Please see ASHA’s resource on Flexible Endoscopic Evaluation of Swallowing (FEES) for further information on imaging for deglutition.

Instrumentation

Although there is typically some variation between procedures, an effort has been made to standardize protocols for instrumental assessment of voice, including recommendations for laryngeal endoscopic imaging (Patel et al., 2018).

Flexible Fiberoptic Nasendoscopy (FFN)

FFN is performed with a flexible nasendoscope inserted through the nasal passage. A fiberoptic bundle transmits high-intensity light to illuminate structures, which are then viewed and/or recorded. Distal-chip flexible endoscopes allow for assessment of vibratory motion similar to that of a rigid endoscope with stroboscopy (Patel, 2012). A nasendoscope with a smaller diameter may be used for pediatric populations.

Advantages

excellent image of the vocal folds and velopharyngeal structures during
- voicing,
- conversation,
- singing,
- eating/swallowing, and
- rest breathing
potential for image recording and instant replay

Disadvantages

equipment expense
possible patient discomfort
possible stimulation of gag reflex

Please see ASHA’s resource on Flexible Endoscopic Evaluation of Swallowing (FEES) for related information.

Rigid Fiberoptic Oral Endoscopy (RFOE)

RFOE is performed with a rigid tube inserted into the oral or pharyngeal cavity. A prism optic system projects high-intensity light at a predetermined angle to illuminate the structures to be observed and recorded.

Advantages

high illumination
wide field of view
excellent image reproduction
smaller diameter rigid endoscopes are available for pediatric populations or those with a smaller oral cavity

Disadvantages

interference with normal speech production
minor patient discomfort
equipment expense
possible difficulties with gag reflex

Videolaryngoendoscopy (either RFOE or FFN)

Videolaryngoendoscopy is used to assess the following (Patel et al., 2018):

vocal fold mobility
vocal fold maximum range
vibratory characteristics of the vocal folds
vocal fold appearance
- malposition
- excrescence (abnormal projection/outgrowth)
- edema
- erythema
vocal fold edge appearance
- smooth
- straight
- bowed
- convex
- concave
- irregular
- rough
subglottal appearance
- erythema
- edema
supraglottal behavior
- medial compression
- anterior–posterior compression
- mild/moderate/severe
arytenoid movement
- normal or impaired mobility
  - bilateral
  - unilateral
velopharynx
- contact between the soft palate and the posterior pharyngeal wall as well as lateral pharyngeal wall movement with
  - sustained fricatives such as /s/,
  - syllable repetition,
  - multisyllabic words,
  - phrases with pressure-loaded consonants, and
  - sentence or spontaneous speech
secretions
- amount
- consistency

Videostroboscopy

Videostroboscopy is performed with either a flexible or a rigid endoscope combined with a strobe light correlated to vocal fold vibration via a laryngeal microphone. This combination permits vocal tract structures to be seen in an apparent “slow motion” format.
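The "slow motion" effect can be illustrated numerically. The sketch below assumes the general stroboscopic principle: the flash rate is set slightly off the fundamental frequency picked up by the microphone, so each flash lights a slightly later point in the vibratory cycle, and the image appears to cycle at the difference between the two rates. The function names and the example frequencies are hypothetical and chosen only for illustration; they do not describe the timing of any particular clinical system.

```python
def apparent_frequency(f0_hz: float, flash_hz: float) -> float:
    """Rate (Hz) at which vibration appears to cycle under a strobe.

    Assumption: flashes are evenly spaced, so the observed motion is the
    true frequency aliased against the nearest multiple of the flash rate.
    """
    if f0_hz <= 0 or flash_hz <= 0:
        raise ValueError("frequencies must be positive")
    nearest_multiple = round(f0_hz / flash_hz) * flash_hz
    return abs(f0_hz - nearest_multiple)


def sampled_phases(f0_hz: float, flash_hz: float, n_flashes: int) -> list:
    """Fraction of the vibratory cycle (0 to 1) lit by each successive flash."""
    return [(k * f0_hz / flash_hz) % 1.0 for k in range(n_flashes)]


if __name__ == "__main__":
    f0 = 200.0     # illustrative fundamental frequency in Hz
    flash = 198.5  # illustrative flash rate, 1.5 Hz below f0

    # The image appears to cycle at the 1.5 Hz difference.
    print(apparent_frequency(f0, flash))

    # Flashing exactly at f0 lights the same phase every time (frozen image).
    print(apparent_frequency(f0, f0))

    # Each flash samples a slightly later phase of a different cycle.
    print(sampled_phases(f0, flash, 4))
```

Because each image comes from a different cycle, the composite is meaningful only when vibration is reasonably periodic, which is consistent with periodicity being one of the rated parameters.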

Advantages

extensive body of information relative to the effect of pathology on the process of voicing
potential for providing information about the neuromuscular and physiological integrity of the vocal folds and supraglottic structures

Disadvantages

patient discomfort related to the use of FFN or RFOE
image restricted to isolated vowel production when the strobe light is used
highly subjective (Roy et al., 2013)

Videostroboscopy is used to assess the following (Patel et al., 2018):

amplitude of excursion (lateral movement of the vocal fold medial plane)
- symmetrical
- normal/reduced/absent
- each fold can be rated separately as a percentage
vertical level—level difference in the vertical plane between vocal folds during the maximum closed phase of the glottic cycle
- on-plane
- off-plane
periodicity of vocal fold movement
- always/usually/sometimes/never periodic
- segments of the vocal fold that are aperiodic
vocal fold mucosal wave (independent lateral movement of mucosa over the vocal fold)
- normal/diminished/great/symmetrical/absent
glottal closure pattern—glottal configuration at maximum closure
- complete
- incomplete
  - posterior glottal gap
  - anterior glottal gap
  - hourglass
  - incomplete
  - irregular
  - spindle-shaped/bowing
phase closure—relative proportion of the glottal cycle in which the glottis is closed versus open
- open phase
- closed phase
vocal fold appearance
- malposition
- excrescence (abnormal projection/outgrowth)
- edema
- erythema
vocal fold edge appearance
- smooth
- straight
- bowed
- convex
- concave
- irregular
- rough
subglottal appearance
- erythema
- edema
supraglottal behavior
- medial compression
- anterior–posterior compression
- mild/moderate/severe
arytenoid movement
- normal or impaired mobility
  - bilateral
  - unilateral
velopharynx
- contact between the soft palate and the posterior pharyngeal wall as well as lateral pharyngeal wall movement with
  - sustained fricatives such as /s/,
  - syllable repetition,
  - multisyllabic words,
  - phrases with pressure-loaded consonants, and
  - sentence or spontaneous speech
secretions
- amount
- consistency

Interpretation

amplitude asymmetry—mass, compliance, neurogenic difference, scarring, granuloma
function of the velopharynx—degree of closure, context-relevant behaviors
inadequate closure—intervening mass, neurogenic disorder (paralysis), hypofunctional disorder
mucosal wave adynamic segment—cover scarring, intracordal cyst, fibrosis, neurogenic disorder, edema
phase asymmetry—mass, compliance, neurogenic difference
supraglottic compression—hyperfunction, compensatory hyperfunction
voice quality abnormal, larynx normal—behavioral disorder

Roles and Responsibilities

Many clinicians will need to seek training in visualization and imaging through intensive continuing education, pre-service, or in-service training programs after completing the requirements for the ASHA Certificate of Clinical Competence. Education and training may vary for each of these procedures. Training and mentorship should take place in a clinical setting that allows the clinician to work alongside more experienced professionals and with a number and variety of patients. Practitioners must determine whether they have obtained sufficient education and training to be competent to perform vocal tract visualization and imaging. The safety of the patient is paramount when considering any procedure. Please see ASHA’s Vocal Tract Visualization and Imaging: Position Statement and ASHA’s States with Specific Instrumental Assessment Requirements for further information.

Precautions and Risks

Before undertaking these procedures, practitioners consider the following precautions:

Check with state licensure board(s), where appropriate, to determine whether there are limitations on the scope of SLP practice that restrict the performance of these procedures.
Follow universal precautions, including use of personal protective equipment (PPE) as appropriate, to reduce the risk of disease transmission from bloodborne or airborne pathogens.
Have immediate emergency medical assistance available when using topical anesthesia or FFN.
Hold a current Basic Life Support Certificate if performing FFN or using topical anesthesia.
Recommend that the patient remain NPO (nothing by mouth) until the anesthetic wears off.

Practitioners also educate patients on risks associated with imaging, obtain the patient's informed consent, and maintain documentation when performing FFN or when using topical anesthesia. Risks may include the following:

vasovagal response
adverse/allergic reaction to topical anesthesia
nasal irritation

Anatomical Structures, Adult

Figure 2-3. Laryngeal structures—open vocal folds. Adapted from Voice Disorders, Fourth Edition (pp. 1-517) by Sapienza, C., & Hoffman, B. Copyright © 2022 Plural Publishing, Inc. All rights reserved. Used with permission.

Figure 2-4. Laryngeal structures—closed vocal folds. Adapted from Voice Disorders, Fourth Edition (pp. 1-517) by Sapienza, C., & Hoffman, B. Copyright © 2022 Plural Publishing, Inc. All rights reserved. Used with permission.

Aryepiglottic fold—composed of the mucous membrane, not typically used in voice production (Figure 2-4)

Corniculate cartilage—paired cartilaginous structures that sit atop the arytenoid cartilage, not directly implicated in voice production (Figure 2-4)

Cuneiform cartilage—cartilage embedded in the aryepiglottic muscle/fold that serves as a supportive framework for the larynx (Figure 2-3)

Epiglottis—cartilage covered with a mucous membrane, does not serve a function in voice production (Figures 2-3 and 2-4)

Esophageal sphincter—a muscular ring that opens into the esophagus, does not serve a function in typical voice production (Figures 2-3 and 2-4)

Posterior pharyngeal wall—the muscular wall of the posterior pharynx used in swallowing, not used in voice production (Figure 2-4)

Tracheal rings—cartilaginous rings of the trachea, do not serve a function in voice production (Figure 2-3)

True vocal folds—muscularized mucous membranes used for sound production (Figures 2-3 and 2-4)

Ventricular folds—ligaments covered by a mucous membrane that lie superior to the true vocal folds, also called “false vocal folds” (Figure 2-4)

References

Patel, R. R. (2012). Updates on endoscopic laryngeal imaging. Perspectives on Voice and Voice Disorders, 22(2), 64–71. https://doi.org/10.1044/vvd22.2.64

Patel, R. R., Awan, S. N., Barkmeier-Kraemer, J., Courey, M., Deliyski, D., Eadie, T., Paul, D., Švec, J. G., & Hillman, R. (2018). Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association Expert Panel to Develop a Protocol for Instrumental Assessment of Vocal Function. American Journal of Speech-Language Pathology, 27(3), 887–905. https://doi.org/10.1044/2018_AJSLP-17-0009

Roy, N., Barkmeier-Kraemer, J., Eadie, T., Sivasankar, M. P., Mehta, D., Paul, D., & Hillman, R. (2013). Evidence-based clinical voice assessment: A systematic review. American Journal of Speech-Language Pathology, 22(2), 212–226. https://doi.org/10.1044/1058-0360(2012/12-0014)