See the Treatment section of the Voice Disorders evidence map for pertinent scientific evidence, expert opinion, and client/caregiver perspective.
Intervention is conducted to achieve improved voice production and coordination of respiration and laryngeal valving. The ASHA Practice Portal page on head and neck cancer will address intervention aimed at acquisition of alaryngeal speech sufficient to allow for functional oral communication.
Consistent with the WHO (2001) framework, intervention is designed to
- capitalize on strengths and address weaknesses related to underlying structures and functions that affect voice production;
- facilitate the individual's activities and participation by assisting the person in acquiring new communication skills and strategies; and
- modify contextual factors to reduce barriers and enhance facilitators of successful communication and participation, and to provide appropriate accommodations and other supports, as well as training in how to use them.
See the ASHA resource titled Person-Centered Focus on Function: Voice [PDF] for an example of functional goals consistent with ICF.
In the case of medically related voice disorders (e.g., vocal polyps, vocal cysts, spasmodic dysphonia), SLPs often team with otolaryngologists and other medical professionals (e.g., pulmonologists, gastroenterologists, neurologists, allergists, endocrinologists, and occupational medicine physicians) and, if appropriate, develop treatment plans to support the medical plan and to optimize outcomes.
Some individuals develop voice disorders in the absence of structural pathology (e.g., functional aphonia, muscle tension dysphonia, and mutational/functional falsetto) and may benefit from support in addition to what can be provided by the SLP. Counseling, direct manipulation of the voice, and interview questions can be used to probe possible factors contributing to the voice problem. SLPs refer the individual to appropriate health care professionals (e.g., psychologists) to address issues outside the SLP's scope of practice (ASHA, 2016b). SLPs often engage in collaborative approaches throughout the course of assessment and subsequent treatment.
See the ASHA resources on collaboration and teaming and interprofessional education/interprofessional practice (IPE/IPP).
Norms within different settings are considered when determining vocal needs and establishing goals. For example, vocal norms and needs within the workplace may be different from those within the community (e.g., home and social settings).
SLPs often incorporate aspects of more than one therapeutic approach in developing a treatment plan.
Approaches can be direct or indirect.
- Direct approaches focus on manipulating the voice-producing mechanisms (e.g., phonation, respiration, and musculoskeletal function) in order to modify vocal behaviors and establish healthy voice production (Colton & Casper, 1996; Stemple, 2000).
- Indirect approaches modify the cognitive, behavioral, psychological, and physical environments in which voicing occurs (Roy et al., 2001; Thomas & Stemple, 2007). Indirect approaches include the following two components:
- Patient education—discussing normal physiology of voice production and the impact of voice disorders on function; providing information about the impact of vocal misuse and strategies for maintaining vocal health (vocal hygiene)
- Counseling—identifying and implementing strategies such as stress management to modify psychosocial factors that negatively affect vocal health (Van Stan, Roy, Awan, Stemple, & Hillman, 2015)
A therapeutic plan typically involves the use of at least one of the direct approaches and one or more of the indirect approaches based on the patient's condition and goals. Some clinicians concentrate on directly modifying the specific symptoms of the inappropriate voice, whereas others take a more holistic approach, with the goal of balancing the physiologic subsystems of voice production—respiration, phonation, and resonance.
Many clinicians begin by
- identifying behaviors that are contributing to the voice problems, including unhealthy vocal hygiene practices (e.g., shouting, talking loudly over noise, coughing, throat clearing, and poor hydration) and
- implementing healthy vocal hygiene practices (e.g., drinking plenty of water and talking at a moderate volume) and practices to reduce vocally traumatic behaviors (e.g., voice conservation).
The following subsections offer brief descriptions of general and specific treatments for individuals with voice disorders. They are organized under two broad categories: physiologic voice therapy (i.e., those treatments that directly modify the physiology of the vocal mechanism) and symptomatic voice therapy (i.e., those treatments aimed at modifying deviant vocal symptoms or perceptual voice components using a variety of facilitating techniques). This list of treatment options is not exhaustive, and the inclusion of any specific treatment approach does not imply endorsement by ASHA. For more information about treatment approaches and their use with various voice disorders, see Stemple et al. (2010).
Treatment selection depends on the type and severity of the disorder and the communication needs of the individual. Clinicians are sensitive to cultural, linguistic, and individual variables when selecting appropriate treatment approaches. As indicated in the Code of Ethics (ASHA, 2016a), SLPs who serve this population should be specifically educated and appropriately trained to do so.
Physiologic Voice Therapy
Physiologic voice therapy is inherently a holistic approach to treatment. Physiologic voice therapy programs strive to balance the three subsystems of voice production (respiration, phonation, and resonance) as opposed to working directly on isolated voice symptoms. Most physiologic approaches may be used with a variety of disorders that result in hyper- and hypofunctional vocal patterns. Below are some of the physiologic voice therapy programs, arranged in alphabetical order.
Accent Method
The accent method is designed to increase pulmonary output, improve glottic efficiency, reduce excessive muscular tension, and normalize the vibratory pattern during phonation. During therapy, the clinician may do one or more of the following tasks:
- Facilitate abdominal breathing by initially placing the patient in a recumbent position.
- Use rhythmic vocal play with models of accented phonation patterns, which the patient then imitates.
- Transfer rhythms to articulated speech, initially with a clinician model and eventually progressing through reading, monologues, and conversational speech.
(See, e.g., Kotby, Shiromoto, & Hirano, 1993; Malki, Nasser, Hassan, & Farahat, 2008.)
Conversation Training Therapy (CTT)
Conversation Training Therapy (CTT) focuses exclusively on voice awareness and production in patient-driven conversational narrative, without the use of a traditional therapeutic hierarchy. Grounded in the tenets of motor learning, CTT strives to guide patients in achieving balanced phonation through clinician reinforcement, imitation, and modeling in conversational speech. CTT incorporates six interchangeable components: (1) clear speech; (2) auditory and kinesthetic awareness; (3) negative practice/labeling; (4) embedding basic training gestures into speech; (5) prosody, projection, and pauses; and (6) rapport building (Gartner-Schmidt et al., 2016; Gillespie et al., 2019).
Cup Bubble/Lax Vox
Cup bubble, also known as Lax Vox, is an aerodynamic building task aimed at improving ability to sustain phonation while speaking. It is done by having a patient blow air initially into a cup of water without voice. Voicing can be added for subsequent trials, and in time, pitch can be altered across and within trials. Eventually, the cup is removed during voicing, and the phonation continues. These exercises are thought to widen the vocal tract during phonation and reduce tension in the vocal folds. Biofeedback increases the individual's awareness of his or her healthy voice production (e.g., Denizoglu & Sihvo, 2010; Simberg & Laine, 2007).
Expiratory Muscle Strength Training (EMST)
Expiratory muscle strength training (EMST) improves respiratory strength during phonation. Increase in maximum expiratory pressure (MEP) can be trained with specific calibrated exercises over time, thus improving the relationship between respiration, phonation, and resonance. EMST uses an external device to mechanically overload the expiratory muscles. The device has a one-way, spring-loaded valve that blocks the flow of air until the targeted expiratory pressure is produced. The device can be calibrated to increase or decrease physiologic load on the targeted muscles (Pitts et al., 2009).
Lee Silverman Voice Treatment (LSVT®)
Lee Silverman Voice Treatment (LSVT®; Ramig, Bonitati, Lemke, & Horii, 1994) was initially developed for patients with Parkinson disease but can also be used with other populations. It is designed to help maximize phonatory and respiratory function using a set of simple tasks. Individuals are instructed to produce a loud voice with maximum effort and to monitor the loudness of their voices while speaking. The effort that is involved generates improved respiratory support, laryngeal muscle activity, articulation, and even facial expression and animation. A sound-level meter provides visual biofeedback to demonstrate the effort necessary to increase loudness. LSVT is provided by clinicians who are specifically trained and certified in the administration of this technique.
Five basic principles are followed in LSVT:
- Individuals should "think loud/think shout."
- Speech effort must be high.
- Treatment must be intensive.
- Patients must recalibrate their loudness level.
- Improvements are quantified over time.
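As a rough illustration of how loudness change might be quantified over time, the sketch below computes a relative level in decibels from audio samples and compares a later recording with a baseline. It is not part of the LSVT protocol. The function names and sample values are hypothetical, and the result is an uncalibrated relative level, not the dB SPL reading that a calibrated sound-level meter would report.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples, reference=1.0):
    """Level in dB relative to `reference` (assumption: uncalibrated,
    so this is NOT dB SPL)."""
    amplitude = rms(samples)
    if amplitude <= 0:
        raise ValueError("signal is silent; level is undefined")
    return 20 * math.log10(amplitude / reference)


def loudness_gain_db(baseline_samples, current_samples):
    """Change in level between a baseline recording and a later one."""
    return level_db(current_samples) - level_db(baseline_samples)


# Hypothetical recordings: the later one has twice the amplitude.
baseline = [0.1, -0.1, 0.1, -0.1]
later = [0.2, -0.2, 0.2, -0.2]
print(round(loudness_gain_db(baseline, later), 1))
```

Doubling the signal amplitude corresponds to a gain of about 6 dB, which is why a relative measure of this kind can track change across sessions even without absolute calibration, provided the recording conditions stay the same.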
Manual Circumlaryngeal Techniques
Manual circumlaryngeal techniques are intended to reduce musculoskeletal tension and hyperfunction by re-posturing the larynx during phonation. There are three main manual laryngeal re-posturing techniques:
- Push-back maneuver—place forefinger on thyroid cartilage and push back to change shape of glottis.
- Pull-down maneuver—place thumb and forefinger in the thyrohyoid space and pull the larynx downward.
- Medial compression and downward traction—place thumb and forefinger in the thyrohyoid space, and apply medial compression.
Applying these maneuvers during vocalization allows the individual to hear resulting changes in voice quality (Andrews, 2006; Roy, Bless, Heisey, & Ford, 1997). Care is taken when employing these techniques, as some patients report discomfort.
Phonation Resistance Training Exercise (PhoRTE)
Phonation Resistance Training Exercise (PhoRTE; Ziegler & Hapner, 2013) was adapted from LSVT and consists of four exercises:
- Producing /a/ with loud maximum sustained phonation
- Producing /a/ with loud ascending and descending pitch glides over the entire pitch range
- Producing functional phrases using a loud and high (pitched) voice
- Producing the same functional phrases using loud and low (pitched) voice
Individuals are reminded to maintain a "strong" voice throughout these treatment exercises. PhoRTE has a less intensive intervention schedule than LSVT. PhoRTE also differs in that it combines both loudness and pitch when producing phrases (i.e., loud and low pitch; loud and high pitch). Use of PhoRTE has been studied in adults with presbyphonia (aging voice) as a way to improve vocal outcomes (e.g., decrease phonatory effort) and increase voice-related quality of life (Ziegler, Verdolini Abbott, Johns, Klein, & Hapner, 2014).
Resonant Voice Therapy
Resonant voice is defined as voice production involving oral vibratory sensations, usually on the anterior alveolar ridge or lips or higher in the face in the context of easy phonation. Resonant voice therapy uses a continuum of oral sensations and easy phonation, building from basic speech gestures through conversational speech. The goal is to achieve the strongest, "cleanest" possible voice with the least effort and impact between the vocal folds to minimize the likelihood of injury and maximize the likelihood of vocal health (Stemple et al., 2010). The program incorporates humming and both voiced and voiceless productions that are shaped into phrase and conversational productions (Verdolini, 1998, 2000).
Stretch and Flow Phonation
Stretch and flow phonation (also known as Casper-Stone Flow Phonation) is a physiological technique used to treat functional dysphonia or aphonia (Stone & Casteel, 1982). It focuses on airflow management and is used for individuals with breath-holding tendencies. Individuals are instructed to focus on a steady outflow of air during exhalation. Various biofeedback methods are used, including placing a piece of tissue in front of the mouth or holding one's hand in front of the mouth to monitor airflow. Voicing is introduced once the individual masters continuous airflow during exhalation. At this stage, the technique produces a breathy voice quality. Eventually, this voice quality is carried into trials with spoken words and phrases, and the breathiness is gradually reduced.
Flow Phonation (Gartner-Schmidt, 2008, 2010) is a hierarchical therapy program designed to facilitate increased airflow, ease of phonation, and forward oral resonance. It was modified from Stretch and Flow Phonation by eliminating the “stretch” component, which reduced the rate of speech in the original therapy.
Vocal Function Exercises (VFEs)
Vocal function exercises (VFEs) are a series of systematic voice manipulations designed to facilitate return to healthy voice function by strengthening and coordinating laryngeal musculature and improving efficiency of the relationship among airflow, vocal fold vibration, and supraglottic treatment of phonation (Stemple, 1984). Sounds used in training are specific, and correct production is encouraged. VFEs consist of four exercises—warm-up, stretching, contracting, and power exercises. Exercises are completed twice a day (morning and evening) in sets of two. Maximum phonation time goals are set on the basis of individual lung capacity and an airflow rate of 80 ml/sec. Individuals are advised to use a soft, engaged tone and are trained to use a semi-occluded vocal tract (lip buzz) without tension during voice productions.
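The maximum phonation time goal described above is a simple division of air volume by airflow rate. The sketch below works through that arithmetic. The function name and the patient value are hypothetical, and which lung-volume measure to use is an assumption here, because the text says only "individual lung capacity."

```python
def mpt_goal_seconds(volume_ml, airflow_ml_per_s=80.0):
    """Target maximum phonation time in seconds.

    volume_ml: the individual's available air volume in mL (assumption:
        the specific lung-volume measure is chosen by the clinician).
    airflow_ml_per_s: target airflow rate; 80 mL/s is the rate cited above.
    """
    if volume_ml <= 0 or airflow_ml_per_s <= 0:
        raise ValueError("volume and airflow rate must be positive")
    return volume_ml / airflow_ml_per_s


# Hypothetical patient with 3,200 mL of available volume.
print(mpt_goal_seconds(3200))  # 40.0
```

Under these assumptions, a volume of 3,200 mL gives a goal of 40 seconds, and a smaller volume gives a proportionally shorter goal.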
Symptomatic Voice Therapy
The focus of symptomatic voice therapy is on the modification of the deviant vocal symptoms or perceptual voice components. Deviant symptoms may include pitch that is too high or low, voice that is too soft or loud, breathy phonation, or the use of hard glottal attacks or glottal fry. Symptomatic voice therapy assumes voice improvement through direct symptom modification using a variety of voice facilitating techniques (Boone et al., 2010) that are either direct or indirect.
Amplification devices such as microphones can be used to amplify the voice in any situation that requires increased volume (e.g., when speaking to large groups, or during conversation when the individual's voice is weak). As such, voice amplification can function as a supportive tool or as a means of augmentative communication. It can help prevent vocal hyperfunction as a result of talking at increased volume or for extended periods of time.
Auditory masking is used in cases of functional aphonia/dysphonia and often results in changed or normal phonation. Individuals are instructed to talk or read passages aloud while wearing headphones with masking noise input. Against a loud noise background, the individual often produces voice at increased volume (Lombard effect) that can be recorded and used later in treatment as a comparison (e.g., Brumm & Zollinger, 2011; Adams & Lang, 1992).
The basis of biofeedback is that self-control of physiologic functions is possible with continuous, immediate information about internal bodily state. Biofeedback provides clear and reliable feedback in response to alterations in voice production, thus facilitating improvements in pitch, loudness, quality, and effort. It can be kinesthetic, auditory, or visual. Using biofeedback, individuals are trained to become aware of physical sensations with respect to respiration, body position, and vibratory sensation. Awareness helps the individual understand his or her physiological processes when generating voice. Auditory feedback, such as real-time amplification or auditory modeling, is an effective way to achieve voice improvement.
Chant speech is characterized by a rhythmic, prosodic pattern that serves as a template for spoken utterances. It is used in therapy to help reduce phonatory effort that results in vocal fatigue and decrease in phonatory capabilities. Chant speech requires pitch fluctuations and coordination among respiratory, phonatory, and resonance subsystems. Speakers habituate to these more efficient vocal patterns. The increased lung pressure required for these tasks may also decrease reliance on laryngeal resistance and reduce fatigue (e.g., McCabe & Titze, 2002).
Confidential voice is designed to reduce laryngeal tension/hyperfunction and increase air flow (Casper, 2000). The individual begins with an easy and breathy vocal quality and builds to normal voicing without decreasing airflow. This technique is intended to address excessive vocal tension and to facilitate relaxation in the muscles of the larynx.
Glottal fry is useful for patients with vocal nodules and other problems associated with hyperfunction (e.g., polyps, functional dysphonia, spasmodic dysphonia, vocal fold thickening, and ventricular phonation). Because the vocal folds must be relaxed in order to produce glottal fry, this technique can be a useful index of vocal fold relaxation (Boone et al., 2010). Although glottal fry is a powerful facilitative technique to offload tension in the larynx, it is not a long-term speech quality target.
Inhalation phonation is a technique used to facilitate true vocal fold vibration in the presence of habitual ventricular fold phonation, functional aphonia, and muscle tension dysphonia. Individuals produce a high-pitched voice on inhalation. During inhalation voicing, the true vocal folds are stretched, suddenly adducted, and set into vibration. Upon exhalation, patients try to achieve a nearly matched voice. This approach eases the way to gaining true vocal fold vibration.
Semi-Occluded Vocal Tract (SOVT) Exercises
Semi-occluded vocal tract (SOVT) exercises in voice therapy involve narrowing at any supraglottic point along the vocal tract in order to maximize interaction between vocal fold vibration (sound production) and the vocal tract (the sound filter) and to produce resonant voice.
Straw phonation is one of the most frequently used methods to create semi-occlusion in the vocal tract (Titze, 2006). Narrowing the vocal tract increases air pressure above the vocal folds, keeping them slightly separated during phonation and reducing the impact collision force. To accomplish this, the individual semi-occludes the vocal tract by phonating through a straw or tube. Resistance can be manipulated by varying the length and diameter of the straw. Individuals practice sustaining vowels, performing pitch glides, humming songs, and transitioning to the intonation and stress patterns of speech. Eventually, use of the straw is reduced and eliminated.
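To give a feel for why straw length and diameter change resistance, the sketch below applies the Hagen-Poiseuille relation for steady laminar flow through a rigid tube, R = 8μL / (πr⁴). This is an idealization introduced here for illustration and is not drawn from the cited sources. Airflow during straw phonation may not be purely laminar, and the viscosity value and straw dimensions are assumed.

```python
import math

# Assumption: approximate dynamic viscosity of air at room temperature, in Pa*s.
AIR_VISCOSITY = 1.8e-5


def tube_flow_resistance(length_m, diameter_m, viscosity=AIR_VISCOSITY):
    """Laminar flow resistance of a rigid tube, in Pa*s/m^3.

    Hagen-Poiseuille: R = 8 * mu * L / (pi * r**4).
    Illustrative idealization only, not a clinical model.
    """
    if length_m <= 0 or diameter_m <= 0:
        raise ValueError("length and diameter must be positive")
    radius = diameter_m / 2
    return 8 * viscosity * length_m / (math.pi * radius ** 4)


# Hypothetical straws: same length, one with half the diameter of the other.
wide = tube_flow_resistance(length_m=0.20, diameter_m=0.006)
narrow = tube_flow_resistance(length_m=0.20, diameter_m=0.003)
longer = tube_flow_resistance(length_m=0.40, diameter_m=0.006)

print(narrow / wide)  # ratio for halving the diameter
print(longer / wide)  # ratio for doubling the length
```

Under this idealization, resistance scales linearly with length but with the inverse fourth power of the radius, so a change in diameter has a much larger effect than the same proportional change in length.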
Semi-occlusion at the level of the lips is accomplished via lip trills. This technique involves a smooth movement of air through the oral cavity and over the lips, causing a vibration (lip buzz), similar to blowing bubbles underwater. Often, the trills are paired with phonation and pitch changes. The focus is to improve breath support and produce voicing without tension.
Posture
The patient is instructed to sit with upright posture and with the shoulders in a low, relaxed position to facilitate voice production with less effort. Collaboration with a physical therapist or occupational therapist may be necessary with some patients.
Relaxation Techniques
In cases of vocal hyperfunction, a variety of relaxation techniques may be useful as a tool to reduce both whole-body and laryngeal area tension. The goal of these techniques is to reduce effortful phonation. Frequently used techniques include progressive muscle relaxation (slowly tensing and then relaxing successive muscle groups), visualization (forming mental images of a peaceful, calming place or situation), and deep breathing exercises.
Twang Therapy
Twang therapy is used for individuals with hypophonic voice. It involves the narrowing of the aryepiglottic sphincter using a "twang" voice to create a high-intensity voice quality while maintaining low vocal effort (Lombard & Steinhauer, 2007). The desired outcome is decreasing phonatory effort and increasing vocal efficiency.
Yawn-Sigh
The yawn-sigh technique uses the natural functions of yawning and sighing to overcome symptoms of vocal hyperfunction (e.g., elevated larynx and vocal constriction). The technique is intended to lower the position of the larynx and subsequently widen the supraglottal space in order to produce a more relaxed voice and encourage a more natural pitch.
Refer to the Service Delivery section of the Voice Disorders evidence map for pertinent scientific evidence, expert opinion, and client/caregiver perspective.
In addition to determining the type of speech and language treatment that is optimal for individuals with voice disorders, SLPs consider other service delivery variables (including format, provider, dosage, timing, and setting) that may affect treatment outcomes.
- Format — the structure of the treatment session (e.g., group vs. individual; direct and/or consultative).
- Provider — the person offering the treatment (e.g., SLP, trained volunteer, caregiver).
- Dosage — the frequency, intensity, and duration of service. Clinicians consider the unique needs of each patient and the nature of the voice disorder in determining appropriate dosage for therapy. Some voice therapy programs will have specific dosage parameters. See De Bodt, Patteeuw, & Versele (2015) for a summary of international practices regarding temporal variables (dosage and frequency) in voice therapy.
- Timing — when intervention is conducted relative to the diagnosis.
- Setting — location of treatment (e.g., home, community based, work).