Knowledge of the cerebral organization of speech production is slowly emerging. More data are available for other domains of the human sensorimotor system, such as locomotion or upper limb movements. The lack of a homologous animal model presents one difficulty in compiling research. Another restriction arises in tracking movement excursions and making electromyographic measurements at the vocal-tract level.
Analyses of the brain mechanisms subserving articulatory and phonatory functions mainly rely on perceptual and acoustic analyses of dysarthric deficits in patients with focal cerebral lesions or neurodegenerative disorders such as Parkinson's disease or cerebellar atrophy. However, these analyses do not provide clear inferences on the neural mechanisms underlying motor aspects of spoken language. As a more recent alternative, functional imaging techniques such as positron emission tomography (PET) or functional magnetic resonance imaging (fMRI) can be used to evaluate brain activity associated with speech production.
Blood-Flow Changes and Brain Imaging
Animal experimentation in the late 19th century first suggested an "automatic" increase of regional cerebral blood flow in response to local variations of neural activity (neurovascular coupling). Thus, registration of hemodynamic (blood-flow) changes should identify cerebral structures engaged in distinct sensorimotor or cognitive tasks.
PET and fMRI currently represent the two most important brain-imaging techniques based on neurovascular coupling mechanisms. PET makes use of the unique radioactive decay characteristics of positrons, the positively charged particles given off by the nuclei of unstable atoms such as 15O. Following the injection of a small amount of 15O-labeled water, the radioactive substance accumulates at the level of the cerebral cortex in direct proportion to local blood flow. A corona of radiation detectors enclosing the subject's head counts the "annihilation events" during PET scanning. These data then allow for the calculation of a map of hemodynamic changes and, thus, local neural activity.
In contrast to PET investigations, fMRI represents a non-invasive procedure, based on the detection of endogenous tissue contrasts (blood-oxygen-level-dependent [BOLD] effect), and provides superior spatial resolution similar to that of anatomical MR imaging. However, the articulatory movements produced when a subject speaks may create motion-induced BOLD signal changes that can obscure the effects of experimental tasks.
Speech Motor Control
The first systematic account of the cerebral circuitry underlying speech motor control emerged as a by-product of a PET investigation of lexical aspects of single-word processing (see Ackermann & Ziegler, in press, for a review). It was assumed that subtracting the hemodynamic responses to passive viewing of, or passive listening to, words from the regional cerebral blood-flow effects associated with repetition of these items aloud should isolate the brain areas related to motor aspects of speech production. Besides activation of the supplementary motor area (SMA) located within the medial wall of the frontal lobes, bilateral responses of sensorimotor cortex and anterior-superior portions of the cerebellum could be noted.
Unexpectedly, an activation spot "buried" in the depth of the lateral sulcus emerged. By contrast, neither Broca's area nor the basal ganglia showed any significant hemodynamic effects. A subsequent PET study based upon the repetition of auditorily applied nouns (versus stimulus anticipation) assigned the intrasylvian response to rostral parts of the intrasylvian cortex, i.e., the anterior insula.
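The subtraction logic behind such designs can be sketched in a few lines of Python. This is only a minimal illustration: the regional blood-flow values are invented (they are not data from any study), the fixed threshold stands in for the statistical testing a real analysis would use, and the names `repeat_aloud`, `passive_listening`, and `subtract_conditions` are hypothetical.

```python
# Hypothetical regional cerebral blood-flow values (arbitrary units).
# Illustrative numbers only; they are NOT taken from any study.
repeat_aloud = {
    "SMA": 58.0,
    "sensorimotor cortex": 64.0,
    "auditory cortex": 61.0,
    "Broca's area": 50.5,
}
passive_listening = {
    "SMA": 50.0,
    "sensorimotor cortex": 52.0,
    "auditory cortex": 60.0,
    "Broca's area": 50.0,
}


def subtract_conditions(task, control, threshold=3.0):
    """Return regions whose task-minus-control difference exceeds the threshold.

    The threshold is an assumed stand-in for a proper statistical test.
    """
    differences = {region: task[region] - control[region] for region in task}
    return {region: diff for region, diff in differences.items() if diff > threshold}


# Regions that respond more strongly to speaking than to listening alone.
motor_related = subtract_conditions(repeat_aloud, passive_listening)
print(motor_related)
```

In this toy example, activity shared by both conditions (here, the auditory response) cancels out, and only regions with additional activity during overt repetition remain.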
A recent fMRI investigation tried to further elucidate the contribution of the anterior insula to speech motor control (Shuster & Lemieux, 2005). Besides damage to Broca's area, lesions of the anterior insula may give rise to apraxia of speech. This syndrome is assumed to reflect disrupted generation of a motor program (phonetic plan), providing the input to the motor execution system of speech production. Because the demands on "motor planning" must be expected to increase with utterance length, hemodynamic activation of the insula should correlate with this parameter of spoken language.
Indeed, overt (versus silent) repetition of mono- and multisyllabic nouns was found to be associated with intrasylvian activation spots. The longer items yielded significantly enhanced blood flow responses of the inferior parietal lobule, the precentral gyrus, and posterior parts of the inferior frontal convolution (Broca's area) of the left hemisphere, but a comparable effect in intrasylvian structures did not emerge. Thus, Broca's area and the insular cortex in the depth of the lateral sulcus seem to cooperate during speech production, but these structures might subserve different control mechanisms.
Phonotactic rules, such as those of the German or English language, allow for a variety of syllable onset structures (V, CV, CCV). Consonant clusters must be expected to pose higher demands on articulatory/phonetic control mechanisms as compared to CV units. As a probe of motor aspects of speech production, a subsequent fMRI study used four tri-syllabic items ("ta-ta-ta" / "ka-ru-ti" / "stra-stra-stra" / "kla-stri-splu"), systematically varied in sequence complexity (the same syllable three times versus three different syllables in a row) and syllabic complexity (CV versus CCCV onset) (Bohland & Guenther, 2006). Because the experimental design included both GO and NOGO trials, overt task performance was contrasted with a state of being prepared to produce the same items.
The study found the cerebral correlates of speech motor control to encompass the post- and precentral gyrus, encroaching upon the posterior parts of the inferior frontal convolution, the anterior insula, supplementary motor area, the basal ganglia, thalamic areas, and the superior cerebellar hemispheres. As expected, an increase of stimulus complexity in either dimension yielded enhanced activation of at least some components of this "basic speech network." Remarkably, the most rostral aspects of the anterior insula showed a strong interaction of syllable and sequence complexity under the GO and the NOGO conditions. These observations support the assumption that at least a part of the intrasylvian cortex participates in "higher-order" aspects of speech motor control.
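The 2 x 2 structure of this design, and what an "interaction" of syllable and sequence complexity means, can be made concrete with a short Python sketch. The four items are those quoted above; the response values are invented for illustration only, and the function name `factorial_effects` is hypothetical.

```python
# The four items of the 2 x 2 design, keyed by
# (syllable complexity, sequence complexity).
items = {
    ("simple", "simple"): "ta-ta-ta",
    ("simple", "complex"): "ka-ru-ti",
    ("complex", "simple"): "stra-stra-stra",
    ("complex", "complex"): "kla-stri-splu",
}

# Hypothetical mean responses per cell (arbitrary units).
# Illustrative numbers only; they are NOT data from the study.
response = {
    ("simple", "simple"): 1.0,
    ("simple", "complex"): 1.5,
    ("complex", "simple"): 1.5,
    ("complex", "complex"): 3.0,
}


def factorial_effects(cells):
    """Return (syllable main effect, sequence main effect, interaction)."""
    ss = cells[("simple", "simple")]
    sc = cells[("simple", "complex")]
    cs = cells[("complex", "simple")]
    cc = cells[("complex", "complex")]
    # Main effects: average difference between the levels of one factor.
    syllable_effect = ((cs + cc) - (ss + sc)) / 2
    sequence_effect = ((sc + cc) - (ss + cs)) / 2
    # Interaction: does the sequence effect differ between syllable types?
    interaction = (cc - cs) - (sc - ss)
    return syllable_effect, sequence_effect, interaction


syllable_effect, sequence_effect, interaction = factorial_effects(response)
print(syllable_effect, sequence_effect, interaction)
```

A non-zero interaction term indicates that the two kinds of complexity do not simply add up: in the invented numbers, sequence complexity costs more when the syllables themselves are complex.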
Damage to the motor cortex (including the respective corticobulbar tracts), the basal ganglia, or the cerebellum may give rise to dysarthria. This network of brain structures (central-motor system), therefore, is assumed to specify vocal-tract movements during speech production (motor execution). As expected, functional imaging studies revealed blood-flow activation of the central-motor system during speech production.
Consistently, spoken language elicited regional cerebral blood flow responses of SMA, located within the medial walls of the frontal lobes. Patients suffering from left-sided lesions of this area may exhibit reduced spontaneous verbal behavior, in the absence of any motor deficits of the vocal tract muscles and any deterioration of language functions (see Ackermann & Ziegler, in press, for a review). As a consequence, the medial wall of the frontal lobes appears to operate as a "starting mechanism of speech."
Speech Motor Deficits
Syllable repetitions performed as fast as possible (oral diadochokinesis) represent a sensitive and specific overall measure of dysarthric deficits. For example, reduced diadochokinesis rates can be observed in patients with spastic or ataxic dysarthria. By contrast, subjects suffering from Parkinson's disease show a largely unimpaired speech tempo or may even exhibit "speech hastening," i.e., involuntary acceleration of verbal utterances.
Functional imaging of syllable trains can be expected to provide further insights into the pathomechanisms of compromised speech rate control. We measured hemodynamic brain activation using fMRI during syllable repetitions at different rates (2.0, 2.5, 3.0, 4.0, 5.0, 6.0 Hz), synchronized either to an auditorily applied pacing signal or produced in a self-paced manner (Ackermann & Ziegler, in press). Significant hemodynamic main effects, calculated across all repetition frequencies (versus passive listening to the acoustic pacing signals), emerged within SMA, precentral areas, Broca's region, anterior insula, thalamus, basal ganglia, and cerebellum. Ventral premotor and intrasylvian cortex as well as the caudate nucleus showed lateralized responses in favor of the left side whereas the other components displayed a rather bilateral activation pattern (Figure 1, p. 11).
Because damage to these cerebral structures compromises verbal behavior and gives rise to dysarthria, apraxia of speech, or transcortical motor aphasia (lesions of SMA), the obtained functional imaging data are in good accord with clinical data. The second step of signal analysis was calculation of hemodynamic rate/response functions. SMA, sensorimotor cortex, anterior insula, and the cerebellar activation spots showed a positive linear rate/response relationship, i.e., the BOLD signal increased in parallel with syllable repetition rate.
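A linear rate/response function of this kind amounts to fitting a straight line to the BOLD signal as a function of syllable rate. The sketch below uses ordinary least squares in plain Python; the six rates are those of the experiment, whereas the signal values are invented for illustration and the function name `linear_fit` is hypothetical. It is not the analysis pipeline used in the study.

```python
# Syllable repetition rates used in the experiment (Hz).
rates = [2.0, 2.5, 3.0, 4.0, 5.0, 6.0]

# Hypothetical BOLD signal change at each rate (arbitrary units).
# Illustrative numbers only; they are NOT measured data.
bold_signal = [0.5, 0.6, 0.7, 0.9, 1.1, 1.3]


def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variance terms (unnormalized; the factor n cancels).
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = linear_fit(rates, bold_signal)
# A positive slope corresponds to a positive rate/response relationship.
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

With these assumed values the fitted slope is positive, which is the pattern described for SMA, sensorimotor cortex, anterior insula, and cerebellum.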
A variety of functional imaging studies revealed "mass activation effects," i.e., a parallel increase of hemodynamic responses and motor demands, within the cortical hand representation area and SMA during finger-tapping tasks and joystick movements. These data seem to reflect a close relationship between neuronal activity and movement velocity as documented, for example, by single-cell recordings in monkey motor cortex. Quite conceivably, the same mechanisms are engaged in oral diadochokinesis tasks. Impaired mass activation effects, thus, provide a basis for the explanation of reduced maximum syllable repetition rates in dysarthric patients.
In accordance with previous investigations based upon silent (covert) syllable repetitions, the cerebellar activation spots on either side showed a step-wise increase of the BOLD signal between 3 and 4 Hz. A series of acoustic studies by our group found that syllable rate does not fall below a value of 3 Hz in patients with ataxic dysarthria during oral diadochokinesis and sentence production tasks (see Ackermann et al., in press). Taken together, these clinical and functional imaging data indicate that the cerebellum "pushes" speaking rate beyond a level of about 3 Hz.
By contrast, parametric signal analysis revealed a negative linear relationship between syllable rate and hemodynamic response within the basal ganglia (putamen/pallidum and caudate nucleus).
In line with these data, recent PET studies documented an inverse relationship between a volume-mean normalized measure of regional cerebral blood flow and maximum syllable frequency at the level of right caudate nucleus in normal speakers (e.g., Sidtis et al., 2003). Thus, the basal ganglia seem to be characterized by a decline of hemodynamic activation in response to an increase of motor demands, at least during repetitive movements. Conceivably, the observed negative rate/response profiles reflect a more efficient organization of higher-frequency movements at the level of the basal ganglia. These suggestions could explain why patients suffering from Parkinson's disease show normal syllable rates during oral diadochokinesis tasks and during production of sentence utterances, in contrast to most other central-motor disorders, and may even exhibit a "hastening" phenomenon, i.e., involuntary acceleration of speech tempo.
Functional imaging techniques found hemodynamic activation of SMA, ventral premotor and intrasylvian areas, primary sensorimotor cortex, and subcortical central-motor structures (basal ganglia, thalamus, cerebellum) during the production/repetition of lexical and non-lexical mono- or polysyllabic items ("minimal cerebral network of overt speech production").
These brain components seem to be organized into at least three functional systems:
- Starting mechanisms of speech production, i.e., initiating and maintaining an ongoing and fluent verbal stream, depend upon the SMA of the language-dominant hemisphere.
- Premotor components of the precentral and inferior frontal gyrus (Broca's area) of the left hemisphere, presumably including parts of the anterior insula, participate in the construction of the phonetic make-up of an utterance prior to innervation of vocal tract musculature.
- Motor execution, i.e., the on-line innervation of respiratory, laryngeal, and supralaryngeal muscles during speech production, is bound to the corticobulbar system as well as the cortico-subcortical motor loops traversing the basal ganglia and the cerebellar hemispheres.
These imaging techniques now begin to provide new insights into the pathomechanisms of dysarthric deficits such as abnormalities of speaking rate in patients suffering from cerebellar disorders and Parkinson's disease.