Libo Geng, Xinyu Zhao, Qihui Xu, Haiyan Wu, Xueping Hu, Zhiyuan Liu, Lili Ming, Zixuan Xue, Chenyi Yue, Yiming Yang, Cognitive and neural mechanisms of voluntary versus forced language switching in Chinese–English bilinguals: an fMRI study, Cerebral Cortex, Volume 34, Issue 2, February 2024, bhae042, https://doi.org/10.1093/cercor/bhae042
Abstract
The ecological validity of bilingual code-switching has garnered increasing attention in recent years. Contrary to traditional studies that have focused on forced language switching, emerging theories posit that voluntary switching may not incur such a cost. To test these claims and understand differences between forced and voluntary switching, the present study conducted a systematic comparison through both behavioral and neural perspectives. Utilizing fMRI alongside picture-naming tasks, our findings diverge from prior work. Voluntary language switching not only demonstrated switching costs at the behavioral level but also significantly activated brain regions associated with inhibitory control. Direct comparisons of voluntary and forced language switching revealed no significant behavioral differences in switching costs, and both shared several common brain regions that were activated. On the other hand, a nuanced difference between the two types of language switching was revealed by whole-brain analysis: voluntary switching engaged fewer language control regions than forced switching. These findings offer a comprehensive view of the neural and behavioral dynamics involved in bilingual language switching, challenging prior claims that voluntary switching imposes no behavioral or neural costs, and thus providing behavioral and neuroimaging evidence for the involvement of inhibitory control in voluntary language switching.
Introduction
Bilingual individuals often need to switch between two languages. Previous studies, under strict laboratory control conditions, have gained a clearer understanding of how the brains of bilinguals control these two languages. However, a common observation is that bilinguals frequently switch languages voluntarily and at will in their daily lives. How do bilingual brains function during this more ecologically valid form of language switching?
The adaptive control hypothesis, proposed by Green and Abutalebi (2013), suggests that bilinguals’ language selection and switching are influenced by the specific interactive context. In the dual-language context, where bilingual individuals and their conversational counterparts possess only one common language, the act of language selection hinges on the language mastered by the interlocutors. Consequently, bilinguals are compelled to effectuate language switches based on external cues, such as the face of the interlocutor—a phenomenon often termed as “forced language switching” (Jevtović et al. 2019; Jiao et al. 2022). Conversely, in a dense code-switching context, bilinguals share mastery of two or more languages with their conversational partners. In such settings, both parties have unrestricted access to their entire linguistic repertoire, which enables seamless language switching at will. This mode of language switching, where bilingual individuals exercise language switches according to their personal preferences, is termed “voluntary language switching” (Gollan et al. 2014; Blanco-Elorrieta and Pylkkänen 2017; Jevtović et al. 2019; Zhu et al. 2022).
Behavioral investigations centering on forced language switching consistently reveal switching costs, which are interpreted as evidence of inhibitory control involved in mitigating interference from nontarget languages (Green 1998; Costa and Santesteban 2004; Abutalebi and Green 2007). The most common paradigm in forced language switching research is the cue-switching paradigm (Meuter and Allport 1999; Costa and Santesteban 2004; Christoffels et al. 2007; Wang et al. 2009). Participants are presented with picture stimuli and are required to name them in their native language (L1) or their second language (L2). A cue, often a color or a flag, is presented before, during, or after the picture stimulus to indicate which language should be used for naming. Research has found that artificial cues (e.g. colors or flags) typically produce greater switching effects than more natural cues (e.g. faces) because artificial cues require additional cue-related processing independent of the language switching process. Therefore, faces constitute a valid form of natural language cues (Blanco-Elorrieta and Pylkkänen 2017; Zhu et al. 2022).
The experimental setup generates two types of trials: switch trials, where the language named differs from the language in the previous trial, and repeat trials, where the language named remains the same as the previous trial. The key findings of this paradigm are the “switching costs,” which refer to slower response time and higher error rates on switch trials compared with repeat trials. Considering the differences in bilinguals’ second language proficiency levels, balanced bilinguals show symmetric switching costs, while unbalanced bilinguals exhibit asymmetric switching costs (Meuter and Allport 1999; Costa and Santesteban 2004; Christoffels et al. 2007; Wang et al. 2009; de Bruin et al. 2018). Green’s (1998) Inhibitory Control Model (ICM) proposes that bilinguals employ an inhibitory control mechanism to prevent interference from the nontarget language when speaking in the target language. The more proficient a language is, the more it is inhibited, and the greater the cognitive resources required to reactivate that inhibited language.
Furthermore, previous neuroimaging studies have primarily focused on forced language switching, thus providing limited information about the brain activation patterns triggered by voluntary language switching. Neuroimaging studies have unveiled the engagement of cortical and subcortical regions linked to language control during forced language switching, including the dorsolateral prefrontal cortex, inferior frontal gyrus, anterior cingulate gyrus, bilateral supramarginal gyrus, and bilateral caudate nucleus. This suggests that language control is realized through a network closely related to domain-general executive control (Hernandez et al. 2001; Wang et al. 2007; Abutalebi and Green 2008; Guo et al. 2011; Luk et al. 2011; de Bruin et al. 2014).
It is acknowledged that bilingual individuals frequently engage in voluntary language switching during their everyday discourse, even in the absence of external prompts (e.g. Chinese: 我有一个非常好的 idea——English: I have a very good idea). This exemplifies the phenomenon of voluntary language switching, which has higher ecological validity compared with the constrained scenarios of forced language switching (Gollan and Ferreira 2009; Blanco-Elorrieta and Pylkkänen 2017; de Bruin et al. 2018; Zhu et al. 2022).
Voluntary language switching, which has higher ecological validity, has garnered increasing attention in recent years. However, there is significant controversy in existing research regarding whether the process of voluntary language switching also requires inhibitory control. Some studies have shown an absence of switching costs when it comes to voluntary language switching, suggesting a lack of inhibitory control involvement (Blanco-Elorrieta and Pylkkänen 2017; Reverberi et al. 2018; Liu et al. 2020; Zhu et al. 2022). In contrast, a significant number of studies maintain that switching costs were observed in voluntary language switching scenarios, indicating the continued engagement of inhibitory control (Gollan and Ferreira 2009; Gollan et al. 2014; Gross and Kaushanskaya 2015; Zhang et al. 2015; de Bruin et al. 2018; Jevtović et al. 2019; de Bruin and Xu 2023).
Two studies used event-related potentials (ERPs) to further explore the neural costs of voluntary language switching. Because their results differ, the question remains as to whether inhibitory control is present in voluntary language switching. Specifically, Liu et al. (2020) found no behavioral or neural switching costs in the voluntary language switching of unbalanced Chinese–English bilinguals. Although Jiao et al. (2022) observed switching costs and reverse language advantage effects in the behavioral data of voluntary language switching, their analysis of ERP data showed no neural switching costs or reverse language advantage effects.
The adaptive control hypothesis posits that neural circuits and regions engaged in language control undergo adaptive changes due to the interaction context. The need for language control, including goal maintenance, conflict monitoring, and interference suppression, is highest in dual-language contexts. In contrast, strong suppression of alternatives may be unnecessary in dense code-switching environments, but these environments still demand fine temporal control of morphological syntactic processing (Green and Abutalebi 2013). Zhang et al. (2015) focused on the differences in brain area activation between voluntary and forced language switching. However, it is worth noting that the instructions of this experiment were based on Arrington and Logan’s (2004) voluntary task switch (VTS) paradigm, which requires participants to “choose languages in an equally frequent and random order.” Such instructions imposed additional demands on the participants, greatly reducing the “voluntariness” of the experiment. Moreover, Zhang et al. (2015) only asked participants to freely name pairs of numbers from one to nine in Chinese or English, and thus, the experiment was relatively simple. Blanco-Elorrieta and Pylkkänen (2017) employed magnetoencephalography (MEG) to examine changes in neural activation patterns within regions and circuits associated with bilingual language control across diverse interactive contexts. This study did not find activation in any language control regions during voluntary language switching, leading the authors to suggest that voluntary language switching might not engage language control. However, the use of MEG limits the analysis of activity in certain subcortical areas, such as the caudate nucleus. Previous studies have demonstrated that the caudate nucleus is associated with monitoring and inhibiting interference from nontarget languages and plays an important role in bilingual language control (Abutalebi and Green 2007; Lei et al. 2014).
In conclusion, existing research on bilingual language switching has been largely limited to forced scenarios and lacks neural-cognitive insights. The conventional cue-switching paradigm, originally established in research on forced language switching, fails to encapsulate bilinguals’ language switching in spontaneous settings. Therefore, it is crucial to investigate language switching in a way that more closely reflects real-life bilingual interactions. Moreover, research on voluntary language switching has predominantly relied on behavioral experiments, leading to inconsistent findings. To resolve these inconsistencies and deepen our understanding, measures of neural activity are essential. Our experiment uses fMRI to examine activations in both the cerebral cortex and subcortical areas during both voluntary and forced language switching. Specifically, we have two main objectives: First, we aim to determine whether voluntary language switching incurs switching costs and involves inhibitory control by integrating both behavioral and neuroimaging data. Second, we seek to compare the neural mechanisms underlying voluntary and forced language switching, aiming to identify both commonalities and differences. Taken together, we hope to contribute to the broader field of bilingualism by offering more nuanced insights into the cognitive and neural mechanisms involved in language switching.
Materials and methods
Participants
The G*Power 3.1 program was used to calculate the sample size (Faul et al. 2009). With a medium effect size of 0.25, a statistical power (1 − β) of 0.8, and a significance level (α) of 0.05, the minimum required sample size was 16 participants. The present study recruited a total of 22 unbalanced Chinese–English bilinguals. Participants with excessive head movement or incomplete voice recordings were excluded, resulting in a final sample of 19 participants (10 females; mean age = 23.21 ± 2.04 years). They were all right-handed, had normal or corrected-to-normal vision, and reported no neurological impairments or other related diseases. The research protocol was approved by the Ethics Committee of the School of Linguistic Sciences and Arts of Jiangsu Normal University. All participants signed an informed consent form and received monetary compensation.
All the participants were native speakers of Chinese and had passed the College English Test-6 (CET-6). Participants rated their own proficiency in listening, speaking, reading, and writing in both Chinese and English on a 7-point Likert scale (1 = not proficient at all; 7 = very proficient). The results are shown in Table 1.
Measure | Chinese (L1) | English (L2) |
---|---|---|
Age of acquisition (AOA) | 0.89 ± 0.27 | 8 ± 1.63 |
Listening | 6.42 ± 0.84 | 3.53 ± 1.02 |
Speaking | 6.37 ± 0.83 | 3.16 ± 0.96 |
Reading | 6.42 ± 0.69 | 4.37 ± 1.26 |
Writing | 6.32 ± 0.89 | 3.89 ± 1.33 |
Paired-samples t-tests showed that the proficiency scores for Chinese were significantly higher than those for English across all skills: listening, t(18) = 14.416, P < 0.001, Cohen’s d = 3.10, 95% CI = [2.47, 3.32]; speaking, t(18) = 13.565, P < 0.001, Cohen’s d = 3.58, 95% CI = [2.71, 3.71]; reading, t(18) = 6.824, P < 0.001, Cohen’s d = 2.02, 95% CI = [1.42, 2.68]; and writing, t(18) = 8.367, P < 0.001, Cohen’s d = 2.14, 95% CI = [1.81, 3.03]. These self-assessment results indicated that the participants were unbalanced Chinese–English bilinguals with intermediate L2 proficiency.
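The text does not say which variant of Cohen's d was computed. As a rough check, the sketch below applies one common variant (the mean difference divided by the root mean square of the two standard deviations) to the rounded values in Table 1. The function name `cohens_d_av` and the choice of formula are assumptions made for illustration, not the authors' stated method. The results land within about 0.01 of the reported effect sizes.

```python
from math import sqrt

# Means and SDs copied from Table 1: (L1 mean, L1 SD, L2 mean, L2 SD).
TABLE_1 = {
    "listening": (6.42, 0.84, 3.53, 1.02),
    "speaking": (6.37, 0.83, 3.16, 0.96),
    "reading": (6.42, 0.69, 4.37, 1.26),
    "writing": (6.32, 0.89, 3.89, 1.33),
}


def cohens_d_av(mean_1, sd_1, mean_2, sd_2):
    """Mean difference divided by the root mean square of the two SDs.

    Assumed formula: the paper does not state how d was computed.
    """
    return (mean_1 - mean_2) / sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)


effect_sizes = {skill: cohens_d_av(*row) for skill, row in TABLE_1.items()}
for skill, d in effect_sizes.items():
    print(f"{skill}: d = {d:.2f}")
```

Because the inputs are already rounded to two decimals, small discrepancies in the last digit are expected.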
Materials
Picture stimulus
The picture stimuli were obtained from 100 black-and-white line drawings standardized by Zhang and Yang (2003), based on the Snodgrass and Vanderwart (1980) image database. The pictures selected for our experiment include familiar daily life categories, such as food. An additional group of 31 Chinese–English bilinguals, not involved in the formal experiment, was recruited to rate the familiarity of the English names corresponding to the pictures on a 7-point scale (1 = very unfamiliar, 7 = very familiar). The average score of all pictures was 6.86 ± 0.01, indicating a very high level of familiarity.
Cue material
Faces were used as naturalistic cues for language. By associating four different facial cues with two separate languages and varying these cues with each trial, the study effectively dissociated the effects of language switching from cue switching.
In the forced context, participants were required to choose the language based on facial cues. Two Chinese faces served as cues for Chinese, and two non-Chinese faces were used as cues for English (see Fig. 1A). These four faces were originally from Rhodes et al. (2000).

Fig. 1. Face cues. (A) Four facial cues in the forced context. (B) Four facial cues in the voluntary context. Face images adapted from Zhu et al. (2022) under a CC BY 4.0 license.
In the voluntary context, participants were allowed to freely choose which language to use on each trial. For this purpose, four neutral facial cues, created by Zhu et al. (2022), were employed; these did not associate with any specific language. Participants were told that these neutral faces were bilinguals, capable of understanding both Chinese and English. To ensure participants could perceive significant variations in face cues, distinctive pairs of glasses were added to each of the four neutral faces (see Fig. 1B).
Experimental procedure
Participants performed two picture naming tasks while lying in an MRI scanner. The tasks were conducted in a fixed order: participants began with the voluntary context and subsequently engaged in the forced context, a design choice intended to avoid the potential influence of forced switching on voluntary switching (Kleinman and Gollan 2016; Jevtović et al. 2019). Note that both contexts incorporated switch trials and repeat trials: switch trials in Chinese (L2L1), switch trials in English (L1L2), repeat trials in Chinese (L1L1), and repeat trials in English (L2L2). Before the formal experiment, participants reviewed the experimental instructions and completed two contextualized practice tasks to ensure that they fully understood the experimental requirements. The experiment consisted of eight runs, each lasting 8 min and 16 s.
Voluntary context
The design of the voluntary context drew on the voluntary language switching paradigm proposed by Jevtović et al. (2019) and the forced language switching paradigm used by Wu et al. (2021).
Before commencing the voluntary context, an experimenter provided participants with the following instructions: “In the following section, you will see an interlocutor holding a picture, and your task is to tell him/her what the object in the picture is. These interlocutors are all knowledgeable bilinguals, so you can volunteer to use Chinese or English names for pictures. Be careful not to name all images in the same language.”
During the formal experiment, a fixation cross appeared for 300 ms, followed by a blank screen for 200 ms. Subsequently, an interlocutor holding a picture was displayed in the center of the screen for 3,000 ms. Participants were asked to voluntarily choose a language to name the picture as quickly and accurately as possible, using a natural voice while pressing the “1” key. After the disappearance of the picture stimulus, a blank screen was shown for an intertrial interval of either 2,500, 4,500, or 6,500 ms. The whole experimental procedure is shown in Fig. 2A. The order of picture stimuli in the voluntary context was pseudo-random, with the same picture appearing no more than once in a given run and a total of three times across the entire task.
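The trial timings can be checked against the stated run length with a little arithmetic. The sketch below is a back-of-the-envelope calculation that assumes the three intertrial intervals are used equally often, which the text does not state. It combines the durations above with the 61 trials per run, the run duration of 8 min 16 s, and the functional TR of 2,000 ms reported elsewhere in the paper. The variable names are illustrative.

```python
# Durations in milliseconds, taken from the procedure described in the text.
FIXATION_MS = 300
BLANK_MS = 200
PICTURE_MS = 3000
ITI_OPTIONS_MS = (2500, 4500, 6500)
TRIALS_PER_RUN = 61
RUN_DURATION_MS = (8 * 60 + 16) * 1000  # each run lasts 8 min 16 s
TR_MS = 2000  # repetition time of the functional sequence

# Assumption: the three intertrial intervals occur equally often.
mean_iti_ms = sum(ITI_OPTIONS_MS) // len(ITI_OPTIONS_MS)
mean_trial_ms = FIXATION_MS + BLANK_MS + PICTURE_MS + mean_iti_ms
trials_total_ms = TRIALS_PER_RUN * mean_trial_ms

# Time in the run not covered by trials (e.g. lead-in or lead-out periods).
unaccounted_ms = RUN_DURATION_MS - trials_total_ms

# Number of functional volumes that fit into one run at this TR.
volumes_per_run = RUN_DURATION_MS // TR_MS

print(mean_trial_ms, trials_total_ms, unaccounted_ms, volumes_per_run)
```

Under this assumption, an average trial lasts 8 s and the 61 trials fill 488 s of the 496-s run. The paper does not describe how the remaining 8 s were used.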

Fig. 2. The experimental procedure in the voluntary (A) and the forced context (B). Face images adapted from Zhu et al. (2022) under a CC BY 4.0 license.
A recording device captured the language selection during the picture naming in the formal experiment, facilitating the categorization of response types (i.e. switch or repeat trials). The formal experiment in the voluntary context consisted of 5 runs, each comprising 61 trials. The first trial was considered a filler trial and was excluded from the tally of switch and repeat trials.
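The categorization of responses described above can be sketched as follows. This is a minimal illustration, not the authors' scoring script: the function name `label_trials` and the label strings are hypothetical, and the sketch assumes every trial has a codable response (no errors or omissions).

```python
def label_trials(languages):
    """Label each naming response by comparing it with the previous one.

    `languages` is a run-ordered list of "L1" (Chinese) or "L2" (English).
    The first trial has no predecessor, so it is marked as a filler.
    Labels follow the previous-then-current convention used in the text,
    e.g. "switch_L2L1" is a switch into Chinese.
    """
    if not languages:
        return []
    labels = ["filler"]
    for previous, current in zip(languages, languages[1:]):
        kind = "repeat" if previous == current else "switch"
        labels.append(f"{kind}_{previous}{current}")
    return labels


run = ["L1", "L1", "L2", "L2", "L1"]
labels = label_trials(run)
print(labels)
# ['filler', 'repeat_L1L1', 'switch_L1L2', 'repeat_L2L2', 'switch_L2L1']
```

With 61 trials per run, this yields 60 counted trials per run once the filler is set aside.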
Forced context
Before commencing the forced context, the experimenter also provided participants with these instructions: “In the following section, you will see an interlocutor holding a picture, and your task is to tell him/her what the object in the picture is. These interlocutors are all monolinguals, so you need to name the picture in the correct language. When a foreign face appears, you have to name the picture in English; when a Chinese face appears, you have to name the picture in Chinese.”
The experimental procedure was identical to that of the voluntary context (see Fig. 2B). All picture stimuli in the forced context were pseudo-random, with the same picture appearing no more than once in a given run and a total of two times throughout the whole task. The formal experiment was divided into 3 runs in the forced context, each containing 61 trials. The first trial was also considered as a filler trial and was not included in the count of switch or repeat trials.
Data acquisition
Participants were scanned with a 3.0 Tesla General Electric Discovery MR750 whole-body imager with an 8-channel head coil in the Key Laboratory of Jiangsu Normal University. A high-resolution, T1-weighted structural image was collected using a magnetization-prepared rapid gradient-echo imaging sequence. The parameters were as follows: TR = 8,200 ms; TE = 3.2 ms; matrix size = 256 × 256; FOV = 256 × 256 mm²; flip angle = 12°; in-plane resolution = 1.0 × 1.0 mm². In addition, T2*-weighted functional images were acquired with a gradient-echo echo-planar imaging sequence (TR = 2,000 ms; TE = 30 ms; flip angle = 90°; matrix size = 64 × 64; FOV = 192 × 192 mm²; slice thickness = 3 mm; slice number = 38).
Participants’ language choices during the formal experiment were recorded. However, due to the lack of a voice-activated device, naming latencies were measured by having participants press the “1” key simultaneously with their speech production.
Data analysis
Behavioral data analysis
Using offline recordings, we calculated the switching frequency. For naming latencies, this study separately conducted a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) two-way repeated-measures ANOVA in each of the two contexts. Subsequently, a paired-samples t-test was performed to examine whether switching costs differed significantly between the voluntary context and the forced context. In each context, the switching cost was defined as the naming latency for switch trials minus that for repeat trials. Given that the experimental materials were pictures with low naming difficulty, participants’ accuracy rates in both voluntary and forced contexts were very high; therefore, accuracy rates were not specifically analyzed.
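The comparison of switching costs between contexts can be illustrated with a short sketch. The latencies below are invented values for four hypothetical participants, not data from the study, and the helper names `switching_costs` and `paired_t` are illustrative. The sketch computes only the t statistic and its degrees of freedom; obtaining a P value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def switching_costs(switch_rt, repeat_rt):
    """Per-participant cost: mean switch latency minus mean repeat latency."""
    return [s - r for s, r in zip(switch_rt, repeat_rt)]


def paired_t(sample_a, sample_b):
    """Return the paired-samples t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical mean latencies (ms) for four participants, for illustration only.
voluntary_switch = [1300.0, 1250.0, 1400.0, 1280.0]
voluntary_repeat = [1240.0, 1210.0, 1330.0, 1230.0]
forced_switch = [1480.0, 1430.0, 1560.0, 1450.0]
forced_repeat = [1420.0, 1400.0, 1500.0, 1390.0]

voluntary_cost = switching_costs(voluntary_switch, voluntary_repeat)
forced_cost = switching_costs(forced_switch, forced_repeat)
t_value, df = paired_t(voluntary_cost, forced_cost)
print(f"t({df}) = {t_value:.3f}")
```

The test operates on one cost per participant and context, so each participant serves as their own control.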
Neuroimaging data analysis
The following preprocessing procedures were performed using Statistical Parametric Mapping (SPM12) software (Wellcome Centre for Human Neuroimaging, London, UK) and MATLAB (MathWorks, Natick, MA, USA): adjustment of acquisition timing across slices, correction for head motion, co-registration to the anatomical image, spatial normalization using the anatomical image and the Montreal Neurological Institute (MNI) template, and smoothing with an 8-mm full width at half-maximum Gaussian kernel.
In the first-level analysis, each condition (i.e. switch/repeat trials in Chinese/English in the voluntary/forced context) was modeled separately for each participant. The canonical hemodynamic response function was used for fitting. For the second-level group analysis, the imaging results from all participants for each condition were combined. We conducted a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) ANOVA on whole-brain activation patterns in each of the two contexts. Data from the two contexts were analyzed separately, comparing switch trials and repeat trials, to delineate similarities and differences in brain activation patterns of voluntary and forced language switching.
Building on this, a conjunction analysis was performed to identify brain regions co-activated by both voluntary and forced contexts. The co-activated regions were selected as regions of interest (ROIs). The coordinates of the peak point of each co-activated region were used as the center to create a spherical ROI with a radius of 8 mm. Beta values of brain activities corresponding to voluntary and forced language switching were extracted separately at the individual subject level and subsequently subjected to statistical analysis. A paired-samples t-test was used to investigate whether there was a significant difference in activation intensity between voluntary and forced switching within these co-activated regions. Neuroimaging results were reported using consistent thresholds (voxel-level P < 0.001; FWE-corrected P < 0.05).
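The construction of a spherical ROI can be sketched in a few lines. The example below is a plain-Python illustration rather than the SPM-based procedure used in the study. It assumes a 3-mm isotropic voxel grid aligned with the center coordinate, which the text does not confirm, and it uses the left caudate head peak from Table 2 only as an example center, whereas the study centered its ROIs on the conjunction peaks. The function name `sphere_voxels` is hypothetical.

```python
from itertools import product


def sphere_voxels(center, radius_mm=8.0, voxel_mm=3.0):
    """Return grid coordinates (in mm) lying within radius_mm of center.

    Assumes an isotropic grid with spacing voxel_mm that contains `center`.
    """
    steps = int(radius_mm // voxel_mm)
    offsets = [i * voxel_mm for i in range(-steps, steps + 1)]
    cx, cy, cz = center
    return [
        (cx + dx, cy + dy, cz + dz)
        for dx, dy, dz in product(offsets, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius_mm ** 2
    ]


# Example center: left caudate head peak (x, y, z) from Table 2.
roi = sphere_voxels((-9.0, 21.0, 0.0))
print(len(roi))  # 81 voxels on a 3-mm grid
```

Averaging each participant's beta values over such a voxel set gives one value per condition, which can then be entered into the paired comparison described above.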
Results
Behavioral performance
Switching frequency in voluntary context
The average switching frequency in the voluntary context was 40.15% ± 4.7%. Specifically, the average switching frequency for Chinese was 19.98% ± 2.2%, and the average switching frequency for English was 20.17% ± 2.61%. A paired-samples t-test showed no significant difference between the switching frequencies for the two languages, t(18) = −0.704, P = 0.49, Cohen’s d = −0.08, 95% CI = [−0.01, 0].
Naming latencies
In the voluntary context, naming latencies were analyzed using a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) repeated-measures ANOVA (see Fig. 3A). The main effect of trial type was significant (F(1,18) = 23.055, P < 0.001, η² = 0.562), with longer naming latencies in switch trials (1287.35 ± 160.82 ms) compared with those in repeat trials (1231.41 ± 138.94 ms). These findings indicate that voluntary language switching has switching costs. The main effect of language type was significant (F(1,18) = 8.467, P = 0.009, η² = 0.32), with Chinese naming trials having longer naming latencies (1284.51 ± 162.71 ms) than English naming trials (1234.25 ± 142.25 ms). Furthermore, the trial type × language type interaction was significant (F(1,18) = 12.941, P = 0.002, η² = 0.418), with “English-to-Chinese” switching costs significantly greater than “Chinese-to-English” switching costs.
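The two per-language frequencies add up to the overall figure (19.98% + 20.17% = 40.15%), which suggests that all three percentages share the same denominator, namely all counted trials. The sketch below computes switching frequencies on that assumption. The function name `switching_frequencies` and the language codes "zh" and "en" are illustrative.

```python
def switching_frequencies(languages):
    """Return (to_chinese, to_english, overall) switch percentages.

    `languages` holds "zh" or "en" per trial in presentation order. The
    first trial is a filler and only serves as the reference for trial two.
    Assumption: all three percentages use the counted trials as denominator.
    """
    pairs = list(zip(languages, languages[1:]))
    if not pairs:
        return 0.0, 0.0, 0.0
    to_chinese = sum(prev == "en" and cur == "zh" for prev, cur in pairs)
    to_english = sum(prev == "zh" and cur == "en" for prev, cur in pairs)
    scale = 100.0 / len(pairs)
    return to_chinese * scale, to_english * scale, (to_chinese + to_english) * scale


sequence = ["zh", "zh", "en", "en", "zh", "zh"]
to_zh, to_en, overall = switching_frequencies(sequence)
print(to_zh, to_en, overall)  # 20.0 20.0 40.0
```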

Fig. 3. Naming latencies by trial type (switch, repeat) and language type (Chinese, English) in the voluntary (A) and the forced context (B). Error bars represent the standard errors of the mean naming latencies across subjects, calculated separately for each condition.
In the forced context, the naming latencies were similarly analyzed with a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) repeated-measures ANOVA (see Fig. 3B). The main effect of trial type was significant (F(1,18) = 42.651, P < 0.001, η² = 0.703), with longer naming latencies in switch trials (1469.63 ± 176.52 ms) compared with those in repeat trials (1417.66 ± 169.03 ms). These findings indicate that forced language switching has switching costs. The main effect of language type was significant (F(1,18) = 17.958, P < 0.001, η² = 0.499), with longer naming latencies in Chinese naming trials (1480.61 ± 183.7 ms) compared with those in English naming trials (1406.68 ± 168.15 ms). However, the trial type × language type interaction was not significant (F(1,18) = 0.222, P = 0.644).
A paired-samples t-test showed no significant difference between the switching costs of voluntary language switching and those of forced language switching, t(18) = 0.378, P = 0.71, Cohen’s d = 0.09, 95% CI = [−18.12, 26.07].
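The reported effect sizes are consistent with partial eta squared, which can be recovered from an F value and its degrees of freedom as F × df_effect / (F × df_effect + df_error). The sketch below is a consistency check on the reported statistics for both contexts, not a reanalysis. That the authors report the partial form is an inference from the numbers, not something the text states.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F value, reported eta squared) pairs copied from the text; all F(1, 18).
REPORTED = {
    "voluntary: trial type": (23.055, 0.562),
    "voluntary: language type": (8.467, 0.32),
    "voluntary: interaction": (12.941, 0.418),
    "forced: trial type": (42.651, 0.703),
    "forced: language type": (17.958, 0.499),
}

recomputed = {
    name: partial_eta_squared(f_value, 1, 18)
    for name, (f_value, _) in REPORTED.items()
}
for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.3f}, reported {REPORTED[name][1]}")
```

All five recomputed values agree with the reported ones to within rounding.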
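The reported statistics for this comparison can be cross-checked against one another. The sketch below assumes that the 95% CI refers to the mean difference in switching costs in milliseconds, which the text does not state explicitly. It recovers the mean difference from the CI midpoint and the standard error from the CI half-width, using the two-tailed 5% critical value of Student's t for 18 degrees of freedom (approximately 2.101).

```python
# Two-tailed 5% critical value of Student's t with 18 degrees of freedom.
T_CRIT_DF18 = 2.101

# Reported 95% CI, assumed to describe the mean cost difference in ms.
ci_low, ci_high = -18.12, 26.07

mean_difference = (ci_low + ci_high) / 2
standard_error = (ci_high - ci_low) / 2 / T_CRIT_DF18
t_value = mean_difference / standard_error

# Switching costs implied by the reported condition means (ms).
cost_voluntary = 1287.35 - 1231.41
cost_forced = 1469.63 - 1417.66

print(f"mean difference from CI: {mean_difference:.3f} ms")
print(f"difference of reported costs: {cost_voluntary - cost_forced:.2f} ms")
print(f"implied t(18): {t_value:.3f}")
```

The implied t value rounds to the reported 0.378, and the CI midpoint matches the difference between the two switching costs computed from the condition means (about 55.9 ms versus 52.0 ms).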
Neuroimaging data
Activation results for two language switching types
In the voluntary context, a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) ANOVA was conducted on whole-brain activation. The results showed significant main effects for trial type and language type; however, the interaction effect between the two was not significant. Since our experiment focuses on the brain activation of language switching across different contexts, we compared activation in switch trials (language changed from the preceding trial) to activation in repeat trials (language remained the same). The results indicated that there was greater activation in switch trials than in repeat trials in the bilateral precentral gyrus, bilateral postcentral gyrus, left inferior frontal gyrus, left precuneus, and left caudate nucleus (see Table 2 and Fig. 4A).
Brain region | Hemisphere | Cluster size | MNI x | MNI y | MNI z | T value |
---|---|---|---|---|---|---|
Voluntary language switching | | | | | | |
Caudate Head | L | 496 | −9 | 21 | 0 | 6.49 |
Precentral Gyrus | R | 151 | 57 | −9 | 39 | 5.51 |
Postcentral Gyrus | R | | 51 | −9 | 27 | 4.89 |
Precentral Gyrus | L | 210 | −48 | −15 | 42 | 5.31 |
Inferior Frontal Gyrus | L | | −54 | 3 | 27 | 4.82 |
Postcentral Gyrus | L | | −60 | −9 | 18 | 4.73 |
Precuneus | L | 91 | −9 | −66 | 51 | 5.08 |
Forced language switching | | | | | | |
Cingulate Gyrus | L | 4508 | −12 | −27 | 0 | 8.36 |
Supplementary Motor Area | L | | −6 | 0 | 72 | 7.03 |
Inferior Frontal Gyrus | L | 1194 | −54 | 18 | 15 | 7.37 |
Superior Temporal Gyrus | L | | −66 | −27 | 15 | 7.12 |
Postcentral Gyrus | R | 1043 | 51 | −15 | 15 | 6.91 |
Supramarginal Gyrus | R | | 60 | −27 | 45 | 6.51 |
Middle Frontal Gyrus | R | 91 | 33 | 45 | 12 | 5.9 |
Superior Frontal Gyrus | R | | 24 | 57 | 24 | 4.87 |
Superior Frontal Gyrus | L | 233 | −21 | 51 | 18 | 5.3 |
Middle Occipital Gyrus | R | 83 | 45 | −69 | −15 | 5.01 |
Middle Temporal Gyrus | R | | 57 | −66 | 3 | 4.3 |

Fig. 4. Activation of brain regions in voluntary (A) and forced (B) language switching.
Similarly, a 2 (trial type: switch, repeat) × 2 (language type: Chinese, English) ANOVA was conducted on whole-brain activation in the forced context. The results also showed significant main effects for both trial type and language type, yet the interaction effect between the two did not reach significance. In the forced context, switch trials, as compared to repeat trials, significantly activated frontal-temporal-occipital brain areas such as bilateral superior frontal gyrus, left inferior frontal gyrus, left superior temporal gyrus, right middle temporal gyrus, right middle occipital gyrus, and limbic system brain areas such as bilateral cingulate gyrus (see Table 2 and Fig. 4B).
Conjunction analysis results
We conducted a conjunction analysis on the brain regions activated by voluntary language switching (switch trials > repeat trials in voluntary context) and those activated by forced language switching (switch trials > repeat trials in forced context) to identify regions that were co-activated by both types of switching. Co-activation was observed in brain areas such as bilateral caudate nucleus, left inferior frontal gyrus, left precuneus, left supramarginal gyrus, and left supplementary motor area (see Table 3).
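The logic of a conjunction analysis can be illustrated with a minimal sketch. The text does not state which conjunction method was used, so the minimum-statistic approach below is an assumption: a voxel counts as co-activated only if its t value exceeds the threshold in both contrast maps. The function name `conjunction_mask`, the threshold, and the t values are hypothetical and are not taken from the study.

```python
def conjunction_mask(t_map_a, t_map_b, threshold):
    """Flag voxels whose t value exceeds the threshold in BOTH contrast maps."""
    if len(t_map_a) != len(t_map_b):
        raise ValueError("maps must cover the same voxels")
    # A voxel passes only if the smaller of its two t values passes.
    return [min(a, b) > threshold for a, b in zip(t_map_a, t_map_b)]


# Hypothetical t values for five voxels (illustrative, not data from the study).
t_voluntary = [5.2, 2.1, 4.4, 3.9, 1.3]  # switch > repeat, voluntary context
t_forced = [4.8, 6.0, 3.7, 4.1, 7.5]     # switch > repeat, forced context

mask = conjunction_mask(t_voluntary, t_forced, threshold=3.5)
print(mask)
```

In this toy example the second and fifth voxels are excluded because each is strongly active in only one of the two contexts.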
Table 3. Brain regions co-activated by voluntary and forced language switching.
Brain region | Hemisphere | Cluster size | MNI x | MNI y | MNI z | T value |
---|---|---|---|---|---|---|
Caudate | R | 157 | 15 | 24 | −3 | 5.06 |
Caudate | L | 155 | −18 | 24 | −6 | 5.7 |
Inferior Frontal Gyrus | L | 102 | −60 | 9 | 6 | 4.41 |
Precuneus | L | 123 | −9 | −75 | 45 | 4.23 |
Supplementary Motor Area | L | 129 | −6 | −3 | 72 | 5.34 |
Supramarginal Gyrus | L | 20 | −33 | −42 | 36 | 4.37 |
Each of the above brain regions was analyzed separately: beta values for voluntary and forced language switching were extracted and compared with paired-samples t-tests. To control the error rate across multiple comparisons, we applied the False Discovery Rate correction. Significant differences were observed in the left caudate nucleus, t(18) = −2.701, P = 0.045, Cohen’s d = −0.81, 95% CI = [−1.47, −0.18], with greater activation in this region for forced than voluntary language switching. Similarly, there were significant differences in the left supramarginal gyrus, t(18) = −2.951, P = 0.045, Cohen’s d = −0.8, 95% CI = [−1.14, −0.19], also with greater activation for forced than voluntary language switching. No significant difference was found in the activation of the right caudate nucleus, the left inferior frontal gyrus, the left precuneus, or the left supplementary motor area between voluntary and forced language switching (see Fig. 5).
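The two steps of this ROI analysis can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the text names the False Discovery Rate correction but not the specific procedure, so the Benjamini-Hochberg method is assumed here. The helper names `paired_t` and `bh_adjust` and all input values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, df) for a paired-samples t-test on equal-length sequences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical beta values for four participants in one ROI (not study data).
beta_voluntary = [1.0, 2.0, 3.0, 4.0]
beta_forced = [0.0, 0.0, 1.0, 1.0]
t_value, df = paired_t(beta_voluntary, beta_forced)

# Hypothetical uncorrected p-values, one per ROI (not study data).
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
print(round(t_value, 3), df, [round(p, 3) for p in adjusted])
```

Because of the monotonicity step, neighbouring ranks can end up with identical adjusted P values, as the first and last entries do in this toy input.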

Fig. 5. Activation levels of co-activated brain regions in voluntary and forced language switching (*P < 0.05).
Discussion
The present study offers a systematic investigation and comparison of the cognitive and neural mechanisms underlying bilingual language switching in voluntary and forced contexts. We not only tested switching costs through behavioral metrics but also used fMRI to explore the neural costs of voluntary language switching in various cortical and subcortical brain regions.
We found the involvement of inhibitory control in both voluntary and forced language switching; this finding was consistent across both our behavioral and neuroimaging data. No significant differences in behavioral switching costs were detected between the two contexts. Furthermore, both voluntary and forced language switching activated brain regions including the bilateral caudate nucleus and the left inferior frontal gyrus. However, the ROI analysis indicated greater neural activity during forced than voluntary switching in some regions associated with inhibitory control, such as the left caudate nucleus. The whole-brain analysis revealed a further nuanced difference between the two types of language switching: fewer language control areas were involved in voluntary language switching, suggesting that voluntary switching may demand fewer cognitive resources. These results shed light on the similarities and differences between voluntary and forced language switching, underscore the dynamic complexity of bilingual language processing, and highlight the importance of understanding bilingual language processing from a more ecologically valid perspective.
Voluntary language switching involves inhibitory control
With respect to behavioral performance, our study unveiled the presence of switching costs in voluntary language switching. This suggests that even when Chinese–English bilinguals switch between Chinese and English at will, the time spent on the switching process is still significantly longer than that for continuous use of a single language. Our study’s observations are in line with numerous prior studies on voluntary language switching among both unbalanced bilinguals (Gollan and Ferreira 2009; Gollan et al. 2014; Zhang et al. 2015; Liu et al. 2021; Jiao et al. 2022) and balanced bilinguals (de Bruin et al. 2018; Jevtović et al. 2019). Notably, studies focusing on unbalanced Chinese–English bilinguals, such as that by Jiao et al. (2022), have also consistently demonstrated switching costs in voluntary switching contexts. These findings corroborate our behavioral results, suggesting that voluntary language switching also requires response inhibition to control interference from the nontarget language. Drawing from Green’s (1998) ICM, language switching is postulated to incur switching costs due to the necessary temporal and cognitive resources devoted to suppressing passive language interference and reactivating the target language.
Previous research has consistently demonstrated that unbalanced bilinguals often display asymmetric switching costs. This phenomenon refers to the differences in switching costs between the two directions of language switching, namely from L1 to L2 and vice versa (Meuter and Allport 1999; Christoffels et al. 2007; Wang et al. 2009; de Bruin et al. 2018). Notably, the current study uncovered a distinct pattern: symmetric switching costs were identified in the context of forced language switching, while asymmetric switching costs emerged in the voluntary switching context. This finding suggests that the pattern of switching costs experienced by bilinguals is significantly influenced by the switching context—whether it is voluntary or forced. In this study, participants were presented with picture stimuli representing common objects from everyday life, with the English names for these objects being highly familiar to them. Despite the fact that the subjects were unbalanced Chinese–English bilinguals, they exhibited a remarkable proficiency in naming the picture stimuli in English, as indicated by their consistently high accuracy rates in both voluntary and forced contexts. Consequently, the difference in switching costs between Chinese-to-English and English-to-Chinese switching was not found to be statistically significant. This observation supports the presence of symmetric switching costs within the forced context.
It is worth noting that asymmetric switching costs emerged in the voluntary switching context; unbalanced bilinguals experienced greater costs when switching from English to Chinese than from Chinese to English. Unbalanced bilinguals in the voluntary context may feel more “self-imposed pressure.” This is because the experiment allowed subjects to choose either Chinese or English voluntarily but also stipulated that not all pictures could be named in the same language. Given the limited time available for naming pictures, subjects had to constantly consider whether they needed to switch languages. However, in the forced context, they were forced to use a specific language, which may alleviate some of the “self-imposed pressure.” Therefore, in the more stressful voluntary context, subjects may prioritize accuracy over speed in addition to the involvement of inhibitory control mechanisms. If a subject initially struggled to retrieve the English word of a picture, they might resort to using Chinese instead, which potentially explains the greater switching costs observed when switching from English to Chinese than from Chinese to English.
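The switching-cost measures discussed above can be made concrete with a small worked example. It assumes the conventional definition (mean reaction time on switch trials minus mean reaction time on repeat trials, computed separately for each naming language) and takes asymmetry as the difference between the two costs. The function `switch_cost` and the reaction times are hypothetical and were chosen only to mimic the direction of the voluntary-context pattern.

```python
from statistics import mean


def switch_cost(trials, language):
    """Mean RT on switch trials minus mean RT on repeat trials for one naming language."""
    switch = [rt for lang, kind, rt in trials if lang == language and kind == "switch"]
    repeat = [rt for lang, kind, rt in trials if lang == language and kind == "repeat"]
    return mean(switch) - mean(repeat)


# Hypothetical reaction times in ms (not data from the study).
# Each trial is (naming language, trial type, RT).
trials = [
    ("Chinese", "repeat", 900), ("Chinese", "repeat", 940),
    ("Chinese", "switch", 1010), ("Chinese", "switch", 1030),
    ("English", "repeat", 980), ("English", "repeat", 1000),
    ("English", "switch", 1020), ("English", "switch", 1040),
]

# A switch trial named in Chinese is a switch from English to Chinese.
cost_into_chinese = switch_cost(trials, "Chinese")
# A switch trial named in English is a switch from Chinese to English.
cost_into_english = switch_cost(trials, "English")
asymmetry = cost_into_chinese - cost_into_english
print(cost_into_chinese, cost_into_english, asymmetry)
```

A positive asymmetry means switching into Chinese is costlier than switching into English; a value near zero corresponds to symmetric switching costs.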
In terms of neuroimaging data, the whole-brain analysis illuminated the engagement of inhibitory control regions during voluntary language switching. Notably, several key brain regions showed greater activation in switch trials compared with repeat trials, including the bilateral precentral gyrus, bilateral postcentral gyrus, left inferior frontal gyrus, left precuneus, and left caudate nucleus. The prefrontal cortex has been found to be involved in overarching executive functions, serving to suppress nontarget language activation (Abutalebi and Green 2007). Furthermore, the left inferior frontal gyrus is recognized for its involvement in response selection and inhibition processes (Hernandez et al. 2001; Branzi et al. 2016). Activation of the supplementary motor area correlates with inhibiting erroneous responses and selecting appropriate responses with diminished automaticity, a process germane to language switching (Abutalebi and Green 2007; Luk et al. 2011). The significance of the caudate nucleus pertains to its demonstrated role in monitoring and managing the language in use, including active suppression of nontarget responses, thus contributing significantly to language-specific lexical selection (Abutalebi and Green 2007; Lei et al. 2014). Meanwhile, the augmented activation observed in the postcentral gyrus during switch trials is closely linked to the phonological processing involved in pronunciation, suggesting that switch trials demand more cognitive effort than repeat trials when it comes to naming pictures in the target language (Hillis et al. 2004).
In summary, the switching costs observed in behavioral data suggest the involvement of inhibitory control during voluntary language switching in unbalanced bilinguals. The whole-brain analysis further provides neural evidence, showing activation in regions associated with inhibitory control, including the left inferior frontal gyrus, the supplementary motor area, anterior cingulate gyrus, and caudate nucleus.
Similarities and differences between voluntary and forced language switching
A number of studies have found that the switching costs associated with forced language switching are much greater than those in voluntary language switching (Gollan et al. 2014; Zhang et al. 2015; Jevtović et al. 2019; Jiao et al. 2022). These studies suggest that lexical access mechanisms, in addition to inhibitory control mechanisms, may also play a role in the dynamics of voluntary language switching. Nevertheless, our results diverge from these findings; we did not find a substantial difference in switch costs between voluntary and forced language switching, which echoes the findings of de Bruin et al. (2018). Our interpretation is twofold. First, participants in the voluntary context might engage in a top-down decision-making process, where they must decide whether to switch languages or to continue using the same one. This decision-making process incurs a cognitive cost similar to the demands of forced language switching, which requires processing external cues. Second, even with the option to choose languages voluntarily for naming images, participants might infer the experiment’s underlying goals and deliberately adjust their language choices. Such strategic adjustment could mask any inherent ease of voluntary language switching relative to forced language switching.
For brain activation patterns, our whole-brain analysis indicated that voluntary language switching activates a more restricted network of brain areas related to language control compared with forced language switching. For instance, areas such as the superior frontal gyrus, middle frontal gyrus, cingulate gyrus, and superior temporal gyrus were significantly activated during forced but not voluntary language switching. In a combined analysis of both voluntary and forced language switching, co-activation was observed in regions including the bilateral caudate nucleus, left inferior frontal gyrus, left precuneus, left supramarginal gyrus, and left supplementary motor area. Further ROI analysis revealed that the left supramarginal gyrus and the left caudate nucleus had higher activation levels during forced language switching than during voluntary switching. However, no significant differences between the two switching types were noted in the remaining four brain regions. These findings imply that forced language switching may require more intensive inhibitory control in certain brain regions than voluntary switching does (Zhang et al. 2015).
The differences in brain activation between voluntary and forced language switching in our study support the adaptive control hypothesis (Green and Abutalebi 2013), which posits that engagement with language control networks can vary according to the demands of the context. This hypothesis suggests that, in the forced context, bilinguals need to inhibit the nontarget language and select the correct one as indicated by cues for naming the picture, leading to a greater reliance on cognitive processes such as conflict resolution and inhibitory control. In contrast, in a voluntary context, bilinguals have the autonomy to choose the language for naming pictures, potentially reducing the need for cognitive control. Our findings are in line with this hypothesis: voluntary language switching recruited fewer brain regions associated with cognitive control, and certain regions showed less activation during voluntary switching than during forced switching.
In contrast to our findings, the conjunction analysis by Zhang et al. (2015) on voluntary and forced language switching did not reveal any significantly co-activated brain regions. The differences between our findings and Zhang et al.’s (2015) could stem from several factors: First, while both studies recruited Chinese–English bilinguals, the participants in our study had notably lower English proficiency compared with those in Zhang et al.’s study. Second, Zhang et al. limited their naming items to Arabic digits (1–9), which is a narrower set of stimuli than what was used in our study. This limitation might account for the different brain activation patterns observed. Third, the VTS paradigm (Arrington and Logan 2004) used in Zhang et al.’s study, which asks participants to choose languages in an equally frequent and random order, could have limited the autonomy in language selection. Lastly, they specifically examined the switch or repeat effect within pairs of Arabic numerals while neglecting the continuous effects across trials.
To summarize, by examining both behavioral and neuroimaging data, we found significant similarities between voluntary and forced language switching in unbalanced bilinguals: the two switching types showed comparable behavioral switching costs and shared activation of brain regions such as the bilateral caudate nucleus and the left inferior frontal gyrus. On the other hand, a nuanced difference between the two types of language switching was also revealed: voluntary switching engaged fewer brain regions associated with language control and showed lower levels of activation in certain areas compared with forced language switching.
We acknowledge that empirically investigating voluntary behavior is challenging due to the potential tension between the need for experimental control and voluntariness itself. Haggard (2008) emphasizes that the actions of participants instructed to act voluntarily may not be entirely voluntary in nature. Likewise, obliging participants to perform tasks based on external cues may not necessarily render their behaviors entirely involuntary. For instance, if a participant was directed to use their dominant language for naming pictures, this might naturally correspond to their personal preferences. Despite these constraints, ongoing research in both behavioral and neuroimaging fields perseveres in its endeavor to understand voluntary behaviors, such as voluntary language switching. This endeavor is focused on enhancing the ecological validity of experimental paradigms to bridge the divide between bilingual language processing in controlled experiments and in natural environments.
Future research could consider L2 proficiency as a variable to investigate its impact on the cognitive and neural mechanisms involved in voluntary language switching. In addition, while most voluntary switching research focuses on switching between different languages (e.g. Spanish–English, Spanish–Basque), there is a lack of studies investigating voluntary switching between dialects of the same language, which often share a broad range of features alongside variations. In China, for instance, there is a substantial population of bidialectals proficient in both a local dialect and Mandarin (Yi et al. 2018; Wu et al. 2023). To enhance our understanding of language processing, it is also important for future language switching studies to consider the diversity of bilingual and bidialectal populations.
Conclusion
The present study combined behavioral and fMRI data to explore the cognitive and neural mechanisms of voluntary versus forced language switching in unbalanced Chinese–English bilinguals. We found that voluntary language switching incurred a consistent switching cost and activated brain regions associated with inhibitory control, such as the bilateral caudate nucleus and the left inferior frontal gyrus. This provides consistent evidence for the involvement of inhibitory control in voluntary language switching. Additionally, we found both similarities and differences in the cognitive and neural mechanisms underlying voluntary and forced language switching. While switching costs were observed in both switching contexts, voluntary language switching involved fewer brain areas associated with language control.
Acknowledgments
This research is supported by the Basque Government through the BERC 2022-2025 program and the Spanish State Research Agency through BCBL Severo Ochoa excellence accreditation CEX2020-001010/AEI/10.13039/501100011033 provided to Q.X.
Author contributions
Xinyu Zhao (Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Writing—original draft, Writing—review & editing), Qihui Xu (Conceptualization, Supervision, Validation, Visualization), Haiyan Wu (Conceptualization, Supervision, Validation, Visualization), Xueping Hu (Conceptualization, Methodology, Supervision, Validation), Zhiyuan Liu (Methodology, Supervision, Validation), Lili Ming (Investigation, Methodology, Software), Zixuan Xue (Investigation, Methodology, Visualization), Chenyi Yue (Investigation, Methodology), Libo Geng (Conceptualization, Funding acquisition, Project administration, Supervision, Validation, Visualization, Writing—review & editing), and Yiming Yang (Conceptualization, Funding acquisition, Project administration, Resources, Writing—review & editing)
Funding
This work was supported by the National Social Science Fund of China (17DZ301), National Program on Key Basic Research Project (2014CB340502), and Jiangsu Qing Lan Project.
Conflict of interest statement: None declared.
References
Author notes
Libo Geng, Xinyu Zhao, and Qihui Xu have contributed equally to this work.
Qihui Xu: The majority of the work was conducted at her previous affiliation, the Basque Center on Cognition, Brain and Language, Spain.