Zhiwei Wang, Kristina Zeljic, Qinying Jiang, Yong Gu, Wei Wang, Zheng Wang, Dynamic Network Communication in the Human Functional Connectome Predicts Perceptual Variability in Visual Illusion, Cerebral Cortex, Volume 28, Issue 1, January 2018, Pages 48–62, https://doi.org/10.1093/cercor/bhw347
Abstract
Ubiquitous variability between individuals in visual perception is difficult to standardize and has thus essentially been ignored. Here we construct a quantitative psychophysical measure of illusory rotary motion based on the Pinna-Brelstaff figure (PBF) in 73 healthy volunteers and investigate the neural circuit mechanisms underlying perceptual variation using functional magnetic resonance imaging (fMRI). We acquired fMRI data from a subset of 42 subjects during spontaneous and 3 stimulus conditions: expanding PBF, expanding modified-PBF (illusion-free) and expanding modified-PBF with physical rotation. Brain-wide graph analysis of stimulus-evoked functional connectivity patterns yielded a functionally segregated architecture containing 3 discrete hierarchical networks, commonly shared between rest and stimulation conditions. Strikingly, communication efficiency and strength between 2 networks predominantly located in visual areas robustly predicted individual perceptual differences solely in the illusory stimulus condition. These unprecedented findings demonstrate that stimulus-dependent, not spontaneous, dynamic functional integration between distributed brain networks contributes to perceptual variability in humans.
Introduction
Individual differences in visual perception are ubiquitous and nuanced. In contrast to the percept of a real object, illusory perception is remarkably different between individuals (Wade and Swanston 2013). Levels of illusory perception differ among individuals in a manner dependent on stimulus configuration, and biological (e.g., gender, age, and health status) and experience-related factors (e.g., cultural environment) (Segall et al. 1963). For instance, stimulus features can account for differences in illusory visual motion perception in healthy subjects (Fraser and Wilcox 1979), whereas biological factors contribute to perceptual differences among patients with a particular brain disorder, for example, dyslexia or schizophrenia (Notredame et al. 2014; Gori et al. 2015). Because illusory percepts are diverse and difficult to quantify, the mechanism underlying perceptual variability remains largely unknown thus far. Moreover, most illusory or multistable percepts that have been extensively examined elicit an “all-or-none” subjective experience (Lumer et al. 1998; Leopold and Logothetis 1999; Brouwer and van Ee 2007; Sterzer et al. 2009), making it impossible to study varying perceptual levels in the human population. It is known that illusory perception of clockwise or counter-clockwise rotation can be induced when an observer approaches or recedes from the static Pinna-Brelstaff figure (PBF) (Pinna and Brelstaff 2000; Gurnsey and Page 2006) (Fig. 1A). We recently found that different observers report dramatically graded rotary speeds in identical stimulus conditions (Pan et al. 2016). Through quantification of perceived illusory rotation speed, the PBF test provides us with an opportunity to tap into the neural processes behind perceptual variability in visual illusion.

Figure 1. Psychophysical experiment of the Pinna illusion. (A) An example of the Pinna-Brelstaff figure. (B) Psychophysical paradigm. After a 0.5 s fixation cue, the Pinna-Brelstaff figure expanded for a duration of 1 s. Subjects then selected the physical rotation speed most closely matching the speed at which they perceived illusory rotation. (C) Distribution of mean perceived illusory rotation speeds for all 73 subjects (gray bar) and 42 subjects who participated in the fMRI experiment (black bar). Black curve: fitted probability distribution function curve for the mean perceived speed distribution following a normal distribution, μ = 11.71, σ = 4.52. (D) Distribution of varied percepts for all 73 subjects (gray bar) and for 42 subjects who participated in the fMRI experiment (black bar). Black curve: fitted probability distribution function curve for perceptual level following a normal distribution, μ = 37, σ = 18.23.
Visual information processing relevant to veridical and illusory experiences is hypothetically carried out along 2 anatomically and functionally segregated streams—the dorsal visual pathway projecting from V1 to the lateral intraparietal cortex through V3, and the ventral visual pathway projecting from V1 to the inferior temporal cortex through V2 and V4 (Felleman and Van Essen 1991; Goodale and Milner 1992; Tootell et al. 1997; Smith et al. 1998). Although the dorsal pathway is putatively designated as the “visuospatial or motion” pathway and the ventral pathway as the “form” pathway, growing evidence suggests that motion information is also processed in the ventral pathway (Nassi and Callaway 2009; Gilaie-Dotan et al. 2013). A number of visual motion-responsive areas largely encompassing the visual streams together with areas in the frontal and parietal lobes likely constitute a functional neural circuitry for motion perception (Lumer and Rees 1999; Culham et al. 2001; Burr and Thompson 2011). Furthermore, experimental studies (Lamme and Roelfsema 2000; Gaillard et al. 2009) and theoretical models (Dehaene and Changeux 2011) both suggest that recurrent processing (particularly functional recruitment of the frontal and parietal areas) in brain circuitry is crucial for conscious perception. As such, the mixture of feed forward and feedback effects embedded in widespread brain regions continues to pose a challenge for elucidating neural processes that may underlie many well-known illusory phenomena, including the Pinna illusion. There has been no evidence available until now at the whole-brain scale to address whether multiple sets of brain regions preferentially interact with one another during illusion perception.
Complex brain networks derived from functional magnetic resonance imaging (fMRI) data tend to exhibit economical and efficient organization of network “communities” or “modules” (also called “subnetworks” or “subgraphs”) (Achard and Bullmore 2007; Fair et al. 2009; Power et al. 2011; Bullmore and Sporns 2012; Power et al. 2014). A characteristic description is that dense or clustered local connectivity is wired through relatively few long-range connections mediating a short path length between any possible pair of regions (Kashtan and Alon 2005; Newman 2006; Sporns 2014; Deco et al. 2015). This prominent “small-world” topology supports both segregated and integrated information processing, minimizes wiring costs and cultivates network resilience against pathological insult (Deco et al. 2015). Recent large-scale network analyses of brain connectivity have aided in characterizing the topological dynamics of brain organization and inter-regional communication under task conditions (Cole et al. 2013; Crossley et al. 2013). Quantitative assessment of these functional connectivity patterns often relies on graph-theoretic performance metrics such as strength and efficiency of community communication at multiple scales of network constructs, denoting the capability and capacity of the network for parallel information transfer (Achard and Bullmore 2007). Importantly, modular communication adapts flexibly in accordance with brain state: for example, between-module strength increases during threat and reward states in contrast to safety and no-reward states (Kinnison et al. 2012), and varies swiftly with learning (Bassett et al. 2015). Moreover, interactions between and within modules are tightly dependent on task-specific involvement when subjects switch from resting to task mode across a multitude of different tasks (Cole et al. 2014). Basic intuition implies that illusory perception might be closely associated with an intimate interplay between stimulus-evoked networks. If so, we next ask whether and how the dynamics of network communication explain perceptual variation between individual observers perceiving the same illusory pattern.
To test this hypothesis, we assess individuals’ perceptual level of the Pinna illusion using a carefully designed psychophysical paradigm. We then acquire fMRI data from subjects during the resting state and viewing of 3 visual patterns: expanding PBF (which induces illusory rotation, henceforth referred to as illusory rotation), expanding modified-PBF (which is illusion-free, henceforth referred to as expansion only), and expanding modified-PBF with physical rotation (henceforth referred to as physical rotation). Given the intriguing role of spontaneous connectivity networks in the prediction of performance in tasks related to somatosensory perception (Boly et al. 2007), intellectual performance (van den Heuvel et al. 2009), visual perception (Zhu et al. 2011; Baldassarre et al. 2012), and individual differences in brain activity during task performance (Tavor et al. 2016), we are interested in whether a model built upon the resting-state network can predict qualitative differences in subjective perception among subjects. We construct a visual stimulus-evoked network by calculating the functional connectivity between pairs of regions identified as robustly activated by visual stimuli. Using a spectral optimization algorithm (Newman 2006; Fortunato 2010), we detect 3 network communities that are consistently active in all conditions, including resting state, with 2 communities located largely in the visual system and the third occupying part of frontal and parietal areas. With cross-validated, data-driven analysis, we demonstrate that strength and efficiency of the primary and intermediate visual communities predict individual perceptual level of the Pinna illusion only in the illusory rotation condition. These results suggest a novel mechanism that stimulus-specific, not spontaneous, dynamic functional integration between distributed brain networks contributes to perceptual variability in visual illusion.
Materials and Methods
Participants
A total of 73 subjects (age 26.48 ± 3.35 [mean ± SD]; 41 males, age 26.76 ± 3.36; 32 females, age 26.13 ± 3.37; 17 of whom took part in our previous study [Pan et al. 2016]) with no history of psychiatric or neurological illness and no metallic implants participated in the psychophysical experiment. Newly recruited volunteers whose perceptual levels (to be explained later) of the Pinna illusion fell into the medium range of the population distribution were excluded from the fMRI experiment to maintain an approximately uniform distribution of perceptual level in this subgroup; failure of general screening for the MRI procedure also resulted in study exclusion. Overall, 42 subjects (age 26.29 ± 3.16; 23 males, age 27.35 ± 3.56; 19 females, age 25.00 ± 2.03) from the original cohort underwent fMRI scanning. Psychophysical and fMRI scan protocols were approved by the Biomedical Research Ethics Committee, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, where all subjects were recruited. Subjects were required to provide informed written consent prior to the experimental procedure in accordance with institutional guidelines and the Declaration of Helsinki. All subjects had normal or corrected-to-normal vision, and were right-handed and naïve to the purpose of the study.
Psychophysical and fMRI Paradigms
The experimental setup, psychophysics paradigm and fMRI are briefly introduced here; more details can be found in our previous work (Pan et al. 2016). The expanding PBF with incongruent configuration (both rings exhibiting illusory rotation in opposite directions) was presented on a 19-in LCD monitor (Philips 190E2 plus, 1440 × 900, 60 Hz), 50 cm in front of the subject, whose head rested on a chin rest. Ten visual patterns with different physical rotation speeds were displayed immediately after the expanding rings disappeared, and subjects were asked to select the one that most closely matched their perceived speed of illusory rotation (Fig. 1B). A total of 40 conditions with 6 different viewing distances (from 60 to 160 cm) and heading speeds were tested; approaching speeds corresponding to angular velocities of the Pinna pattern on the retina ranged from 2 to 26 °/s (Supplementary Table S1). Subjects’ eye position was continuously monitored using an infrared eye tracker system (iSCAN ETL-200, ISCAN Inc.). Trials in which eye movement was greater than 2° were discarded. A block-design stimulus paradigm was employed for fMRI scanning with 3 stimulus conditions randomly interleaved: (1) the “illusory rotation (IR)” condition, in which illusory rotation was elicited upon physical expansion of a regular PBF at a viewing distance of 140 cm and a heading speed of 140 cm/s; (2) the “expansion only (EO)” condition, in which a modified PBF expanded but exhibited neither illusory nor physical rotation; and (3) the “physical rotation (PR)” condition, identical to (2), except with physical rotation of the 2 rings at 20°/s.
MRI Data Collection
This study utilized 3 MRI datasets: one (n = 17) from our previous study (Pan et al. 2016), a second, newly obtained dataset (n = 25), and a third (n = 27) from a separate study, used only to cross-validate network architecture in the resting-state condition. All datasets were acquired on a Siemens Tim Trio 3.0 T scanner (Erlangen, Germany) with a standard 32-channel phased-array head coil. During scanning sessions, participants were instructed to lie on their backs wearing ear plugs to muffle scanner noise, and to keep their heads still. Foam pads were fitted around the head to ensure stable positioning. To ensure a stable physiological state and as a safety precaution, electrocardiogram signals and heart rate were monitored online throughout the scan. For the first 2 datasets, after the calibration of eye position coordinates, subjects were required to maintain fixation on a small point (0.1° in radius) at the center of the screen. Eye position was continuously monitored at 1000 Hz during the scan using the MRI-compatible Eyelink ETL-200 eye tracker system positioned at the back of the magnet. A trigger pulse from the MRI scanner synchronized the onset of stimulus presentation with the beginning of image acquisition and the recording of eye movement. For dataset 1, each session included 2 scans: a set of structural images, and 115 volumes of resting-state functional images followed by 240 volumes of the task-fMRI experiment. Subjects were instructed to keep their eyes open and fixate on a black dot at the center of the screen throughout the resting-state scan. For dataset 2, a further 240 volumes of the task-fMRI experiment were added. MRI scan parameters were identical for datasets 1 and 2, but different for dataset 3.
For datasets 1 and 2, 3D whole-brain high-resolution T1-weighted images of each subject were acquired using the Magnetization Prepared Rapid Acquisition Gradient Echo (MPRAGE) pulse sequence (TR = 2530 ms; TE = 3.14 ms; inversion time [TI] = 1100 ms; flip angle [FA] = 7°; field of view [FOV] = 256 × 256 × 176 mm³; matrix = 256 × 256; voxel size = 1.0 × 1.0 × 1.0 mm³). T2*-weighted functional data were acquired using a gradient echo, echo-planar imaging (GE-EPI) sequence (TR = 3000 ms; TE = 30 ms; FA = 84°; FOV = 220 × 220 mm²; matrix = 88 × 88; slice thickness = 3.0 mm; 49 slices with no gap oriented parallel to the AC-PC line covering the entire brain; voxel size = 2.5 × 2.5 × 3.0 mm³). For the third dataset, 3D whole-brain high-resolution T1-weighted anatomical images were acquired using the MPRAGE sequence (TR = 2300 ms; TE = 3 ms; TI = 1000 ms; FA = 9°; FOV = 256 × 256 × 176 mm³; voxel size = 1.0 × 1.0 × 1.0 mm³). Resting-state fMRI runs were acquired for 300 volumes using the GE-EPI sequence (TR = 3000 ms; TE = 30 ms; FA = 90°; FOV = 192 × 192 mm²; matrix = 96 × 96; slice thickness = 3 mm with no gap; 47 axial slices; voxel size = 2.0 × 2.0 × 3.0 mm³).
Assessment of Subjects’ Perceptual Variability to the Pinna Illusion
A flow chart illustrating the complete pipeline of both psychophysics and fMRI data analysis is included in the Supplementary Material. The relationships between perceived illusory rotation speeds across all 40 test conditions and the corresponding initial angles and angular velocities of the PBF on the retina were characterized with a general linear model. We conducted a relative assessment to quantify each subject's perceptual level within this cohort by ranking subjects’ reports of perceived illusory rotation speed in each condition, and defined the average rank across the 40 conditions as the index of perceptual level. Rank consistency across the 40 conditions was evaluated using Kendall's coefficient of concordance, a statistic often used to assess inter-rater agreement (Kendall and Smith 1939).
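The rank-based index described above can be illustrated with a minimal sketch. It assumes perceived speeds are stored as one list per condition, assigns tied reports their average rank, and computes Kendall's coefficient of concordance without a tie correction. The function names and the data layout are illustrative assumptions, not the code used in the study.

```python
def rank_with_ties(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg
        start = end + 1
    return ranks


def perceptual_levels(speeds):
    """speeds[c][s] is the speed reported by subject s in condition c.
    Returns each subject's average rank across conditions."""
    rank_rows = [rank_with_ties(row) for row in speeds]
    n_cond = len(rank_rows)
    n_subj = len(rank_rows[0])
    return [sum(r[s] for r in rank_rows) / n_cond for s in range(n_subj)]


def kendalls_w(speeds):
    """Kendall's coefficient of concordance (no tie correction)."""
    rank_rows = [rank_with_ties(row) for row in speeds]
    m = len(rank_rows)       # conditions, playing the role of "raters"
    n = len(rank_rows[0])    # subjects, playing the role of ranked items
    totals = [sum(r[s] for r in rank_rows) for s in range(n)]
    mean_total = m * (n + 1) / 2
    s_stat = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s_stat / (m ** 2 * (n ** 3 - n))
```

With this definition the index ranges from 1 to the number of subjects, and W equals 1 when every condition orders the subjects identically and 0 when the rank totals are all equal.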
Construction of the Visual Stimulus-Evoked Network
The visual stimulus-evoked network was constructed in 4 steps: (1) Deriving activation maps in response to each of the 3 task conditions; (2) Defining network nodes; (3) Calculating edge connections; and (4) Determining a threshold sparsity of the network.
Activation maps: Surface-based classical univariate analysis was performed using the Freesurfer Functional Analysis stream package (Dale et al. 1999; Fischl et al. 1999). Functional images were motion-corrected, slice-timing corrected, registered onto the reconstructed individual cortical surface, and normalized to the “fsaverage” surface template. Surface-based smoothing was performed with 5 mm FWHM Gaussian kernels. The data were high-pass filtered (cut-off frequency: 0.0167 Hz) to remove low-frequency signal drift. First-level analysis was performed using the classical general linear model (GLM), which included 3 regressors, one per experimental condition, each convolved with a gamma function (δ = 2.25, τ = 1.25), and 6 motion correction parameters. Activation maps of 3 contrasts (illusory rotation versus baseline, expansion only versus baseline, and physical rotation versus baseline) were generated at the individual level. A mixed-effect analysis was performed to obtain group-level responses for each stimulus. Images of parameter estimate maps and residual variance maps for the contrast of interest were created for each subject (first-level analysis), and then smoothed with an 8 mm FWHM Gaussian kernel. They were then entered into second-level analysis using a one-sample t-test across subjects. Group-level activation maps (P < 0.01 after FDR correction, cluster size > 10 mm²) were imported into Caret software (http://www.nitrc.org/projects/caret) (Van Essen et al. 2001) to define network nodes. In light of recent discussions on cluster-level thresholding (Woo et al. 2014; Eklund et al. 2016; Flandin and Friston 2016), we used FDR correction to generate the activation maps. We also tested group-level activation maps using cluster-level correction (P < 0.01, cluster size > 50 mm²) to control family-wise error, and identified the same brain regions as network nodes in the present dataset. The activation maps were registered to Population-Average, Landmark- and Surface-based (PALS-B12) (Van Essen 2005) templates for visualization.
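To make the gamma-convolved condition regressors of the first-level GLM concrete, the sketch below builds one block regressor. It assumes a commonly used gamma form, h(t) = ((t − δ)/τ)^α · exp(−(t − δ)/τ) for t > δ, with δ = 2.25 and τ = 1.25 as reported. The exponent α = 2, the block onsets and duration, the 0.1 s convolution grid and the function names are assumptions made for illustration; they are not given in the text. The TR of 3 s follows the acquisition parameters.

```python
import math


def gamma_hrf(t, delta=2.25, tau=1.25, alpha=2.0):
    """Gamma-shaped response: zero up to the delay `delta`, then
    ((t - delta) / tau) ** alpha * exp(-(t - delta) / tau)."""
    if t <= delta:
        return 0.0
    x = (t - delta) / tau
    return x ** alpha * math.exp(-x)


def block_regressor(onsets, duration, n_vols, tr=3.0, dt=0.1):
    """Convolve a boxcar (1 during stimulus blocks) with the gamma response
    on a fine time grid, then sample it at each volume acquisition time."""
    n_fine = int(round(n_vols * tr / dt))
    boxcar = [0.0] * n_fine
    for onset in onsets:
        first = int(round(onset / dt))
        last = min(n_fine, int(round((onset + duration) / dt)))
        for k in range(first, last):
            boxcar[k] = 1.0
    # 30 s of kernel, scaled to unit area so a long block plateaus near 1.
    kernel = [gamma_hrf(k * dt) for k in range(int(round(30.0 / dt)))]
    area = sum(kernel)
    kernel = [h / area for h in kernel]
    step = int(round(tr / dt))
    regressor = []
    for v in range(n_vols):
        i = v * step
        n_terms = min(i + 1, len(kernel))
        regressor.append(sum(kernel[j] * boxcar[i - j] for j in range(n_terms)))
    return regressor
```

In a design with 3 conditions, one such regressor per condition would be entered into the design matrix alongside the 6 motion parameters.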
Network nodes: Regions activated in any of these 3 conditions, that is, the composite set of activation maps of all conditions, were divided into a total of 37 nodes on both hemispheres (Fig. 2 and Table 1). The biggest cluster encompassed almost the entire visual cortex and was merged with the visuotopic map (Van Essen 2004) from the PALS-B12 atlas (Van Essen 2005) to create 10 nodes including V1, V2 (V2v and V2d were merged to form 1 node), V3, VP, V3A, V4v, hMT+, lateral occipital, V7, and V8 in each hemisphere. Three additional nodes were manually defined: the putative human posterior inferior temporal (phPIT) cortex (Kolster et al. 2010; Abdollahi et al. 2014) in the lateral occipital cortex inferior to hMT+ and superior to V8, the fusiform gyrus in the ventral temporal cortex and the intraparietal sulcus (IPS) in the parietal cortex (Fig. 2A). The second largest cluster, located in the precentral sulcus was parcellated into 2 nodes separated by the upper bank of the inferior frontal sulcus. The frontal eye field and inferior frontal junction constituted the upper and lower divisions, respectively. Node labels, Montreal Neurological Institute (MNI) coordinates of central point, and area sizes are listed in Table 1.
Table 1
| No. | Node | Hemisphere | X | Y | Z | Area (mm²) | Anatomical description |
|---|---|---|---|---|---|---|---|
| 1 | V1 | L | −13.2 | −97.2 | 4.1 | 1010.9 | |
| 2 | V1 | R | 12.7 | −93.4 | 3.5 | 1186.3 | |
| 3 | V2 | L.d | −17.3 | −98 | 15 | 303.1 | |
| 3 | V2 | L.v | −11.4 | −90.4 | −8.1 | 620.4 | |
| 4 | V2 | R.d | 15.1 | −95.7 | 15.6 | 275.9 | |
| 4 | V2 | R.v | 11.2 | −87.3 | −7.9 | 668.2 | |
| 5 | V3 | L | −20.1 | −93.7 | 21.3 | 271.1 | |
| 6 | V3 | R | 18.1 | −92.9 | 21.3 | 297.7 | |
| 7 | VP | L | −21.8 | −85.7 | −11.9 | 684.7 | |
| 8 | VP | R | 22.2 | −82.5 | −10.3 | 685.8 | |
| 9 | V3A | L | −24 | −85.4 | 25.8 | 578.2 | |
| 10 | V3A | R | 23 | −86.4 | 28.5 | 535.6 | |
| 11 | V4v | L | −30.8 | −74.4 | −10.9 | 690.2 | |
| 12 | V4v | R | 31.5 | −70.7 | −11.1 | 540 | |
| 13 | LO | L | −43.6 | −81.7 | 8.5 | 379 | Lateral occipital cortex |
| 14 | LO | R | 43.7 | −78.8 | 7.7 | 471.5 | |
| 15 | hMT+ | L | −48.6 | −66.1 | 6.5 | 450.5 | Human middle temporal complex |
| 16 | hMT+ | R | 48.1 | −61.6 | 7.6 | 531.4 | |
| 17 | phPIT | L | −49.3 | −65.3 | −7.4 | 455.4 | Putative human posterior inferior temporal cortex |
| 18 | phPIT | R | 48.4 | −65.6 | −7.2 | 445.3 | |
| 19 | V7 | L | −26.1 | −72.5 | 31.9 | 612.8 | |
| 20 | V7 | R | 28 | −72.4 | 32.6 | 724.3 | |
| 21 | V8 | L | −42.3 | −61.5 | −16.1 | 779.8 | |
| 22 | V8 | R | 43 | −57.8 | −17.2 | 793.3 | |
| 23 | FG | L | −34 | −49.3 | −20.7 | 477.7 | Fusiform gyrus |
| 24 | FG | R | 34.5 | −43.6 | −22.3 | 440.3 | |
| 25 | IPS | L | −28.2 | −57.2 | 49.7 | 877.3 | Intraparietal sulcus |
| 26 | IPS | R | 27 | −56.7 | 52.3 | 1130.8 | |
| 27 | SMG | L | −46.4 | −38.3 | 23.4 | 109.6 | Supramarginal gyrus |
| 28 | SMG | R | 59.9 | −37.9 | 16.3 | 305.9 | |
| 29 | FEF | L | −41.9 | −3.3 | 48.5 | 510.2 | Frontal eye field |
| 30 | FEF | R | 41.1 | −2.2 | 49.2 | 676.6 | |
| 31 | IFJ | L | −50.2 | −0.2 | 36.9 | 347.3 | Inferior frontal junction |
| 32 | IFJ | R | 47.1 | 5.6 | 29.4 | 652.9 | |
| 33 | SMA | L | −6 | 3.4 | 60.8 | 184.4 | Supplementary motor area |
| 34 | SMA | R | 6 | 4.6 | 65.1 | 57.0 | |
| 35 | PCL | L | −4.7 | −31.9 | 62.7 | 31.4 | Paracentral lobule |
| 36 | STS | L | −52.1 | −48.2 | 10.9 | 42.2 | Superior temporal sulcus |
| 37 | IPL | L | −48.7 | −62.2 | 33.5 | 82.2 | Inferior parietal lobe |
X, Y, and Z are MNI standard coordinates (mm), and area stands for the surface area size of individual node. Note that the dorsal and ventral divisions of V2 were merged during data analysis to form 1 node. L, left hemisphere; R, right hemisphere.

Figure 2. Group-level activation map of all 3 conditions and topological depiction of community organization. (A) Merged group-level activation map of 3 stimulus conditions. The color bar indicates conjunct conditions. Activation threshold: P < 0.01 after FDR correction, cluster size > 10 mm². The defined network nodes are outlined with black boundaries. For the list of nodes, see Table 1. (B) Community organization and topology. The primary visual community mainly contains low-level visual regions (blue), the intermediate visual community includes high-level visual regions (green), and the top community spans the frontal and association cortical regions (red). (C) Community layout of the network in the illusory rotation condition at K = 8 (with a corresponding sparsity of 0.31). Visualization was performed using Pajek 4.03 software (http://mrvar.fdv.uni-lj.si/pajek/).
Calculating edge connections: Preprocessing of stimulus-dependent and resting-state functional connectivity was performed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm) and Matlab software (MATLAB 2012a, The MathWorks, Inc.). Preprocessing steps included slice-timing correction, motion correction, spatial normalization into MNI space, reslicing to 2 × 2 × 2 mm³ voxels and smoothing with a Gaussian kernel (FWHM = 4 mm). Here, the connectivity analysis was performed on a 3D volume rather than a 2D surface (on which fMRI activation maps were generated), requiring the use of a different smoothing kernel with twice the voxel size of the resliced images. Nuisance time series (6 head motion parameters, ventricle and white matter signals) were removed from the smoothed volumes using linear regression. Linear drift was then removed from the volumes, which were subsequently temporally filtered. Given the complex impact of global signal regression (Murphy et al. 2009; Weissenbacher et al. 2009), this procedure was not performed here. For resting-state data, the first 10 volumes were discarded to avoid effects of system instability and environmental adaptation, and a band-pass filter (0.01–0.08 Hz) was used. For stimulus-dependent data, a high-pass temporal filter (>0.01 Hz) was applied. A low-pass temporal filter was not applied at this stage due to the increased likelihood of task signals at higher frequencies as compared with relatively slower resting-state fluctuations (Cole et al. 2013).
Task functional connectivity between each pair of nodes was computed as a partial correlation between their psychophysiological interaction (PPI) terms:

$$r_{ijk} = \mathrm{partialcorr}(I_{i,k},\, I_{j,k},\, G)$$

where $r_{ijk}$ denotes the partial correlation coefficient between nodes $i$ and $j$ for the $k$th of the 3 task conditions, that is, the task functional connectivity between nodes $i$ and $j$ in the $k$th condition. “partialcorr” denotes calculation of the partial correlation coefficient between $I_{i,k}$ and $I_{j,k}$, the PPI terms for node time courses $t_i$ and $t_j$, respectively, for the $k$th of the 3 task conditions. The PPI term reflects the psychophysiological interaction between the seed region's activity and the specified experimental manipulation (Friston et al. 1997): $I_{i,k} = t_i \times \Psi_k$, where $\Psi_k$ denotes the task regressor derived from the timing of task $k$ convolved with the canonical hemodynamic response function in the SPM8 toolbox. $G$ denotes a matrix whose columns contain covariates of no interest, that is, the 6 head motion parameters. Three 37 × 37 association matrices were created for each subject.
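A minimal sketch of this partial correlation follows, under the assumption that it is obtained by regressing an intercept and the columns of G out of both PPI terms and then correlating the residuals. The function names are illustrative and the code is not the SPM8/MATLAB implementation used in the study.

```python
import math


def residualize(y, covariates):
    """Remove an intercept and the covariate columns from y by projecting
    onto an orthonormal basis built with modified Gram-Schmidt."""
    n = len(y)
    basis = []
    for col in [[1.0] * n] + [list(c) for c in covariates]:
        v = list(col)
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:  # skip columns that add no new direction
            basis.append([vi / norm for vi in v])
    r = list(y)
    for b in basis:
        dot = sum(ri * bi for ri, bi in zip(r, b))
        r = [ri - dot * bi for ri, bi in zip(r, b)]
    return r


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def ppi_partial_corr(t_i, t_j, psi_k, G):
    """r_ijk: correlation between the PPI terms I_i = t_i * psi_k and
    I_j = t_j * psi_k after removing the covariate columns in G."""
    I_i = [a * p for a, p in zip(t_i, psi_k)]
    I_j = [b * p for b, p in zip(t_j, psi_k)]
    return pearson(residualize(I_i, G), residualize(I_j, G))
```

Here `G` is passed as a list of columns, one per head motion parameter; repeating the call over all node pairs and the 3 conditions would fill the three 37 × 37 matrices.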
Resting-state functional connectivity was estimated using Pearson's correlations between the time courses of each pair of nodes. A 37 × 37 association matrix was created for each subject. In addition to the present resting-state dataset (n = 42), we used another resting-state dataset of 27 subjects acquired in a separate study to construct a resting connectivity matrix and detect its functional architecture as external validation.
Determination of network threshold: As there is no consensus on the selection of network threshold in graph theoretical analysis, network properties are often explored over a range of plausible thresholds (Bullmore and Sporns 2009; Bullmore and Bassett 2011). Global thresholding, by which any edge with a weight below the threshold is set to zero, is usually applied to construct functional brain networks. A drawback of global thresholding is that the graph becomes disconnected at sparse densities (i.e., at higher thresholds), which can affect the quantitative values of many network metrics. Given that the present visual stimulus-evoked network consisted of a small set of brain regions, we opted to utilize a local thresholding approach that guarantees graph connectedness even at sparse densities (Alexander-Bloch et al. 2010). This method combines the concepts of the minimum spanning tree (Kruskal 1956) and the K nearest neighbor graph (Paterson and Yao 1992). As this local thresholding method may lead to an asymmetric matrix, we mirrored the matrix along the main diagonal to maintain symmetry. We varied K over the range of 6–13, which corresponds to a typical network sparsity range of 0.2–0.5 (Bullmore and Bassett 2011).
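The local thresholding idea can be sketched as follows. This is a simplified illustration rather than the exact procedure of Alexander-Bloch et al. (2010): it assumes a symmetric matrix of connectivity weights in which larger values mean stronger connections, builds the spanning tree with Prim's algorithm instead of Kruskal's (both yield a spanning tree of maximal total weight, and the same tree when all weights are distinct), and then adds each node's K strongest connections, mirroring every added edge to keep the matrix symmetric. The function name `local_threshold` is hypothetical.

```python
def local_threshold(w, k):
    """Binary adjacency from spanning tree plus K nearest neighbours.

    w: symmetric n x n list of lists of connectivity weights (larger = stronger).
    k: number of strongest neighbours retained for every node.
    """
    n = len(w)
    adj = [[0] * n for _ in range(n)]

    # Step 1: spanning tree over the strongest connections (Prim's algorithm),
    # which guarantees that the final graph is connected.
    in_tree = [False] * n
    in_tree[0] = True
    best = [(w[0][j], 0) for j in range(n)]  # (weight, tree node) for each node
    for _ in range(n - 1):
        j = max((c for c in range(n) if not in_tree[c]), key=lambda c: best[c][0])
        i = best[j][1]
        adj[i][j] = adj[j][i] = 1
        in_tree[j] = True
        for m in range(n):
            if not in_tree[m] and w[j][m] > best[m][0]:
                best[m] = (w[j][m], j)

    # Step 2: add each node's k strongest neighbours; writing both adj[i][j]
    # and adj[j][i] mirrors the matrix along the main diagonal.
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: w[i][j], reverse=True)
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1
    return adj
```

In this sketch, nearest-neighbour edges alone could leave two tightly coupled pairs of nodes disconnected from each other; the spanning-tree step supplies the bridging edge.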
Community Organization and Communication Analysis
Community communication: To fully explore the relationship between community communication and individual perceptual differences, analyses were performed at 3 scales of network constructs: community, node, and edge.
Edge level: At the individual edge level, no sparsity threshold was applied to the original matrices. After the original matrices were Fisher's Z-transformed, we calculated the Pearson's correlation coefficient between each Z value and perceptual level of each subject at a P value threshold of 0.01. Edges with a significant Pearson's correlation for the illusory rotation condition were binarized to form a new graph. We then calculated the percentage of within- and between-community edges in this graph and extracted the edges between primary and intermediate visual communities to construct a subgraph and calculate its node degree (the sum of a node's edges), and corresponding percentage of the total degree.
Statistical Analysis
Pearson's correlations between the perceptual level of each subject and network metrics were used to evaluate the relationship between network communication and individual perceptual differences. The P-value for Pearson's correlation coefficients was computed using a Student's t distribution to transform the correlation. The statistical significance threshold was set at P < 0.05 after correction for multiple comparisons, unless otherwise specified in the text.
At the community and node levels, we used the area under the curve (AUC) of the network metrics within the selected sparsity range to calculate Pearson's correlation coefficient with perceptual level. To reduce the occurrence of false positives due to multiple comparisons at the community level, P < 0.05 after Bonferroni correction (P < 0.05/6 (3 within- and 3 between-community communications)) was applied for assessment of statistical significance. We utilized a leave-one-out cross-validation procedure to test the robustness of the relationship between perceptual level and network communication (strength or efficiency between the primary and intermediate visual communities) at the community level. Using ordinary least squares, a linear model relating within- and between-community strength and efficiency to observed perceptual level was constructed in each set of n-1 individuals to predict each left-out individual's perceptual level based on his or her within- and between-community strength or efficiency. A Pearson's correlation between predicted and observed perceptual levels was obtained as a measure of predictive power for each metric. Considering that the predictive power on the basis of the present linear model should be positive, a one-tailed Student's t-test was performed against the null hypothesis that predictive power was not greater than zero. We also ran a permutation test with 5000 permutations and obtained results similar to those of the t-test. This procedure was then applied to results at the node level, where nodal communication correlated negatively with individual differences in perception. We considered P < 0.05 after FDR correction (Pk < 0.05*k/(37*3), where k = 1, 2, 3, … ; note that there were 37 nodes here) to be statistically significant.
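The leave-one-out prediction step can be illustrated with a minimal sketch. For simplicity it assumes a single predictor (for example, the AUC of between-community efficiency), whereas the models described above may include several predictors; the function names are hypothetical and any input values are the caller's own data.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def loocv_predictions(metric, perceptual_level):
    """Predict each subject's perceptual level from a model fit on the other n-1."""
    predictions = []
    for i in range(len(metric)):
        train_x = metric[:i] + metric[i + 1:]
        train_y = perceptual_level[:i] + perceptual_level[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * metric[i])
    return predictions

def predictive_power(metric, perceptual_level):
    """Correlation between observed and leave-one-out predicted perceptual levels."""
    return pearson(loocv_predictions(metric, perceptual_level), perceptual_level)
```

Because each prediction comes from a model that never saw the left-out subject, a metric unrelated to perception tends to yield a predictive power near or below zero, which is why a one-tailed test against zero is a natural choice here.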
To correct for multiple comparisons between edge connections, we used a network-based statistical approach corresponding to cluster-level correction for control of false positives (Zalesky et al. 2010). Edges with a Pearson's correlation P value lower than the threshold (P = 0.01) were defined as suprathreshold connections and decomposed into several connected components. To estimate the significance of each clustered component, the null distribution of connected component size was empirically derived using a nonparametric permutation approach (10 000 permutations). For each permutation, psychophysical data of individual percepts were randomly shuffled, and the Pearson's correlation coefficient was recalculated for each edge. The same threshold (P = 0.01) was then applied to define the suprathreshold connections, and maximal component size in the set of suprathreshold links was recorded. The statistical significance of a connected component with size S was computed as the percentage of the null distribution that had a maximal component size larger than S. P values of each connected component (i.e., cluster) after cluster-level correction were reported and the significance threshold was set at a corrected level of P = 0.05.
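The cluster-level permutation logic can be sketched as below. This is a simplified illustration, not the reference implementation of Zalesky et al. (2010): it assumes that the P = 0.01 edge threshold has already been converted to a critical absolute correlation `r_crit` for the sample size at hand, measures component size as the number of suprathreshold edges in a component, and tests only the largest observed component. All function and parameter names are hypothetical.

```python
import math
import random

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return sxy / denom if denom > 0 else 0.0

def max_component_size(edges, n_nodes):
    """Largest number of edges found in any connected component (union-find)."""
    parent = list(range(n_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(i)] = find(j)
    counts = {}
    for i, _ in edges:
        root = find(i)
        counts[root] = counts.get(root, 0) + 1
    return max(counts.values(), default=0)

def suprathreshold_edges(conn, behaviour, r_crit):
    """conn maps an edge (i, j) to its per-subject connectivity values."""
    return [e for e, vals in conn.items() if abs(pearson(vals, behaviour)) > r_crit]

def nbs_pvalue(conn, behaviour, n_nodes, r_crit, n_perm=1000, seed=0):
    """Observed largest component size and its permutation-based P value."""
    rng = random.Random(seed)
    observed = max_component_size(suprathreshold_edges(conn, behaviour, r_crit), n_nodes)
    shuffled = list(behaviour)
    larger = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between subjects and behaviour
        null_edges = suprathreshold_edges(conn, shuffled, r_crit)
        if max_component_size(null_edges, n_nodes) > observed:
            larger += 1
    return observed, larger / n_perm
```

Recording only the maximal component size in each permutation is what controls the family-wise error rate at the cluster level: an observed component is compared with the largest component that chance alone produces anywhere in the network.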
Results
Psychophysical Measurements of the Pinna Illusion
We recruited 73 healthy volunteers and quantitatively evaluated their susceptibility to the illusory rotation induced by the PBF (Fig. 1A). Perceptual susceptibility of each subject was assessed using expansion of the figure (Fig. 1B), as described in our previous work (Pan et al. 2016). Increasing the diameter of rings in the pattern produces an illusory effect equivalent to that observed when approaching the figure. To measure the illusory speed of rotation, we tested 40 angular speeds of the PBF on the retina, ranging from 2 to 26°/s (Supplementary Table S1). We found that the speed of illusory rotation (S_perceived) averaged over all subjects positively correlated with the angular speed of the Pinna pattern (A_speed) and negatively correlated with initial angles of the Pinna pattern on the retina (A_initial) (S_perceived = I + α*A_speed + β*A_initial, I = 10.25, P_I = 5.54e-17, α = 0.62, P_α = 1.5e-21, β = −1.94, P_β = 7.06e-8, adjusted R² = 0.91, where adjusted R² is a modified version of R² adjusted for the number of regressors in the model). Results of all subjects but one (adjusted R² = 0.11) adhered to a linear model (mean adjusted R² = 0.79, minimal adjusted R² = 0.42) (Supplementary Fig. S1). Averaged illusory rotation speed followed a normal probability distribution in all participants (μ = 11.71, σ = 4.52, P = 0.942, one-sample Kolmogorov–Smirnov test) (Fig. 1C).
We obtained an average score for each participant representing the corresponding perceptual level in this cohort according to the relative rank (between 1 and 73) of the perceived rotary speed reported by subjects. Kendall's coefficient of concordance, which is used to assess inter-rater agreement between all ranks, was 0.74 (P = 0, χ² test), demonstrating that the defined rank was highly consistent across the 40 speed conditions. Perceptual levels, represented by this averaged rank, of all 73 participants also followed a normal distribution (μ = 37, σ = 18.23, P = 0.609, one-sample Kolmogorov–Smirnov test) (Fig. 1D). Overall, 42 subjects with uniformly distributed perceptual levels (P = 0.66, two-sample Kolmogorov–Smirnov test) from the original cohort were screened to undergo fMRI; distributions of mean perceived illusory rotation speeds and perceptual levels are shown in Figure 1C and D.
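For readers unfamiliar with the concordance measure, the following minimal sketch computes Kendall's W using the standard formula without tie correction, W = 12S / (m²(n³ − n)), where m is the number of rankings (here assumed to be one per speed condition), n is the number of ranked items (subjects), and S is the sum of squared deviations of the per-item rank sums from their mean. Whether the original analysis applied a tie correction is not stated, so this is an assumption; the function name is hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance (no tie correction).

    rankings: list of m rankings; each ranking lists the rank (1..n) given to
    every one of the n items, in the same item order.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2  # expected rank sum per item
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

W equals 1 when all rankings agree perfectly and 0 when the rank sums are identical across items, so a value of 0.74 indicates that subjects kept broadly similar relative positions across conditions.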
Functional Architecture of the Visual Stimulus-evoked Network
We began by conducting group-level analysis of the fMRI response to visual stimulation and found that the 3 stimuli activated roughly the same brain regions (Dice coefficient = 0.95, Fig. 2A and Supplementary Fig. S2). Activated regions spanned almost the entire visual system and some areas in the frontal and parietal cortices including the bilateral frontal eye field, inferior frontal junction, supplementary motor cortex, supramarginal gyrus, left inferior parietal lobe and left paracentral lobule, and, in the temporal cortex, the left superior temporal sulcus (Fig. 2A). Based on these activation maps, we parcellated these regions into 37 nodes (Fig. 2A and Table 1) to construct a visual stimulus-evoked network for each stimulus condition. These 37 identified nodes were also used to obtain a connectivity network for the resting-state condition.
We next sought to determine whether there existed a functionally segregated architecture in these group-level connectivity matrices (Supplementary Fig. S3). As illustrated in Figure 2B, we detected 3 communities in this stimulus-evoked network, all of which were consistently present across stimulus conditions but differed slightly from those in the resting-state condition (e.g., the hMT+ belongs to the top module). Community charts contain communities whose composition remains rather constant over a wide range of network thresholds (Supplementary Fig. S4). Note that the modular architecture of the functional connectivity network in the resting state highly resembles that of stimulus conditions (Supplementary Fig. S4). The primary visual module consisted of the bilateral V1-V4v, lateral occipital cortex, and left inferior parietal lobe. The intermediate visual module was composed of the bilateral V7, V8, hMT+, phPIT cortex (Kolster et al. 2010), the fusiform gyrus and IPS, and the top module included the bilateral frontal eye field, inferior frontal junction, supplementary motor area, supramarginal gyrus, left paracentral lobule and left superior temporal sulcus. As the group-level graph represented the best modular partition of a group of subjects (Fair et al. 2009; Kinnison et al. 2012), we assigned the group-level partition to the stimulus-evoked network of each subject for subsequent analysis. Group-level modular organization of the identified network for the illusory rotation condition at sparsity = 0.31 is shown as an example in Figure 2C.
Network Correlates of Perceptual Level
We estimated Pearson's correlations to explore the relationships between individual perceptual level and graph performance metrics including strength and efficiency at 3 network construct scales: community, node, and edge. We found that strength and efficiency between the primary and intermediate visual communities significantly correlated with the perceptual level of subjects only in the illusory rotation condition (strength: r = −0.41, P = 0.007; efficiency: r = −0.46, P = 0.002, Bonferroni corrected, Table 2). These results were independent of the sparsity at which the network was constructed (Supplementary Fig. S5). To validate the results, a leave-one-out cross-validation procedure was employed. After leaving one subject out, the Pearson's correlation coefficient between perceptual level of the remaining n-1 subjects and communication between primary and intermediate visual communities was calculated. The results showed that all the correlations between the perceptual levels and network efficiency or strength were consistently statistically significant (Fig. 3A). To further determine whether communication efficiency and strength could predict perceptual levels of novel individuals, simple linear models were constructed relating between- and within-module strength or efficiency at each experimental condition to perceptual level in each set of n-1 individuals. These models were then used to predict the left-out individual's perceptual level. Pearson correlations between observed and predicted perceptual levels were used to reflect predictive power.
Observed and predicted perceptual levels were significantly correlated for communication efficiency (r = 0.38, P = 0.006, one-tailed t-test) and for communication strength (r = 0.30, P = 0.025, one-tailed t-test) between the primary and intermediate visual communities only in the illusory rotation condition (Fig. 3B and Supplementary Table S2), demonstrating that dynamic network communication can be used to predict perceptual level in novel individuals. There was no significant difference between the 2 predictive powers (Steiger's Z = 0.90, P = 0.369). The predictive power for strength and efficiency combined in a general linear model was also significant (r = 0.33, P = 0.017), but lower than that for efficiency only (Steiger's Z = −3.04, P = 0.002) and not significantly different from that for strength only (Steiger's Z = 0.33, P = 0.741). Prediction of individual perceptual level of the Pinna illusion was significant solely for the illusory rotation condition, and not for the other 3 conditions (physical rotation, expansion only, and resting state), indicating strong stimulus specificity and state-dependence.
Correlations between measures of network communication (community strength and efficiency) and perceptual level
| Community communication | Community | Strength: IR | Strength: PR | Strength: EO | Strength: REST | Efficiency: IR | Efficiency: PR | Efficiency: EO | Efficiency: REST |
|---|---|---|---|---|---|---|---|---|---|
| Within community | Primary | −0.18 (0.242) | −0.17 (0.275) | −0.19 (0.217) | −0.33 (0.035) | −0.23 (0.134) | −0.23 (0.146) | −0.23 (0.142) | −0.28 (0.069) |
| Within community | Intermediate | −0.25 (0.112) | −0.01 (0.958) | −0.17 (0.280) | −0.20 (0.193) | −0.28 (0.075) | −0.02 (0.911) | −0.21 (0.188) | −0.22 (0.153) |
| Within community | Top | −0.03 (0.839) | −0.09 (0.561) | −0.09 (0.590) | 0.13 (0.412) | −0.10 (0.515) | −0.11 (0.502) | −0.11 (0.499) | 0.03 (0.863) |
| Between community | Primary–intermediate | **−0.41 (0.007)** | −0.20 (0.201) | −0.11 (0.489) | 0.08 (0.610) | **−0.46 (0.002)** | −0.24 (0.128) | −0.14 (0.381) | −0.06 (0.682) |
| Between community | Primary–top | −0.08 (0.593) | 0.12 (0.445) | 0.02 (0.903) | −0.17 (0.281) | −0.23 (0.148) | −0.06 (0.710) | −0.14 (0.361) | −0.19 (0.223) |
| Between community | Intermediate–top | −0.11 (0.478) | −0.02 (0.921) | −0.00 (0.998) | −0.15 (0.329) | −0.19 (0.217) | −0.07 (0.679) | −0.05 (0.754) | −0.18 (0.246) |
Values indicate the Pearson's correlation coefficient (with P value in parentheses) between AUC of modular strength or efficiency and perceptual level. Bold font, statistically significant correlation after Bonferroni correction for multiple comparisons.

Network communication models predict visual perception performance. (A) Robust correlations between perceptual levels and communication efficiency (left) and strength (right) between the primary and intermediate visual communities during a leave-one-out cross-validation procedure. (B) Scatter plots show correlations between the observed perceptual levels and predictions by general linear models that take into account communication efficiency or strength between primary and intermediate community. Network models were iteratively computed on network communication metrics of the illusory rotation condition and observed perceptual level from each set of n-1 subjects, then tested on the network data of the left-out individual.
To pinpoint whether specific brain areas in these communities contribute preferentially to differential illusory percepts across subjects, we probed differences in community strength and efficiency of each node as a function of perception level. For the illusory rotation condition, node efficiency between the primary and intermediate visual communities at the bilateral V3, right V3A, right V8, and left IPS negatively correlated with individual perceptual level of the Pinna illusion (P < 0.05, FDR corrected, Fig. 4 and Supplementary Fig. S6). Node efficiency within the intermediate visual module inversely correlated with the perceptual levels of all subjects only for the right phPIT (P < 0.05, FDR corrected, Fig. 4 and Supplementary Fig. S6). In other conditions including physical rotation, expansion only and resting-state, we found that between- and within-community efficiency of all brain nodes showed no significant relationship with perceptual level of the Pinna illusion. Similarly, no significant correlation was observed between the perceptual level and between- and within-community strength of individual nodes in all conditions (Supplementary Fig. S6).

Statistical correlation between node efficiency and perceptual level. Node efficiencies correspond to the AUC at selected sparsities. Only negative correlations between node efficiency and perceptual level are statistically significant (after FDR correction) at multiple areas, indicated by asterisks. IR, illusory rotation; PR, physical rotation; EO, expansion only; REST, resting state.
We next asked whether specific functional connections within these communities were associated with the perceptual variation observed among subjects. Hence, an edge-level analysis was applied to the original matrices without a priori threshold setting, in which we calculated the Pearson's correlation coefficient between each edge connection and the perceptual levels of all subjects. In the illusory rotation condition, 46 edges (considerably above chance) were found to correlate with individual differences in perception (P = 0.011, cluster-level correction for multiple comparisons), illustrated in Figure 5A–D together with the other 3 conditions (see Supplementary Table S3 for detailed information). These connections were predominantly (63.1%) confined to the primary and intermediate visual modules (Fig. 5E), consistent with the community-level observation (Table 2). The heterogeneity of this observation can be quantified by computing the degree of each node, defined as the number of edges emanating from that area. We extracted the edges between the primary and intermediate visual communities in Figure 5A (indicated by cyan lines) and calculated the degree of the linked nodes. The top 10 nodes were the right V3, right phPIT, left V7, right V3A, left V3, right V7, left V3A, right V8, and bilateral IPS (Fig. 5F and Supplementary Table S4), in line with the results of the node-level analysis. By contrast, only a small number of functional connections correlated, though not significantly after cluster-level correction, with individual perception in the physical rotation (12 connections forming 2 clusters, P = 0.086 and 0.135 (cluster-level correction) for the 2 clusters, respectively), expansion only (3 connections forming 1 cluster, P = 0.176, cluster-level correction) and resting-state conditions (1 connection, P = 0.357, cluster-level correction) (Fig. 5B–D).

Edge-level analysis. Edges with a significant correlation (P < 0.01) to perceptual level in the illusory rotation condition (A), physical rotation condition (B), expansion only condition (C) and resting state (D) are shown. Edge width indicates –log10(P), where P is the significance value of the Pearson's correlation coefficient between perceptual level and original stimulus-dependent functional connectivity. See Supplementary Table S3 for detailed information about significantly correlated edges in the illusory rotation condition. (E) Distribution of significantly correlated edges in the illusory rotation condition by community. (F) Rank of node degree in the illusory rotation condition. Sectors reflect each node's degree as a percentage of total degree. See Supplementary Table S4 for more details of node size and percentage of degree.
Discussion
The present study, for the first time, tests the hypothesis that an interplay between communities within the visual stimulus-evoked network architecture may contribute to individual differences in the perception of a prominent visual illusory motion. We first quantified individuals' differential percepts of the Pinna illusion, and then collected fMRI data during both resting and stimulus conditions with 2 comparable visual patterns as controls. In this brain-wide exploration of the circuit mechanisms underlying perceptual variability, we used network analysis to map data-derived functional modules of stimulus-activated brain areas to anatomical charts, and calculated their communication efficiency and strength. We observed that the primary visual, intermediate visual, and parts of frontal, parietal and temporal areas formed 3 discrete modules that were persistently present during spontaneous activity and stimulus conditions. Notably, network communication between the primary and intermediate visual modules alone, without changes in communication involving the top module, enabled robust and significant prediction of individual perceptual differences.
It is well-recognized that a distributed but integrated circuit mediates visual perception, rather than a set of individual, specialized regions with each subserving a particular visual behavior (Fregnac and Bathellier 2015). Under 1 illusory and 2 control conditions, we detected an almost identical number of activated brain areas that constituted one large-scale visual stimulus-evoked network in which 3 segregated modules emerged. Intriguingly, we observed a closely resembling network architecture in the resting state when extracting corresponding regionwise correlations among these areas. This implies that the brain essentially maintains a stable functional segregation, in accordance with previous findings of modular or community architectures involving higher cognitive functions such as learning, memory, and emotion (Fornito et al. 2012; Kinnison et al. 2012; Laird et al. 2013). It is crucial to point out that the communities identified here reflect a prominent hierarchical segregation, in contrast to prior experimental findings (Kinnison et al. 2012; Bassett et al. 2015). In the current setting, using carefully designed visual patterns as single input entry to the visual system, we found that the primary visual module mainly included the low-level visual cortices from V1 to V4v, and the second module evidently spanned intermediate visual regions. The third consisted of several areas such as inferior frontal junction and superior temporal sulcus in the frontal, parietal and temporal lobes that are strongly implicated in a variety of cognitive functions including subjective perception, focused attention, etc. (Lumer et al. 1998; Gaillard et al. 2009; Dehaene and Changeux 2011). Note that we did not find statistically significant activation in classic regions of higher function such as the dorsolateral prefrontal cortex, and they were evidently not included in the top module. 
Although this result seems counterintuitive, it fits reasonably well with our experimental paradigm. Unlike other illusory phenomena demanding extensive functional recruitment of frontoparietal areas, the Pinna illusion is likely associated with low-level perception (Fregnac and Bathellier 2015). What is more, we found that hMT+, a region crucial for visual motion perception (Born and Bradley 2005) and mediation of Pinna illusion perception (Pan et al. 2016), belonged to different communities in the task and resting states. This area has been reported to constitute a node in the dorsal attention network in the resting state and during attention-based tasks (Corbetta et al. 2008). As such, dynamic reconfiguration of some network nodes may accompany ever-changing brain states or functions, as reflected by changes in the module membership of those brain regions in the present study. Our observation thus suggests that modular or community organization of brain machinery is context-dependent, enabling the functional repertoire and dynamic reconfiguration of network architecture to rapidly adapt to changing cognitive demands (Bullmore and Sporns 2012).
Prompted by the identical modular architecture commonly shared among the Pinna illusion and 2 physical motion stimulation conditions, in addition to the task-related nature of connections between and within modules (Kinnison et al. 2012; Cole et al. 2014), we examined whether network dynamics, beyond the static network architecture, encode perceptual variability in illusion. Recall that economical organization of brain circuitry presumably leads to high efficiency with low wiring cost (Achard and Bullmore 2007; Bullmore and Sporns 2012). Community efficiency, defined as the inverse of the harmonic mean of minimum path length, is likely to reflect the capacity for parallel information transfer between modules via multiple series of edges. Community strength, defined as the mean summed weights of all connections either between or within modules, is considered to reflect the connection capability of a network. Interestingly, both communication efficiency and strength were negatively correlated with individual perceptual level of the Pinna illusion in a stimulus-specific manner. It has been proposed that communication within and between modules promotes functional segregation and integration (Deco et al. 2015), giving rise to the vast flexibility that accommodates the full repertoire of brain functions (Kashtan and Alon 2005; Bullmore and Sporns 2012). Our results offer novel insight into the neural circuit mechanism of perceptual variability: heightened efficiency and strength of community communication could lead to a more faithful neural representation of the external physical world, that is, perception of minimal illusory rotation. Meanwhile, the present findings strongly suggest that dynamic integration of distributed networks, not circumscribed centers, may underlie interindividual differences in visual perception. Remarkably, we found that network communication at resting state was not able to predict the perceptual level of the left-out subject.
These results collectively suggest that intrinsic functional organization and inherent spontaneous network dynamics are not always predictive of brain activity in the stimulus or behavioral state, although many previous reports have found them to be (Boly et al. 2007; van den Heuvel et al. 2009; Zhu et al. 2011; Baldassarre et al. 2012; Tavor et al. 2016). We posit an alternative explanation that differential perception of illusion might be an emergent functional property of human subjects in response to external visual input, uniquely reflected by stimulus-specific network dynamics in widely distributed brain regions.
Moreover, the model based on communication between the primary and intermediate visual modules was sufficient to make a linear prediction of subjects’ perception of the Pinna illusion. Despite the prevailing view supporting the central role of the frontoparietal network in conscious perception, communication between the top module spanning the frontal, parietal and temporal areas and the other 2 modules did not significantly contribute to the perceptual level of each subject here. This suggests that information processing between the low-level and intermediate visual cortices, rather than top-down feedback modulation from the top module, is largely responsible for varied percepts between subjects. Thus, the neurobiological mechanism behind the Pinna illusion likely differs from that of multistable perception, which is thought to result from continuous interaction between the visual system and frontoparietal cortical areas (Leopold and Logothetis 1999; Lumer and Rees 1999; Sterzer et al. 2009).
We speculate that individual nodes of the distributed and modularized network tend to make somewhat differential contributions to visual perception, although the nature of the informational contribution from different brain areas remains elusive. This idea is substantiated by our in-depth analysis of nodes and edges in these communities showing that brain regions along the dorsal pathway including the bilateral V3, V3A, V7, and IPS, and the ventral pathway including the right phPIT and right V8, were predominantly associated with perceptual variation across subjects. This expands upon previous conclusions regarding the pivotal roles of V3 (Smith et al. 1998; Takemura et al. 2012), V3A (Tootell et al. 1997; McKeefry et al. 2010; Chen et al. 2016), and V7 (Brouwer and van Ee 2007) in the information processing of visual motion. It also converges with the emerging view that motion information is processed in the ventral pathway as well, which interacts with the dorsal pathway in parallel (Nassi and Callaway 2009). Furthermore, our results unveil that 2 cortical regions, phPIT (Kolster et al. 2010) and V8 (Hadjikhani et al. 1998), in the ventral stream of the right hemisphere are closely related to differential percepts of the Pinna illusion. Gilaie-Dotan et al. (2013) found that the right, but not the left, ventral visual regions are critical for different types of motion perception by examining patients with a circumscribed lesion to the ventral visual areas. These lines of evidence suggest that the involvement of the ventral pathway in visual motion processing tends to be right-lateralized. Note that only right-handed participants were included in the present study. Previous reports have shown differences in visual perception (Bryden 1973; Wang et al. 2007; Willems et al. 2014) and brain activity (Wang et al. 2007; Willems et al. 2014) between right- and left-handed individuals.
Therefore, right lateralization of the ventral pathway in visual motion processing remains to be substantiated in the left-handed human population. Notably, the middle superior temporal cortex, a subarea embedded in hMT+ that has been reported to mediate perception of the Pinna illusion (Pan et al. 2016), did not contribute to perceptual differences between individuals. We speculate that the current network modeling, which treats hMT+ as a single node, may suffer a loss of sensitivity to visual rotation, despite hMT+ as a whole being visual motion-sensitive. Furthermore, hMT+ may be crucial to the perception of the Pinna illusion per se without contributing to individual differences in that perception. Such differences would instead arise from differences in communication among regions upstream and downstream from hMT+.
In summary, we quantitatively assess the perceptual level of illusory rotation based on the PBF in a large cohort and identify 3 discrete, hierarchical communities in the visual stimulus-evoked network. We demonstrate that communication strength and efficiency between the primary and intermediate visual communities remarkably predict perceptual variability in the Pinna illusion. This result is nontrivial, considering it withstands a leave-one-out cross-validation procedure. Our efforts establish a link between performance metrics of the neural network and performance of visual perception, highlighting the mechanistic role of communication economy in brain circuitry that may underpin the heterogeneity of subjective perception in humans. Moreover, we demonstrate that the performance metrics of a neural network are specific to behavioral performance rather than intrinsic. If generalizable beyond the Pinna illusion, this strategy with large-scale network model analysis paves a new avenue of investigation into the neurocircuitry mechanisms of other perceptual and cognitive functions through exploration of the relationship between individual variability and network dynamics. Nevertheless, given the emergent nature of a particular brain function or state, the present study underscores the importance of task-based investigation of individual cognitive variability in terms of dynamic brain activity, which might not necessarily be encoded in the connectomic network at resting state.
Supplementary Material
Supplementary data is available at Cerebral Cortex online.
Funding
Hundred Talent Program of the Chinese Academy of Sciences (Technology), Strategic Priority Research Program (B) of the Chinese Academy of Sciences (XDB02050000) and National Natural Science Foundation of China Grant 81571300 (to Z.W.).
Notes
The authors thank Jingyan Mao, Wenwen Yu and Jinqiang Peng for their excellent assistance during data collection. The authors thank Drs. Muming Poo, Liping Wang, and Lothar Spillmann for their insightful comments on the article and related topics; and Drs. John Gore, Ravi Menon, Lawrence Wald, Franz Schmitt, Renate Jerecic, Thomas Benner, Kecheng Liu, Ignacio Vallines, and Hui Liu for their generous help and contributions to the construction of our custom-tuned 3T MRI facility.