Abstract

Current brain mapping methods rely heavily on the regularity, or commonality, of anatomical structure by forcing the same atlas to be matched to different brains. As a result, individualized structural information can be overlooked. Recently, we conceptualized a new type of cortical folding pattern called the 3-hinge gyrus (3HG), which is defined as the conjunction of gyri coming from three directions. Many studies have confirmed that 3HGs not only are widely present across different brains but also possess both common and individual patterns. In this work, building on the identified 3HGs, we take a further step to establish the correspondences between individual 3HGs. We developed a learning-based embedding framework that encodes individual cortical folding patterns into a group of anatomically meaningful embedding vectors (cortex2vector). Each 3HG can be represented as a combination of these embedding vectors via a set of individual-specific combining coefficients. In this way, the regularity of folding patterns is encoded into the embedding vectors, while the individual variations are preserved by the multi-hop combination coefficients. Results show that the learned embeddings can simultaneously encode the commonality and individuality of cortical folding patterns, as well as robustly infer the complicated many-to-many anatomical correspondences among different brains.

Introduction

Accumulating evidence suggests that the underlying mechanisms of brain organization are embedded in cortical folding patterns (Zilles et al. 1988; Roth and Dicke 2005; Hilgetag and Barbas 2006; Fischl et al. 2008; Dubois et al. 2008; Giedd and Rapoport 2010; Honey et al. 2010; Li et al. 2014; Holland et al. 2015). Alterations or deficits in cortical folding are strongly associated with abnormal brain structure–function, impaired cognition, behavioral disorganization (Thompson et al. 2004; Tortori-Donati et al. 2005; Bullmore and Sporns 2009; Honey et al. 2010), and various neurodevelopmental disorders (Sallet et al. 2003; Hardan et al. 2004; Harris et al. 2004; Nordahl et al. 2007). Unfortunately, quantitative and effective representation of cortical folding has been challenging due to the remarkable complexity and variability of convex gyral/concave sulcal shapes. Recently, we conceptualized a new type of brain folding pattern termed the 3-hinge gyrus (3HG) (Li et al. 2010; Chen et al. 2017), which is the conjunction of gyri coming from three directions in cortical folding. Interestingly, 3HGs are not only evolutionarily preserved across multiple primate species (Li et al. 2017), but also robustly present in human brains across different populations and brain conditions (Chen et al. 2017; Ge et al. 2018; Zhang et al. 2020c, 2020d). Previous studies have confirmed that 3HGs tend to have thicker cortices (Li et al. 2010), higher DTI-derived fiber density (Ge et al. 2018), and more pronounced connective diversity in both the structural and functional domains (Li et al. 2017, 2010; Zhang et al. 2020d). In addition, compared with other gyral regions (e.g. 2-hinge gyri), 3HGs possess significantly higher brain connectivity measures (Zhang et al. 2020d), such as degree, strength, and betweenness. All these findings suggest that 3HGs may play a key role (e.g. as hubs in the cortico-cortical connective network) in brain anatomical architecture (Zhang et al. 2020d).
Meanwhile, 3HGs can be identified conveniently from widely available T1-weighted images and can therefore serve as meso-scale anatomical landmarks of the human brain.

It is worth noting that 3HGs are identified in individual space, which means that cross-subject correspondences of 3HGs need to be constructed before conducting any population-level analysis. Zhang et al. (2020c) developed a two-view, group-wise graph matching method that uses both cortical folding patterns and DTI-derived fiber shape features to estimate 3HG correspondences. The core idea is to jointly optimize the axonal connective and anatomical topological patterns, as two views, and maximize the consistency between corresponding 3HGs on different brains. However, this method suffers from three challenges: (i) because of the group-wise optimization scheme, its robustness and computing time highly depend on the number of samples; (ii) the graph matching is conducted independently (from scratch) each time, making it difficult to generalize the 3HG correspondences obtained on training data to new brains; and (iii) the features of the two views are handcrafted and used directly to seek consistency, which weakens the method's effectiveness because handcrafted features are too inflexible to adapt to tremendous individual variations. Therefore, it is more desirable to automatically learn an intrinsic representation of folding patterns that can be used to find reliable corresponding 3HGs across different brains. More importantly, this representation should simultaneously characterize the commonality and individuality of cortical folding and generalize well to new datasets.

Recent advances in deep learning have triggered a new era in the field of representation learning, and a variety of powerful representation learning algorithms have been proposed. For example, in natural language processing (NLP), many word embedding methods have been developed to learn semantically meaningful embeddings, which have shown superior performance on various downstream tasks (Mikolov et al. 2013; Pennington et al. 2014; Devlin et al. 2018; Peters et al. 2018). Similarly, there has been a surge of graph-based embedding approaches that encode nodes/edges based on graph structure information (Perozzi et al. 2014; Tang et al. 2015; Grover and Leskovec 2016; Wang et al. 2016; Narayanan et al. 2017; Chen et al. 2018). All these remarkable works have demonstrated the superiority of learning-based embedding methods for obtaining effective representations in a latent space. Inspired by these representation learning studies, in this work we aim to design a learning-based embedding framework to encode the complex and variable cortical folding patterns into a group of anatomically meaningful embedding vectors (cortex2vector). The similarity between different embedding vectors, in turn, can appropriately represent the relations between the corresponding brain landmarks, i.e. 3HGs.

In word embedding, each single word in a sentence can be explicitly embedded via alphabet combinations, which makes it easy to build the word vocabulary. When using 3HGs as the elementary anatomical units for cortical embedding, we do not have a similar "vocabulary," since every single 3HG is unique. To solve this problem, instead of embedding each 3HG directly, we chose to learn the embeddings of the anatomical features associated with 3HGs. Our previous work (Chen et al. 2017) developed an innovative and effective algorithm to build a brain anatomical graph, named GyralNet, which automatically and accurately extracts all gyral crest lines as edges and 3HGs as nodes. Hence, for each 3HG, its location and its connections with other 3HGs within the individual GyralNet can be used as two key features for embedding. Specifically, we used the anatomical regions of interest (ROIs) (from the FreeSurfer atlas) to index the location of each 3HG. As all the 3HGs on the same hemisphere are connected by gyral crest lines, we considered multi-hop neighbors of each 3HG and built the local connections. Through this encoding, each 3HG can be represented as a hierarchical combination of multi-hop ROIs via a set of specific multi-hop combination coefficients. That is, we learned a high-dimensional embedding vector for each anatomical ROI via an autoencoder model, and these learned ROI embeddings can serve as the basic elements to represent each 3HG, like the letters of an alphabet in words. By training the proposed model in a self-supervised manner, the regularity of folding patterns is encoded into the embedding vectors and the variability is preserved by the multi-hop coefficients. Our experimental results show that the learned embeddings can encode the common patterns and the variability of 3HGs simultaneously and can accurately infer the cross-subject many-to-many correspondence under complex cortical landscapes.
Moreover, our developed learning-based framework generalizes well when applied to large-scale new datasets.

Materials and Methods

Datasets Description and Data Pre-processing

In this work, we used structural MRI (T1-weighted) of 1064 subjects from the Human Connectome Project (HCP) S1200 release. For the T1-weighted structural MRI, the imaging parameters are TR = 2.4 s, TE = 2.14 ms, flip angle = 8°, field of view (FOV) = 224 × 224 mm, and resolution = 0.7 × 0.7 × 0.7 mm³. We applied the same standard pre-processing procedures as in references (Zhang et al. 2020b, 2021, 2022) for the T1 imaging data. In brief, pre-processing steps included brain skull removal, tissue segmentation, and cortical surface reconstruction via the FreeSurfer package (Fischl 2012). The Destrieux Atlas (Fischl et al. 1999) was used for ROI labeling of the reconstructed white matter surface.

Identification of GyralNet and 3HGs

The 3HGs were identified automatically via our recently developed pipeline (Chen et al. 2017), which consists of four key steps: gyral altitude mapping, gyral crest segmentation, tree marching, and extraction of GyralNet and 3HGs.

Gyral Altitude Mapping

The gyral altitude is defined as the displacement of a vertex from its original location to a hypothetical "mid-surface," which separates gyri from sulci (Fischl et al. 1999). This "mid-surface" is chosen so that the summed displacement of all vertices from their original locations is zero. The gyral altitude mapped onto all surface vertices is shown in Fig. 1a1.
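The zero-sum convention for the "mid-surface" amounts to demeaning the vertex displacements; a tiny numpy sketch (the displacement values are hypothetical):

```python
import numpy as np

# Sketch of the "mid-surface" convention: the mid-surface is chosen so that
# the summed displacement of all vertices is zero, i.e. each vertex's gyral
# altitude is its displacement relative to the mean displacement.
raw_displacement = np.array([2.0, 1.0, -0.5, -1.5])  # hypothetical vertices
gyral_altitude = raw_displacement - raw_displacement.mean()
```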

Schematic of the proposed embedding framework. (a) Identification of GyralNet and 3HGs. a1 White matter cortical surface mapped by gyral altitude; a2 segmentation of the gyral crest (white regions) from the sulcal basins (colored regions); a3 connecting the gyral crest regions into a connected graph by tree marching (black curves); a magnified view of the circled patch is displayed between a2 and a3; a4 pruning the redundant branches to preserve the main trunk of the graph (black curves), i.e. GyralNet; a5 identification of 3HGs (labeled by green bubbles). (b) 3HG multi-hop feature encoding. b1 Parcellating the entire cortex into 75 ROIs via the Destrieux Atlas and assigning each 3HG an ROI label as its node feature; b2 numerically representing each ROI label by one-hot encoding; b3 by considering multi-hop neighbors, 3HGs are encoded by multi-hop features. (c) The proposed learning-based embedding framework (details can be found in Section 2.4).
Figure 1


Gyral Crest Segmentation

The watershed algorithm (Bertrand 2005) was applied to the gyral altitude map in Fig. 1a1 to separate the gyral crest (regions above a predefined gyral altitude level) from the sulcal basins (regions below that level). The obtained gyral crests (white) and sulcal basins (labeled by different colors) are displayed in Fig. 1(a2). More details about the watershed segmentation process can be found in (Chen et al. 2017).

Tree Marching

A distance transform was first applied to the gyral crest regions to assign a distance value to each vertex and highlight the centers of the gyral crests. The distance was defined as the displacement from the vertex of interest to the boundaries between the gyral crest regions and the sulcal basins. As a result, a field with a decreasing gradient from the gyral crest centers to the boundaries was generated. A tree marching algorithm was then applied to the distance map to connect vertices from the crest centers to the boundaries. A tree root was placed at each gyral crest center and progressively connected to other vertices following the descending gradient of the distance map until the boundaries between the gyral crest regions and the sulcal basins were reached. During this process, when two trees met, connections were made between them. In this way, the gyral crests on the same hemisphere were connected into a graph. The constructed graph structure is shown by the black curves in Fig. 1a3, and a zoomed-in view of a circled area is displayed between Fig. 1a2 and Fig. 1a3.

Extraction of GyralNet and 3HGs

During the tree marching process, all the vertices in the gyral crest regions were connected, and some redundant branches were generated. We trimmed these redundant branches when their length was shorter than a predefined threshold. The main trunks of the graph structure were preserved, and this trimmed graph is named GyralNet (black curves in Fig. 1a4). The conjunctions with three branches on the GyralNet were defined as 3HGs (green bubbles in Fig. 1a5).

Multi-Hop Features Encoding

By taking 3HGs as nodes, the GyralNet can be represented as an undirected graph. Let |$\mathcal{G}=(\mathcal{V},\mathcal{E})$| denote the undirected graph, where |$\mathcal{V}=\{{v}_1,{v}_2,\cdots, {v}_N\}$| is the set of |$N$| 3HGs, and |$\mathcal{E}\subseteq \{\{{v}_i,{v}_j\}|{v}_i,{v}_j\in \mathcal{V}\}$| is the set of unweighted edges, which are the gyral crest lines connecting 3HGs. Its adjacency matrix is denoted by |$\mathcal{A}=[{a}_{i,j}]\in{R}^{N\times N}$|, where |${a}_{i,j}=1$| if there is a connection between |${v}_i$| and |${v}_j$|, and |${a}_{i,j}=0$| otherwise. We conducted ROI labeling via the Destrieux Atlas and divided the whole surface into 75 ROIs. Each 3HG was assigned an ROI label as its node feature (Fig. 1b1). We numerically represented the ROI labels by one-hot encoding; i.e. the |${k}^{th}$| label was denoted by |${e}_k\in{R}^{75}$| with 1 in the |${k}^{th}$| location and 0 elsewhere. Accordingly, the |${i}^{th}$| 3HG in the |${k}^{th}$| ROI can be denoted by |${x}_i={e}_k$|. So far, the undirected graph of 3HGs can be represented by the adjacency matrix |$\mathcal{A}$| and the feature matrix |$\mathcal{X}=\{{x}_1;{x}_2;\cdots; {x}_N\}\in{R}^{N\times 75}$| (Fig. 1b2). Based on these two matrices, the two key components of multi-hop feature encoding are defined as follows.

|${l}^{th}$| hop features: In a 3HG graph, the |${l}^{th}$| hop neighborhood of 3HG |$i$| is the set of 3HGs connected to 3HG |$i$| via a shortest path of |$l$| steps (|$l$|-hop), denoted by |${N}_l(i)$|. For |${v}_j\in{N}_l(i)$|, its feature vector is denoted by |${x}_j$|, and accordingly the |${l}^{th}$| hop features of 3HG |$i$| are defined as |$\sum_{v_j\in{N}_l(i)}{x}_j$|. Given the adjacency matrix |$\mathcal{A}$| and the feature matrix |$\mathcal{X}$|, the |${l}^{th}$| hop features of 3HG |$i$| can be calculated as |${[{\mathcal{A}}^l\mathcal{X}]}_{i,\ast }$|, where |${\mathcal{A}}^l$| is the |${l}^{th}$| power of |$\mathcal{A}$|. As the adjacency matrix |$\mathcal{A}$| defines the direct connections between graph nodes, recurrently multiplying it by itself, much like the graph convolution operation, propagates and gathers the connections of farther neighbors along with the direct connections. After multiplying |$\mathcal{A}$| by itself |$l$| times, the features of the neighbors that can reach the center node in |$l$| steps are aggregated. As each row of |${\mathcal{A}}^l\mathcal{X}$| corresponds to one 3HG, the |${l}^{th}$| hop features of 3HG |$i$| are thereby given by |${[{\mathcal{A}}^l\mathcal{X}]}_{i,\ast }$|.
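As a toy illustration of the |${l}^{th}$| hop features |${[{\mathcal{A}}^l\mathcal{X}]}_{i,\ast }$|, the following numpy sketch builds a hypothetical 4-node GyralNet with 5 toy ROIs (the paper uses 75 Destrieux ROIs) and computes the 2-hop features:

```python
import numpy as np

# Hypothetical 4-node GyralNet with 5 toy ROIs.
edges = [(0, 1), (1, 2), (2, 3)]            # unweighted gyral crest edges
roi_label = [0, 2, 2, 4]                    # ROI index of each 3HG

A = np.zeros((4, 4), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1                   # undirected adjacency matrix

X = np.eye(5, dtype=int)[roi_label]         # one-hot feature matrix, (N, n_roi)

l = 2
hop_l = np.linalg.matrix_power(A, l) @ X    # row i: l-th hop features of 3HG i
```

Row |$i$| of `hop_l` collects, per ROI, the counts contributed by neighbors reachable in |$l$| steps, exactly as described above.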

|$l$| hop features (multi-hop features): Based on the definition of the |${l}^{th}$| hop features, we further define the |$l$| hop features of 3HG |$i$| as follows (Fig. 1b3):

$${F}_{MH}^i=\big[{[\mathcal{X}]}_{i,\ast };\ {[\mathcal{A}\mathcal{X}]}_{i,\ast };\ \cdots;\ {[{\mathcal{A}}^l\mathcal{X}]}_{i,\ast }\big]\in{R}^{(l+1)\times 75}\qquad (1)$$

where the |${0}^{th}$| hop feature |${[\mathcal{X}]}_{i,\ast }={e}_k$|⁠, which indicates that 3HG |$i$| is in the ROI |$k$|⁠, and the |${l}^{th}$| hop feature |${[{\mathcal{A}}^l\mathcal{X}]}_{i,\ast }={\sum}_{k=1}^{75}{a}_{lk}^i{e}_k$|⁠, with the multi-hop coefficient |${a}_{lk}^i$| indicating the number of different |$l$|-step paths that are available from 3HG |$i$| to the ROI |$k$|⁠. If there is no |$l$|-step path between them, |${a}_{lk}^i=0$|⁠, otherwise, |${a}_{lk}^i$| will be a positive integer. In this work, we set |$l>1$|⁠; hence, the |$l$| hop features cover multiple hops of the 3HG graph and are also called multi-hop features. By organizing the multi-hop features in this manner, the hierarchical multi-hop relationships between 3HGs with all the 75 ROIs are encoded into |${F}_{MH}$|⁠.
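The stacking in (1) can be sketched as follows; the function name and array layout are our own illustration, not the authors' code:

```python
import numpy as np

# Multi-hop feature sketch: stack the hop-0 .. hop-l feature rows for every
# 3HG, so that F[i] is the (l+1, n_roi) multi-hop feature matrix of 3HG i.
def multi_hop_features(A, X, l):
    """A: (N, N) adjacency; X: (N, n_roi) one-hot features -> (N, l+1, n_roi)."""
    hops, Al = [X], np.eye(A.shape[0], dtype=A.dtype)
    for _ in range(l):
        Al = Al @ A                      # A^h gathers h-step neighbors
        hops.append(Al @ X)
    return np.stack(hops, axis=1)
```

For a 3HG |$i$|, `multi_hop_features(A, X, l)[i]` is the |$(l+1)\times 75$| matrix fed to the embedding stage.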

Learning-based Embedding Framework

Our learning-based embedding framework (Fig. 1c) is designed in a self-supervised manner: it includes a two-stage encoding that hierarchically maps the input multi-hop features to a latent representation, and a two-stage decoding that hierarchically reconstructs the original input from the latent representation. The embedding learning process can be formulated as (2):

$${E}_{MH}^i=\sigma\big({F}_{MH}^i\,{W}^{Embedding}\big),\qquad {E}_F^i=\sigma\big({({W}^{Fusion})}^T{E}_{MH}^i\big)\qquad (2)$$

where |$\sigma$| is the non-linear activation function and |${F}_{MH}^i\in{R}^{(l+1)\times 75}$| is the multi-hop feature of 3HG |$i$| defined by (1); |${W}^{Embedding}=\{{w}_1;{w}_2;\cdots; {w}_{75}\}\in{R}^{75\times d}$| is the learnable embedding matrix. In our setting there are 75 ROIs in total, so we initialized a learnable embedding vector |$w\in{R}^d$| for each ROI and organized them in the same order as the one-hot encoding in (1) to form |${W}^{Embedding}$|. The input multi-hop features are embedded via |${W}^{Embedding}$| (hop by hop) to generate the multi-hop embeddings |${E}_{MH}^i\in{R}^{(l+1)\times d}$|. To further fuse the multi-hop embeddings into a single embedding vector that contains the complete multi-hop information, we conducted the second encoding with learnable combination parameters |${W}^{Fusion}\in{R}^{(l+1)\times 1}$| to integrate the multi-hop embeddings into one embedding vector |${E}_F^i\in{R}^{1\times d}$|. In (2), the transpose of |${W}^{Fusion}$|, i.e. |${({W}^{Fusion})}^T$|, was used for matrix multiplication along the multi-hop dimension (the rows of |${E}_{MH}^i$|).
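A minimal numpy sketch of this two-stage encoding; the activation |$\sigma$| is not specified in this excerpt, so tanh is an assumption here, as are the variable names:

```python
import numpy as np

# Two-stage encoding sketch: hop-wise ROI embedding, then fusion across hops.
def encode(F_MH_i, W_emb, W_fus, sigma=np.tanh):
    """F_MH_i: (l+1, 75); W_emb: (75, d); W_fus: (l+1, 1) -> E_F: (1, d)."""
    E_MH = sigma(F_MH_i @ W_emb)         # hop-wise embeddings, (l+1, d)
    E_F = sigma(W_fus.T @ E_MH)          # fuse hops into one vector, (1, d)
    return E_MH, E_F
```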

We used a symmetric design (as in a traditional autoencoder) for the two-stage decoding, with parameters |${W}^{D_1}\in{R}^{1\times (l+1)}$| and |${W}^{D_2}\in{R}^{d\times 75}$|, respectively. The first decoding reconstructs the hierarchical multi-hop embeddings from the combined embedding vector, which ensures that the combined embedding vector |${E}_F^i$| has captured the complete information needed to restore the embeddings of each hop. Then, from the restored multi-hop embeddings |$\widetilde{E}_{MH}^{i}$|, the second decoding recovers the multi-hop input features |$\widetilde{F}_{MH}^{i}$|. We adopted the MSE loss to evaluate the two-stage decoding, and the objective function is defined by (3):

$$\mathcal{L}=\alpha\,\big\Vert \widetilde{E}_{MH}^i-{E}_{MH}^i\big\Vert_2^2+\beta\,\big\Vert \widetilde{F}_{MH}^i-{F}_{MH}^i\big\Vert_2^2\qquad (3)$$

where |$\alpha$| and |$\beta$| are hyper-parameters that control the contributions of the two decoding stages. The entire model was trained in a self-supervised manner, which avoids introducing any bias from a supervised term when representing cortical folding as embedding vectors. Through this two-stage encoding–decoding process, the intrinsic patterns buried in the complex and variable cortical folding can be effectively instilled into the embedding vectors.
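The objective can be sketched as a weighted sum of two MSE terms (names and defaults are illustrative):

```python
import numpy as np

# Two-stage reconstruction loss sketch: alpha weights the embedding-level
# reconstruction error, beta the feature-level reconstruction error.
def two_stage_loss(E_MH, E_MH_rec, F_MH, F_MH_rec, alpha=1.0, beta=1.0):
    mse = lambda a, b: float(np.mean((a - b) ** 2))
    return alpha * mse(E_MH_rec, E_MH) + beta * mse(F_MH_rec, F_MH)
```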

Evaluation of the Embedding Effectiveness

In this work, there are two kinds of embeddings: the ROI embeddings learned at the population level, |${W}^{\mathrm{Embedding}}$|, and the individual 3HG embeddings, |${E}_F^i$|. The ROI embeddings are learned by recovering the 3HGs' multi-hop features at the population level and can serve as the basic elements to represent each 3HG. Therefore, they should have the capability to characterize the regularity of the anatomical patterns of 3HGs shared by the population. We adopted the strength of 3HGs' multi-hop connections to describe the 3HGs' connection patterns between ROI pairs at different hop levels, which is defined as:

Strength of 3HG’s multi-hop connection (ground truth matrix): If 3HG |$i$| is in ROI |$k$| and 3HG |$j$| is in ROI |$m$|, and they are connected in the GyralNet via a shortest path of |$l$| steps, then ROI |$k$| and ROI |$m$| have one |$l$|-hop 3HG connection. In this work, we added up the connections between each pair of ROIs across the whole population and used the resulting matrix as the ground truth (Fig. 2a).
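One way to compute this ground-truth strength matrix for a single subject is a breadth-first search over the 3HG graph; this sketch (our own, with hypothetical inputs) counts 3HG pairs at shortest-path distance |$l$| per ROI pair:

```python
import numpy as np
from collections import deque

# BFS gives the shortest-path hop distance between 3HGs; each pair at
# distance l adds one l-hop connection between their ROIs. In the paper these
# per-subject matrices are summed over the whole population.
def roi_hop_strength(adj_list, roi_label, l, n_roi=75):
    S = np.zeros((n_roi, n_roi), dtype=int)
    for src in range(len(adj_list)):
        dist, q = {src: 0}, deque([src])
        while q:                          # BFS from src
            u = q.popleft()
            for v in adj_list[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for v, d in dist.items():
            if d == l:
                S[roi_label[src], roi_label[v]] += 1
    return S
```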

Evaluation of the learned ROI embeddings using strength of 3HG’s multi-hop connections (defined in Section 2.5). (a) The strength of 3HG’s multi-hop connections calculated by the whole population with 1064 subjects; (b) the cosine similarity of each pair of the ROI embedding vectors learned by different multi-hop features (1-hop features, 2-hop features, and 3-hop features, respectively). For the matrices in both (a) and (b), the order of the brain regions is the same as the order defined in Destrieux atlas (Fischl et al. 1999), where most of the first 44 regions are gyri while the last 31 regions are sulci.
Figure 2


In the ground truth matrix, if two ROIs have a larger number of |$l$|-hop 3HG connections, the two ROIs are closely related in the GyralNet at the |$l$|-hop level. Thus, their |$l$|-hop embedding vectors should capture this close relationship by possessing higher similarity in the latent space.

The individual 3HG embedding |${E}_F^i$| is used to represent a unique 3HG. As defined in (1) and (2), each 3HG embedding is a specific combination of the ROI embeddings |${W}^{\mathrm{Embedding}}$| via a set of 3HG-specific coefficients |$\{{a}_{lk}^i\}$| and the learned fusion parameters |${W}^{\mathrm{Fusion}}$|. An effective embedding vector |${E}_F^i$| is expected to preserve the individuality of different 3HGs and provide reliable cross-subject 3HG anatomical correspondence. To evaluate this capability, we applied the generated 3HG embeddings to the anatomical correspondence task to infer the complicated many-to-many cross-subject anatomical correspondences of 3HGs.

Results

We applied the proposed multi-hop feature encoding method (Section 2.3) and the learning-based embedding framework (Section 2.4) to the identified 3HGs (Section 2.2). By training the model end-to-end on a self-supervised task, we learned a set of ROI embeddings, |${W}^{\mathrm{Embedding}}$|. The effectiveness of |${W}^{\mathrm{Embedding}}$| was evaluated by the strength of 3HG's multi-hop connections (Section 2.5). We then generalized the learned ROI embeddings and the well-trained model to a new dataset and generated an individual embedding vector for each 3HG. The effectiveness of the generated individual embedding vectors was evaluated in the anatomical correspondence task to infer the complicated cross-subject anatomical correspondences of 3HGs. The Results section is organized as follows: Section 3.1 introduces the experimental setting; Section 3.2 evaluates the effectiveness of the learned ROI embeddings; Sections 3.3 and 3.4 show the inference results of the 3HGs' anatomical correspondence task; Section 3.5 assesses the regression performance of the proposed two-stage decoding framework.

Experimental Setting

Data Setting. We randomly divided the 1064 subjects from the HCP dataset into two datasets. Dataset-1 was used to train the model and learn the embeddings. The well-trained model and the learned ROI embeddings were then applied to dataset-2 to infer the 3HGs' anatomical correspondences. Dataset-1 contains 564 subjects, in which 186,915 3HGs were identified. Dataset-2 contains 500 subjects, in which 169,923 3HGs were identified. Each 3HG is treated as a data sample.

Model Setting. For the multi-hop features, we generated 1-hop features (|$l=1$| in (1) and (2)), 2-hop features (|$l=2$|), and 3-hop features (|$l=3$|). For each kind of feature, we trained the model to learn the corresponding ROI embeddings. In our experiments, the learnable ROI embeddings were initialized as an identity matrix to ensure that the initial distances between any two embedding vectors are the same. We adopted an embedding dimension of |$d=128$|. The fusion operation |${W}^{\mathrm{Fusion}}$| and the two decoder operations |${W}^{D_1}$| and |${W}^{D_2}$| were implemented by fully connected layers, and their parameters were initialized following the Xavier scheme. The entire model was trained end-to-end with the Adam optimizer, using a learning rate of 0.001, weight decay of 0.01, and momentum rates (0.9, 0.999).

Effectiveness of ROI Embeddings

In the experiments, we used dataset-1 to generate different multi-hop features, including 1-hop, 2-hop, and 3-hop features, to train the model and learn the ROI embeddings. We evaluated the learned ROI embeddings via the strength of 3HG's multi-hop connections (Section 2.5) and display the results in Fig. 2. Figure 2(a) shows the statistical results of the strength of 3HG's multi-hop connections based on the whole population of 1064 subjects. Figure 2(b) shows the cosine similarity between the learned ROI embedding vectors. For each matrix, the order of the brain regions is the same as the order defined in the Destrieux atlas (Fischl et al. 1999), where most of the first 44 regions are gyri and the last 31 regions are sulci. To ease the analysis, we divided all the 3HG multi-hop connections into 3 groups: gyri-gyri connections, gyri-sulci connections, and sulci-sulci connections. A gyri-gyri connection means that both connected 3HGs are in gyral regions; gyri-sulci and sulci-sulci connections are defined in the same manner. In the matrices of Fig. 2, the gyri-gyri connections are located at the top left, the gyri-sulci connections at the top right and bottom left, and the sulci-sulci connections at the bottom right. It is worth noting that the ground truth matrix (Fig. 2(a)) reflects the strength of the actual anatomical relationship between two ROIs on the cortical surface, e.g. the number of GyralNet edges between the two ROIs, while the embedding similarity matrix is based on the cosine similarity between the embedding vectors of the two ROIs, which represents their relationship in the latent (embedding) space. Therefore, if the two matrices have similar patterns, we can conclude that our ROI embeddings effectively represent the anatomical ROIs, including their relations, in the latent space. In Fig. 2, when only considering 1-hop features (the first column), the ground truth matrix shows strong gyri-gyri connections, weak gyri-sulci connections, and almost no sulci-sulci connections. Our embeddings effectively capture this pattern, though some weak connections in the ground truth matrix are missing due to the sparsity of the input 1-hop features. When using 2-hop and 3-hop features (the second and third columns), the ground truth matrices show stronger connections in some regions (highlighted by red squares), and the same pattern can also be found in our embedding similarity matrices. Beyond the visualization, we also adopted three measures, the structural similarity index measure (SSIM), the Pearson correlation coefficient (PCC), and cosine similarity (CS), to quantitatively assess the similarity between the ground truth matrices and our embedding similarity matrices. The results are reported in Table 1. For all three measures, the learned embedding similarity matrices show a highly consistent pattern with the ground truth matrices, especially for the 2-hop embedding, whose SSIM is over 0.8. These results demonstrate that our learned embeddings can effectively encode the population-level common connection patterns in the data: if two ROIs have strong/weak connections on the GyralNet (cortical space), they will also have large/small similarities in the latent (embedding) space.
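The embedding similarity matrices in Fig. 2(b) are pairwise cosine similarities between the learned ROI embedding vectors; a minimal sketch of that computation, assuming the rows of the embedding matrix are the ROI vectors:

```python
import numpy as np

# Pairwise cosine similarity between ROI embedding vectors (rows of W).
def cosine_similarity_matrix(W):
    """W: (n_roi, d) embedding matrix -> (n_roi, n_roi) similarity matrix."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    Wn = W / np.clip(norms, 1e-12, None)   # guard against zero rows
    return Wn @ Wn.T
```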

Table 1

Similarity between ground truth matrix and embedding similarity matrix.

Measures   1-hop   2-hop   3-hop
SSIM       0.61    0.81    0.28
PCC        0.48    0.48    0.43
CS         0.47    0.49    0.43

In addition to evaluating the learned embeddings by the overall pattern, we also assessed them at a finer scale by selecting the top-9 connections (pairs of ROIs) in the three ground truth matrices and the three embedding similarity matrices in Fig. 2. The results are shown in Fig. 3. Most of the top-9 connections in the ground truth matrices can be found in the embedding similarity matrices (1 missed in the hop-1 embedding, 2 missed in the hop-2 embedding, and 3 missed in the hop-3 embedding). In addition, we found that all the missing connections (top-9 connections in the ground truth but not in the embedding similarity matrices) and redundant connections (top-9 connections in the embedding similarity matrices but not in the ground truth) can be found among the top-40 connections of both the embedding similarity matrices and the ground truth matrices. In general, the learned embedding vectors can reliably represent 3HGs and their connections in a latent space, which can be used to construct their similarities and correspondences across individuals.

Top-9 connections (pairs of ROIs) of the three ground truth matrices and three embedding similarity matrices in Fig. 2. In each subfigure, dotted frames of the same color represent the same connections. The involved region labels are listed below the subfigure and marked by dots of the same color as the dotted frames. Within each block, the connections are ranked from top to bottom and left to right. Connections without a dotted frame are found only in either the ground truth matrices or the embedding similarity matrices, and their labels are marked in black.
Figure 3


Effectiveness of Individual 3HG Embeddings

After the model was trained on dataset-1, we applied the learned ROI embeddings and the model to the new dataset-2 to generate individual 3HG embeddings, $E_F^i$ (defined in (2)). According to the discussion in Section 3.2, the 2-hop features provide the best embedding performance; hence, we adopted the 2-hop embedding in this section. As discussed in Section 2.5, the individual 3HG embeddings are expected to preserve the individuality of 3HGs and to infer reliable cross-subject anatomical correspondences. Therefore, in this section we evaluate the effectiveness of the individual embeddings from these two aspects.

Inferring Reliable Cross-Subject Anatomical Correspondences of 3HGs

To evaluate the effectiveness of the 3HG individual embeddings for inferring cross-subject anatomical correspondence, we used the 3HGs from one randomly selected subject (sub-0 in Fig. 4) as exemplars and searched for their corresponding 3HGs in other subjects via the learned embedding vector $E_F^i$. For each exemplar 3HG in sub-0, the correspondence inference proceeds as follows: (i) we examined all the 3HGs in the other subjects and calculated the cosine similarity between the embedding vector of the exemplar 3HG and that of each other 3HG; (ii) for each subject, the 3HGs with a cosine similarity of 1.0 to the exemplar 3HG were identified as the corresponding 3HGs in that subject; (iii) if no 3HG had a cosine similarity of 1.0, the one with the largest cosine similarity (above a threshold) was identified as the corresponding 3HG. Following these steps, we obtained the corresponding 3HGs of each exemplar 3HG in different subjects. In sub-0, 190 and 175 3HGs were identified in the left and right hemispheres, respectively, by the 3HG identification pipeline (Section 2.2). For better visualization, we selected 60 3HGs on each hemisphere, spreading over the whole cerebral cortex, and show the corresponding 3HGs on 10 randomly selected subjects in Fig. 4. Bubbles indicate the locations of 3HGs, and corresponding 3HGs in different subjects are color-coded by correspondence index. The results show that the corresponding 3HGs identified in different individuals have consistent locations in terms of common anatomical landscapes: for example, 3HG #157 in the left hemisphere and 3HG #96 in the right hemisphere (marked by red arrows) are found in the middle of the left superior frontal gyrus and the middle of the right precentral gyrus, respectively, across all subjects.
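Steps (i)-(iii) can be sketched as follows; the embedding dimension and threshold value here are illustrative, not the paper's actual settings:

```python
import numpy as np

def find_correspondence(exemplar, subject_embeddings, threshold=0.8):
    """Return indices of a subject's 3HGs corresponding to an exemplar 3HG."""
    e = exemplar / np.linalg.norm(exemplar)
    E = subject_embeddings / np.linalg.norm(subject_embeddings, axis=1, keepdims=True)
    sims = E @ e                                   # (i) cosine similarity to every 3HG
    exact = np.flatnonzero(np.isclose(sims, 1.0))  # (ii) all 3HGs with similarity 1.0
    if exact.size > 0:
        return exact
    best = int(np.argmax(sims))                    # (iii) otherwise the best match,
    return np.array([best]) if sims[best] >= threshold else np.array([], dtype=int)

exemplar = np.array([1.0, 0.0])
subject = np.array([[0.0, 1.0], [0.8, 0.6]])
match = find_correspondence(exemplar, subject)  # matches 3HG 1 (cosine 0.8 >= threshold)
```
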

Cross-subject correspondences of 3HGs. The 3HGs of a randomly selected HCP subject (sub-0) were used as exemplar 3HGs. To find the corresponding 3HGs in other subjects, the following pipeline was adopted: (i) we examined all the 3HGs in the other subjects and calculated the cosine similarity between the embedding vector of the exemplar 3HG and that of each other 3HG; (ii) for each subject, the 3HGs with a cosine similarity of 1.0 to the exemplar 3HG were identified as the corresponding 3HGs in that subject; (iii) if no 3HG had a cosine similarity of 1.0, the one with the largest cosine similarity (above a threshold) was identified as the corresponding 3HG. In sub-0, 190 and 175 3HGs were identified in the left and right hemispheres, respectively, by the 3HG identification pipeline (Section 2.2). For better visualization, we selected 60 3HGs on each hemisphere, spreading over the whole cerebral cortex, and show the corresponding 3HGs on 10 randomly selected subjects (sub-1 to sub-10). Corresponding 3HGs in different subjects are shown in the same color. Two 3HGs (#157 and #96) and their correspondences are highlighted as examples.
Figure 4


Preserving Cross-subject Individuality

As an essential characteristic of the human cerebral cortex, folding patterns exhibit substantial variability across subjects. To illustrate that the learned 3HG individual embeddings can preserve the individuality of different subjects, we randomly selected three 3HGs as exemplars and searched for their corresponding 3HGs in different subjects. Within each subject, all the 3HGs with a cosine similarity above 0.9 to the exemplar 3HG were identified as its correspondences. We randomly selected 12 subjects and show the results in Fig. 5. The cerebral cortices of the 12 subjects display distinct cortical folding patterns. For example, the first exemplar 3HG is located at the conjunction of the precentral gyrus and the middle frontal gyrus, with no other 3HGs in its neighborhood. However, the folding pattern of the fifth subject (highlighted in yellow) is more convoluted; as a result, multiple 3HGs aggregate at the same location. For the 11th subject (highlighted in pink), there is no conjunction connecting the precentral gyrus and the middle frontal gyrus, and therefore no 3HG at this location. More examples are marked for exemplars 2 and 3. Despite the widespread individuality of cortical folding patterns, our embedding method provides a reliable way to identify these complex many-to-many correspondences without mapping different individual brains to the same space, and thus the variability of 3HGs is preserved. Notably, the embeddings and model used in this correspondence task were trained on a different dataset (dataset-1) via a self-supervised regression task, yet they generalize well to the new dataset-2, showing promising transferability. Therefore, the proposed framework offers an effective way to design practical pre-training paradigms and facilitate downstream tasks in brain anatomy studies.

Cross-subject individuality and variability. We used three 3HGs (three blocks: top, middle, and bottom) as examples to show their correspondences in different subjects. We randomly selected 12 subjects and adopted a cosine similarity of 0.9 as the threshold. The corresponding 3HGs are represented by bubbles, with the cosine similarity encoded by bubble color. Owing to individuality, zero, one, or multiple correspondences may be found across different subjects.
Figure 5


Commonality of 3HG Correspondences

In this section, we studied the commonality of 3HGs. For each 3HG, commonality is defined as the percentage of subjects in which corresponding 3HGs can be found. A higher commonality means the corresponding 3HGs exist in a large proportion of the population; conversely, higher individuality (lower commonality) means the 3HG can be found in only a small number of individuals. We used all 169,923 3HGs in dataset-2 (the testing dataset) to calculate commonality, as follows: (i) Taking the 169,923 3HG embeddings as input, we applied agglomerative hierarchical clustering (Murtagh and Contreras 2012). Agglomerative hierarchical clustering is a bottom-up approach that starts with many small clusters (initially, each data point is its own cluster) and gradually merges the closest ones into larger clusters, until the shortest distance between clusters exceeds a predefined threshold. We took cosine similarity as the metric for the distance between data points and complete linkage as the linkage criterion, which defines the distance between two clusters as the longest distance between any two of their points. Complete linkage ensures that the distance between any two data points in the same cluster is smaller than the predefined threshold. (ii) After clustering, the 3HGs in the same cluster are treated as corresponding 3HGs in different individuals. (iii) For each cluster, representing a distinct 3HG, we counted the subjects in which correspondences can be found, yielding the percentage of subjects (relative to the entire testing dataset) that have this specific 3HG.
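Steps (i)-(ii) can be sketched with SciPy's hierarchical clustering; this is a minimal illustration of complete linkage with cosine distance, not the authors' actual implementation:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_3hgs(embeddings, cos_sim_threshold=0.9):
    """Complete-linkage agglomerative clustering. Cutting the dendrogram at
    distance 1 - cos_sim_threshold guarantees that every pair of 3HGs in the
    same cluster has cosine similarity above the threshold."""
    d = pdist(embeddings, metric="cosine")   # condensed pairwise cosine distances
    Z = linkage(d, method="complete")        # merge clusters by farthest-pair distance
    return fcluster(Z, t=1.0 - cos_sim_threshold, criterion="distance")
```
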
Note that the derived commonality may be affected by the cosine similarity threshold; therefore, we evaluated different cosine similarity values as the predefined distance threshold for clustering and report the results in Table 2, which lists the number of common 3HGs (clusters) found under different thresholds and commonality levels. For example, with a cosine similarity threshold of 0.85, 49 common 3HGs (clusters) can be identified in 90% of the subjects in the testing dataset. The results show that as the cosine similarity threshold decreases from 1.0 to 0.9, the number of 3HGs at all commonality levels (40% to 90%) increases. As the threshold decreases further (from 0.9 to 0.85), the number of common 3HGs remains relatively stable, suggesting that below a threshold of 0.9, newly identified 3HGs tend to join the existing clusters.
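Given the cluster labels and the subject each 3HG belongs to, the per-cluster commonality described in step (iii) above reduces to a simple fraction; the toy inputs below are illustrative:

```python
import numpy as np

def commonality(cluster_labels, subject_ids):
    """For each cluster (a putative common 3HG), compute the fraction of
    subjects in which at least one member 3HG appears."""
    n_subjects = len(set(subject_ids))
    out = {}
    for c in set(cluster_labels):
        members = {s for s, l in zip(subject_ids, cluster_labels) if l == c}
        out[c] = len(members) / n_subjects
    return out

# toy example: 6 3HGs from 3 subjects grouped into 2 clusters
labels   = [1, 1, 1, 2, 2, 1]
subjects = [0, 1, 2, 0, 1, 2]
com = commonality(labels, subjects)
# cluster 1 appears in all 3 subjects (commonality 1.0); cluster 2 in 2 of 3
```
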

Table 2

Commonality of 3HGs.

Com. (≥)   Cosine similarity
           0.85    0.875   0.90    0.95    0.975   1.0
90%        49      44      43      23      11      1
80%        68      61      61      49      34      2
70%        82      79      80      68      47      3
60%        100     98      101     92      65      5
50%        118     123     130     119     96      6
40%        144     155     164     161     138     10

Com. (Commonality) = percentage of the subjects that have the corresponding 3HGs


We further visualized the distribution of all 3HGs, as well as of the common 3HGs, across brain regions (Fig. 6). The distribution of all 3HGs was calculated by summing the number of 3HGs in each region across the population and dividing by the number of individuals (Fig. 6(a)). For the common 3HGs, we present the results at five commonality levels, ranging from 50% to 90%, with cosine similarity equal to 0.9 (Fig. 6(b)-(f)), corresponding to the results highlighted by red borders in Table 2. The distribution of common 3HGs at each commonality level was quantified by the number of common 3HGs in each brain region. To visualize the distributions, we color-coded the 3HG counts and mapped them back to the cortical surface in Fig. 6. As shown in Fig. 6(a), 3HGs are found in all gyral regions but are more abundant in certain ones, such as G_front_sup (mean = 21), G_front_middle (mean = 10), G_orbital (mean = 8), and G_pariet_inf-Angular (mean = 7). The same pattern is found in the distributions of common 3HGs. Taking commonality ≥ 50% as an example, although the 130 common 3HGs spread over all gyral regions, their number per region varies greatly: G_pariet_inf-Angular, for example, has eight common 3HGs, whereas G_occipital_sup has only one.

Distribution of 3HGs across brain regions. (a) Distribution of all 3HGs across brain regions. It was calculated by adding the number of 3HGs across populations for each region and then dividing by the number of individuals. (b-f) Distribution of common 3HGs under five commonality levels. The distribution under each commonality level was quantified by the number of common 3HGs in each brain region.
Figure 6


Regression Performance

The proposed framework was trained through a self-supervised regression task with a hierarchical two-stage decoder. The first stage (Stage-1) reconstructs the hierarchical multi-hop embeddings from the combined embedding vector, whereas the second stage (Stage-2) recovers the multi-hop features from the hierarchical multi-hop embeddings. We used four metrics to evaluate the regression performance of the two decoding stages: mean absolute error (MAE) and mean squared error (MSE) for magnitude, and the structural similarity index measure (SSIM) and cosine similarity (CS) for overall pattern. In addition, we evaluated the regression performance of both the multi-hop embeddings/features and the embedding/feature vector of each single hop. The results were calculated on the 169,923 3HGs in dataset-2 (the independent testing dataset) and are reported in Table 3. The reconstructed embeddings/features of the first/second decoding stage have low MAE and MSE (<0.1/0.15) and high SSIM and CS (>0.9/0.6), indicating that the two-stage decoding framework performs well on the regression task. Furthermore, Stage-1 performs slightly better than Stage-2, with lower MAE and MSE and higher SSIM and CS. This may be because the ground truths in Stage-2 are highly sparse matrices (the multi-hop features created by one-hot encoding), whereas the ground truths in Stage-1 are dense embedding vectors. We compare one-hot encoding and learnable embedding vectors further in the Discussion.
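Three of the four metrics can be sketched in a few lines of NumPy; MAE and MSE capture magnitude error while cosine similarity captures overall pattern (SSIM, also used above, is available as skimage.metrics.structural_similarity):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, CS) between a ground truth and a reconstruction."""
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))          # mean absolute error
    mse = float(np.mean(err ** 2))             # mean squared error
    cs = float(y_true.ravel() @ y_pred.ravel()
               / (np.linalg.norm(y_true) * np.linalg.norm(y_pred)))  # cosine similarity
    return mae, mse, cs

# a perfect reconstruction gives MAE = MSE = 0 and CS = 1
mae, mse, cs = regression_metrics(np.ones(4), np.ones(4))
```
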

Table 3

Regression performance of 2-hop embedding.

All values are mean ± std.

Decoding stage   Hop       MAE ↓           MSE ↓           SSIM ↑          CS ↑
Stage-1          0th hop   0.007 ± 0.004   0.001 ± 0.001   0.910 ± 0.067   0.900 ± 0.135
                 1st hop   0.014 ± 0.007   0.004 ± 0.003   0.941 ± 0.034   0.993 ± 0.011
                 2nd hop   0.043 ± 0.020   0.038 ± 0.018   0.969 ± 0.010   0.999 ± 0.001
                 All       0.021 ± 0.009   0.014 ± 0.007   0.972 ± 0.008   0.997 ± 0.003
Stage-2          0th hop   0.022 ± 0.005   0.005 ± 0.004   0.622 ± 0.063   0.792 ± 0.231
                 1st hop   0.033 ± 0.012   0.018 ± 0.017   0.645 ± 0.082   0.934 ± 0.104
                 2nd hop   0.079 ± 0.031   0.150 ± 0.163   0.632 ± 0.105   0.961 ± 0.077
                 All       0.045 ± 0.014   0.058 ± 0.060   0.669 ± 0.085   0.956 ± 0.080

Stage-1: reconstruction of the hierarchical multi-hop embeddings $\widetilde{E}_{MH}^{i}$.

Stage-2: reconstruction of the input multi-hop features $\widetilde{F}_{MH}^{i}$ (see eq. (1) and Fig. 1).

l-th hop: considers that single hop only; All: covers the 0th, 1st, and 2nd hops together.

↓: a smaller value indicates better performance; ↑: a larger value indicates better performance.


Discussion

Self-supervised embedding. A common problem for deep learning models is limited labeled data: large architectures usually demand hundreds of millions of labeled samples, which are often publicly inaccessible. This is especially true in neuroimaging, where creating labeled data can be time-consuming and expensive, and in some scenarios impossible. In natural language processing (NLP), this appetite for data has been successfully addressed by self-supervised pretraining, which enables the training of generalizable NLP frameworks with up to hundreds of billions of parameters, such as BERT (Devlin et al. 2018) and GPT (Radford et al. 2018, 2019; Brown et al. 2020). Inspired by these successes, we adopted an autoencoder architecture, a simple self-supervised method, to design our learning-based embedding framework. In our experiments, the proposed framework generalizes well to new datasets and shows promising transferability to downstream tasks.

Disentangling commonality and individuality. Our embedding framework was trained to indirectly encode brain anatomy using folding-pattern-derived landmarks, the 3HGs. Unlike NLP, in which words are well defined and a vocabulary can easily be built, there is no pre-existing "vocabulary" for representing the cerebral cortex, since each brain has unique folding patterns. To solve this problem, we designed a new embedding strategy that disentangles the commonality and individuality of the 3HGs: instead of embedding each 3HG itself, we embedded the ROIs (from a brain atlas) into a set of ROI embedding vectors, which serve as basic blocks representing commonality, and then used these ROI embedding vectors to distill individuality.
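The strategy can be illustrated schematically: a shared ROI embedding matrix encodes commonality, while each 3HG's multi-hot, multi-hop ROI coefficients encode individuality. The atlas size, embedding width, and random vectors below are hypothetical stand-ins for the learned quantities:

```python
import numpy as np

n_rois, dim = 75, 16                                  # hypothetical atlas size / width
rng = np.random.default_rng(0)
roi_embeddings = rng.standard_normal((n_rois, dim))   # shared "vocabulary" (commonality)

# A 3HG's k-hop feature is a multi-hot vector over the ROIs reached in k hops;
# these coefficients are individual-specific (individuality).
hop0 = np.zeros(n_rois); hop0[12] = 1                 # the ROI the 3HG sits in
hop1 = np.zeros(n_rois); hop1[[3, 12, 40]] = 1        # ROIs of its 1-hop neighbors

# The individual embedding combines shared ROI vectors via individual coefficients.
e0 = hop0 @ roi_embeddings
e1 = hop1 @ roi_embeddings
embedding = np.concatenate([e0, e1])
```
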

Integrating multi-modal data. In this work, we limited our scope to the folding patterns of 3HGs and focused on the effectiveness of the proposed method for anatomical correspondences; we did not include white matter structure in our study. Although it has been widely reported that brain folding patterns are closely related to brain structural connectivity patterns (Van Essen 1997; Thompson et al. 2004; Hilgetag and Barbas 2006; Fischl et al. 2008; Honey et al. 2010; Xu et al. 2010; Nie et al. 2012; Budde and Annese 2013; Reveley et al. 2015), there is still no consensus about the relationship between them. For example, some studies suggest that tension along axons pulls the cortex closer together and forms the gyri (Van Essen 1997; Hilgetag and Barbas 2006), while other works report that gyri are connected by axons of greater density than sulci at different scales (Xu et al. 2010; Nie et al. 2012; Budde and Annese 2013). Still other studies suggest that a superficial axonal system at the border of white matter and gray matter could impede the detection of axonal connections, especially in sulcal regions (Reveley et al. 2015). In addition, disease-related alterations of white matter structure make the situation even more complicated (Stewart et al. 1975; Landrieu et al. 1998; Sallet et al. 2003; Hardan et al. 2004; Harris et al. 2004; Thompson et al. 2004; Tortori-Donati et al. 2005; Nordahl et al. 2007; Pang et al. 2008; Bullmore and Sporns 2009; Barkovich 2010; Honey et al. 2010; Manzini and Walsh 2011; Wang et al. 2019; Wang et al. 2020; Zhang et al. 2019, 2020a; Yu et al. 2021).
However, it is undeniable that white matter plays an important role in the formation of folding patterns, and we intend to include it in future studies by: (i) investigating whether corresponding 3HGs with similar anatomical characteristics also have similar fiber connection patterns; (ii) identifying groups of 3HGs with similar anatomical features and fiber connection patterns and investigating their functional homogeneity/heterogeneity; and (iii) incorporating fiber connection patterns and function into the current framework to establish a more comprehensive 3HG map with strong functional homogeneity and intrinsically established cross-subject cortical correspondences. Based on such a map of 3HGs, we can then extend the research scope to larger areas and include more landmarks, such as 2-hinges.

One-hot encoding vs. learned embeddings. The ROI features of 3HGs were initially represented by one-hot encodings and used as input features to learn anatomically meaningful embeddings. Compared with the learned embeddings, one-hot vectors cannot be used directly to infer cross-subject correspondence, for two reasons. (i) One-hot vectors are anatomically meaningless and cannot provide reliable cross-subject 3HG correspondences. Each one-hot vector contains a single one and N-1 zeros, where N is the number of dimensions; as a result, ROIs are embedded in isolation and are equidistant from each other, making it impossible to represent the underlying relationships between ROIs. The similarity of two 3HGs under one-hot encoding therefore depends only on the number of common ROIs shared by their multi-hop features, while the underlying connections between ROIs are ignored, rendering it powerless for inferring 3HG correspondences. For example, under one-hot encoding, the anatomical similarity of any 3HG pair whose multi-hop features share no common ROIs is zero. In contrast, the learned embeddings effectively encode population-level common connection patterns between ROIs, with closely connected ROIs having similar embedding vectors, and thus provide more reliable anatomical correspondences. (ii) One-hot vectors are sparse and grow with vocabulary size, which can easily lead to the curse of dimensionality, whereas embeddings are dense and low-dimensional, making them more computationally efficient. In general, the learned embeddings surpass one-hot vectors in both anatomical correspondence reliability and computational efficiency.
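The contrast in point (i) can be made concrete: one-hot vectors of different ROIs are always orthogonal, whereas dense embeddings can place related ROIs close together. The embedding values below are illustrative, not learned:

```python
import numpy as np

def cos(a, b):
    """Cosine similarity of two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot: two different ROIs are always orthogonal (similarity 0), so any
# 3HG pair sharing no ROI gets similarity 0 regardless of anatomy.
onehot_a = np.array([1.0, 0.0, 0.0])
onehot_b = np.array([0.0, 1.0, 0.0])

# Learned embeddings: closely connected ROIs can receive similar vectors,
# so similarity can reflect underlying connection patterns.
emb_a = np.array([0.9, 0.1, 0.3])
emb_b = np.array([0.8, 0.2, 0.4])
```
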

Conclusion

In this work, we proposed a learning-based embedding framework that embeds the anatomically meaningful patterns of 3HGs into a group of learnable embedding vectors. Each 3HG can be represented as a hierarchical combination of the learned embedding vectors via a set of multi-hop combination coefficients. In this way, the regularity of folding patterns is encoded into the embedding vectors, and the variability is preserved by the individual-specific coefficients. We evaluated the proposed method on the HCP dataset, and the experimental results show that the learned embeddings can successfully encode cortical folding patterns and reliably infer the complex many-to-many cross-subject correspondences of 3HGs.

Funding

This work was supported by STARTs from UT system and partially supported by National Institutes of Health (R01AG075582, RF1NS128534) and National Science Foundation (IIS-2011369).

Conflict of interest statement: The authors declare that no commercial or financial relationships that could be construed as a potential conflict of interest existed during the research.

References

Barkovich
 
AJ
.
Current concepts of polymicrogyria
.
Neuroradiology
.
2010
:
52
:
479
487
.

Bertrand
 
G
.
On topological watersheds
.
Journal of Mathematical Imaging and Vision
.
2005
:
22
:
217
230
.

Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S.  

Language models are few-shot learners
.
Advances in neural information processing systems
.
2020
:
33
:1877–1901.

Budde
 
MD
,
Annese
 
J
.
Quantification of anisotropy and fiber orientation in human brain histological sections
.
Front Integr Neurosci
.
2013
:
7
:
3
.

Bullmore
 
E
,
Sporns
 
O
.
Complex brain networks: graph theoretical analysis of structural and functional systems
.
Nat Rev Neurosci
.
2009
:
10
:
186
198
.

Chen
 
H
,
Li
 
Y
,
Ge
 
F
,
Li
 
G
,
Shen
 
D
,
Liu
 
T
.
Gyral net: A new representation of cortical folding organization
.
Med Image Anal
.
2017
:
42
:
14
25
.

Chen
,
H.
,
Perozzi
,
B.
,
Hu
,
Y.
, &
Skiena
,
S.
(
2018
). Harp: Hierarchical representation learning for networks.
In Proceedings of the AAAI Conference on Artificial Intelligence
.
32
.

Devlin
 
J
,
Chang
 
M-W
,
Lee
 
K
,
Toutanova
 
K
.
Bert: Pre-training of deep bidirectional transformers for language understanding
.
2018
:
arXiv preprint arXiv:1810.04805
.

Dubois
 
J
,
Benders
 
M
,
Borradori-Tolsa
 
C
,
Cachia
 
A
,
Lazeyras
 
F
,
HaVinh Leuchter
 
R
,
Sizonenko
 
S
,
Warfield
 
S
,
Mangin
 
J
,
Hu ̈ppi
 
PS
.
Primary cortical folding in the human newborn: an early marker of later functional development
.
Brain
.
2008
:
131
:
2028
2041
.

Fischl
 
B
.
Freesurfer
.
NeuroImage
.
2012
:
62
:
774
781
.

Fischl
 
B
,
Sereno
 
MI
,
Dale
 
AM
.
Cortical surface-based analysis: Ii: inflation, flattening, and a surface-based coordinate system
.
NeuroImage
.
1999
:
9
:
195
207
.

Fischl
 
B
,
Rajendran
 
N
,
Busa
 
E
,
Augustinack
 
J
,
Hinds
 
O
,
Yeo
 
BT
,
Mohlberg
 
H
,
Amunts
 
K
,
Zilles
 
K
.
Cortical folding patterns and predicting cytoarchitecture
.
Cereb Cortex
.
2008
:
18
:
1973
1980
.

Ge
 
F
,
Li
 
X
,
Razavi
 
MJ
,
Chen
 
H
,
Zhang
 
T
,
Zhang
 
S
,
Guo
 
L
,
Hu
 
X
,
Wang
 
X
,
Liu
 
T
.
Denser growing fiber connections induce 3-hinge gyral folding
.
Cereb Cortex
.
2018
:
28
:
1064
1075
.

Giedd
 
JN
,
Rapoport
 
JL
.
Structural mri of pediatric brain development: what have we learned and where are we going?
 
Neuron
.
2010
:
67
:
728
734
.

Grover
,
A.
, &
Leskovec
,
J.
(
2016
). node2vec: Scalable feature learning for networks. In
Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining
.
855
864
.

Hardan
 
AY
,
Jou
 
RJ
,
Keshavan
 
MS
,
Varma
 
R
,
Minshew
 
NJ
.
Increased frontal cortical folding in autism: a preliminary mri study
.
Psychiatry Res Neuroimaging
.
2004
:
131
:
263
268
.

Harris
 
JM
,
Whalley
 
H
,
Yates
 
S
,
Miller
 
P
,
Johnstone
 
EC
,
Lawrie
 
SM
.
Abnormal cortical folding in high-risk individuals: a predictor of the development of schizophrenia?
 
Biol Psychiatry
.
2004
:
56
:
182
189
.

Hilgetag
 
CC
,
Barbas
 
H
.
Role of mechanical factors in the morphology of the primate cerebral cortex
.
PLoS Comput Biol
.
2006
:
2
:e22.

Holland
 
MA
,
Miller
 
KE
,
Kuhl
 
E
.
Emerging brain morphologies from axonal elongation
.
Ann Biomed Eng
.
2015
:
43
:
1640
1653
.

Honey
 
CJ
,
Thivierge
 
J-P
,
Sporns
 
O
.
Can structure predict function in the human brain?
 
NeuroImage
.
2010
:
52
:
766
776
.

Landrieu
 
P
,
Husson
 
B
,
Pariente
 
D
,
Lacroix
 
C
.
Mrineuropathological correlations in type 1 lissencephaly
.
Neuroradiology
.
1998
:
40
:
173
176
.

Li
 
K
,
Guo
 
L
,
Li
 
G
,
Nie
 
J
,
Faraco
 
C
,
Cui
 
G
,
Zhao
 
Q
,
Miller
 
LS
,
Liu
 
T
.
Gyral folding pattern analysis via surface profiling
.
NeuroImage
.
2010
:
52
:
1202
1214
.

Li
 
G
,
Wang
 
L
,
Shi
 
F
,
Lyall
 
AE
,
Lin
 
W
,
Gilmore
 
JH
,
Shen
 
D
.
Mapping longitudinal development of local cortical gyrification in infants from birth to 2 years of age
.
J Neurosci
.
2014
:
34
:
4228
4238
.

Li G, Wang L, Shi F, Lyall AE, Lin W, Gilmore Li X, Chen H, Zhang T, Yu X, Jiang X, Li K, Li L, Razavi MJ, Wang X, Hu X et al.  

Commonly preserved and species-specific gyral folding patterns across primate brains
.
Brain Struct Funct
.
2017
:
222
:
2127
2141
.

Manzini MC, Walsh CA. What disorders of cortical development tell us about the cortex: one plus one does not always make two. Curr Opin Genet Dev. 2011:21:333–339.

Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. 2013: arXiv preprint arXiv:1301.3781.

Murtagh F, Contreras P. Algorithms for hierarchical clustering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2012:2:86–97.

Narayanan A, Chandramohan M, Venkatesan R, Chen L, Liu Y, Jaiswal S. graph2vec: learning distributed representations of graphs. 2017: arXiv preprint arXiv:1707.05005.

Nie J, Guo L, Li K, Wang Y, Chen G, Li L, Chen H, Deng F, Jiang X, Zhang T, et al. Axonal fiber terminations concentrate on gyri. Cereb Cortex. 2012:22:2831–2839.

Nordahl CW, Dierker D, Mostafavi I, Schumann CM, Rivera SM, Amaral DG, Van Essen DC. Cortical folding abnormalities in autism revealed by surface-based morphometry. J Neurosci. 2007:27:11725–11735.

Pang T, Atefy R, Sheen V. Malformations of cortical development. Neurologist. 2008:14:181.

Pennington J, Socher R, Manning CD. GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014. pp. 1532–1543.

Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2014. pp. 701–710.

Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L. Deep contextualized word representations. 2018: arXiv preprint arXiv:1802.05365.

Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. 2018.

Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog. 2019:1:9.

Reveley C, Seth AK, Pierpaoli C, Silva AC, Yu D, Saunders RC, Leopold DA, Ye FQ. Superficial white matter fiber systems impede detection of long-range cortical connections in diffusion MR tractography. Proc Natl Acad Sci. 2015:112:E2820–E2828.

Roth G, Dicke U. Evolution of the brain and intelligence. Trends Cogn Sci. 2005:9:250–257.

Sallet PC, Elkis H, Alves TM, Oliveira JR, Sassi E, de Castro CC, Busatto GF, Gattaz WF. Reduced cortical folding in schizophrenia: an MRI morphometric study. Am J Psychiatr. 2003:160:1606–1613.

Stewart RM, Richman DP, Caviness VS. Lissencephaly and pachygyria. Acta Neuropathol. 1975:31:1–12.

Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web; 2015. pp. 1067–1077.

Thompson PM, Hayashi KM, Sowell ER, Gogtay N, Giedd JN, Rapoport JL, De Zubicaray GI, Janke AL, Rose SE, Semple J, et al. Mapping cortical change in Alzheimer's disease, brain development, and schizophrenia. NeuroImage. 2004:23:S2–S18.

Tortori-Donati P, Rossi A, Biancheri R. Brain malformations. In: Pediatric neuroradiology. Springer, Berlin, Heidelberg; 2005. pp. 71–198.

Van Essen DC. A tension-based theory of morphogenesis and compact wiring in the central nervous system. Nature. 1997:385:313–318.

Wang D, Cui P, Zhu W. Structural deep network embedding. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. pp. 1225–1234.

Wang L, Zhang L, Zhu D. Accessing latent connectome of mild cognitive impairment via discriminant structure learning. In: 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). IEEE; 2019. pp. 164–168.

Wang L, Zhang L, Zhu D. Learning latent structure over deep fusion model of mild cognitive impairment. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE; 2020. pp. 1039–1043.

Xu G, Knutsen AK, Dikranian K, Kroenke CD, Bayly PV, Taber LA. Axons pull on the brain, but tension does not drive cortical folding; 2010.

Yu X, Scheel N, Zhang L, Zhu DC, Zhang R, Zhu D. Free water in T2 FLAIR white matter hyperintensity lesions. Alzheimers Dement. 2021:17:e057398.

Zhang L, Zaman A, Wang L, Yan J, Zhu D. A cascaded multi-modality analysis in mild cognitive impairment. In: International Workshop on Machine Learning in Medical Imaging. Springer; 2019. pp. 557–565.

Zhang L, Wang L, Zhu D. Jointly analyzing Alzheimer's disease related structure-function using deep cross-model attention network. In: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE; 2020a. pp. 563–567.

Zhang L, Wang L, Zhu D. Recovering brain structural connectivity from functional connectivity via multi-GCN based generative adversarial network. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2020b. pp. 53–61.

Zhang T, Huang Y, Zhao L, He Z, Jiang X, Guo L, Hu X, Liu T. Identifying cross-individual correspondences of 3-hinge gyri. Med Image Anal. 2020c:63:101700.

Zhang T, Li X, Jiang X, Ge F, Zhang S, Zhao L, Liu H, Huang Y, Wang X, Yang J, et al. Cortical 3-hinges could serve as hubs in cortico-cortical connective network. Brain Imaging Behav. 2020d:14(6):2512–2529.

Zhang L, Wang L, Gao J, Risacher SL, Yan J, Li G, Liu T, Zhu D, Alzheimer's Disease Neuroimaging Initiative, et al. Deep fusion of brain structure-function in mild cognitive impairment. Med Image Anal. 2021:72:102082.

Zhang L, Wang L, Zhu D, Alzheimer's Disease Neuroimaging Initiative, et al. Predicting brain structural network using functional connectivity. Med Image Anal. 2022:79:102463.

Zilles K, Armstrong E, Schleicher A, Kretschmann H-J. The human pattern of gyrification in the cerebral cortex. Anat Embryol. 1988:179:173–179.
