Abstract

Community detection, which aims at finding groups of nodes with similar characteristics, has attracted tremendous interest in network analysis. Various methods have been developed to detect homogeneous communities in multi-layer networks, where inter-layer dependence is a widely acknowledged but severely under-investigated issue. In this paper, we propose a novel stochastic block Ising model (SBIM) that incorporates the inter-layer dependence to help with community detection in multi-layer networks. The community structure is modeled by the stochastic block model (SBM), and the inter-layer dependence is incorporated via the popular Ising model. Furthermore, we develop an efficient variational EM algorithm to tackle the resultant optimization task and establish the asymptotic consistency of the proposed method. Extensive simulated examples and a real example on gene co-expression multi-layer network data are also provided to demonstrate the advantages of the proposed method.

1. Introduction

Community is one of the most widely observed structures in network data, where objects in the same community tend to share some common characteristics and similar connectivity. In a typical gene interaction network, genes sharing similar functions or participating in a common cellular process may form gene communities. For instance, BRCA1, BRCA2, and ATM are widely known DNA damage response (DDR) genes, as they are deeply implicated in the function of repairing damaged DNA (Lang et al., 2019); epidermal growth factor receptor (EGFR), rearranged during transfection (RET), and mesenchymal epithelial transition factor (MET) have similar functions in non-small cell lung cancer (NSCLC), as EGFR T790M, RET rearrangement, and MET exon 14 skipping are well-known NSCLC driver mutations. In the past two decades, detecting community structures has attracted tremendous interest in the literature of network analysis (Lei & Rinaldo, 2015; Newman, 2006; Zhang et al., 2022).

Although most existing community detection methods focus on single-layer networks (Holland et al., 1983; Newman & Girvan, 2004; Newman, 2006; Rohe et al., 2011), multi-layer networks are frequently encountered in biomedical data; examples include the gene co-expression networks of rhesus monkeys at different developmental stages (Lei et al., 2020) and the brain fMRI imaging networks from schizophrenia and normal subjects (Yuan & Qu, 2021). Yet, methods for multi-layer networks are still in their infancy, and most of the existing methods are extended from those for single-layer networks by assuming the layers are independent of each other. Examples include spectral clustering on the mean of adjacency matrices (Kumar et al., 2011; Paul & Chen, 2020; Tang et al., 2009), the maximum likelihood estimation method (Han et al., 2015), least-square estimation methods (LSE; Lei et al., 2020), and the restricted multi-layer SBM model (Paul & Chen, 2016). Although success has been widely reported for discovering communities in multi-layer networks, these methods largely ignore the dependence structure among different layers, which may lead to sub-optimal performance in community detection.

Inter-layer dependence is a widely acknowledged but severely under-investigated issue in multi-layer network data. For example, some time-varying biological network data (Lei et al., 2020; Zhang et al., 2020) exhibit strong inter-layer dependence among different layers, especially adjacent layers. Spatial correlation may also exist in multi-layer networks constructed from different brain subregions (Lei et al., 2020). Modeling inter-layer dependence in multi-layer networks is analogous to modeling the covariance structure in dependent data (Bester et al., 2011; Yuan & Qu, 2021), which shall lead to a more appropriate network model and benefit the analysis of multi-layer biological network data.

One of the key challenges for the multi-layer network model is that incorporating inter-layer dependence may substantially increase model complexity as well as computational expenses (Panagiotelis et al., 2012; Sklar, 1959). Only a few attempts have been made in the literature to characterize the inter-layer dependence. For instance, linked matrix factorization (LMF; Tang et al., 2009) approximates each layer by matrix factorization with a layer-specific factor and a common factor across all layers. Co-regularized spectral clustering (COR; Kumar et al., 2011) uses a regularization term to force the eigenvectors of each layer to align with some common matrix. However, these methods are mostly algorithm-based, and characterizing the inter-layer dependence structure with just a common matrix is rather ad hoc and lacks theoretical foundation.

In this paper, we propose a novel SBIM model to incorporate the inter-layer dependence for community detection in multi-layer networks. It models the homogeneous community structure with the SBM, and employs the popular Ising model (Cheng et al., 2014; Ravikumar et al., 2010) to characterize the inter-layer dependence. The resultant pseudo-likelihood function is then optimized via a variational expectation maximization (EM) algorithm that maximizes a separable lower bound of the objective function. We establish the asymptotic estimation consistency of the SBIM model, and compare its numerical performance against some popular multi-layer community detection methods on extensive simulated examples and a real application to gene co-expression multi-layer network data.

The rest of the paper is organized as follows. Section 2 introduces the proposed SBIM model. Section 3 develops an efficient computing algorithm based on a variational EM scheme to tackle the resultant optimization task. Section 4 establishes the asymptotic estimation consistency of the SBIM model. In Section 5, we examine the performance of the proposed method on various simulated networks, compare it against some popular competitors in the literature, and provide a real application to gene co-expression multi-layer network data to further demonstrate the strength of the proposed SBIM model. A brief summary is provided in Section 6. Supporting information is also provided for computing details and all technical proofs.

2. Stochastic Block Ising Model

Consider a multi-layer network formula consisting of n common nodes, whose adjacency matrices are denoted as formula, where formula denotes the mth layer, and formula if there is an edge between nodes i and j in the mth layer and formula otherwise. Suppose that these n nodes possess a homogeneous community structure with K communities across all layers, and the community membership matrix is denoted as formula with formula, where formula if node i belongs to the kth community and 0 otherwise. Further, each column of formula contains at least one 1, indicating there is no empty community in the multi-layer network.
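As a concrete illustration of these objects, the sketch below builds a one-hot membership matrix and a stack of symmetric binary adjacency layers. All names and sizes (n = 6, K = 2, M = 3) are hypothetical and for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, M = 6, 2, 3  # hypothetical sizes for illustration

# One-hot membership matrix: row i has a single 1 in the column of i's community.
labels = rng.integers(0, K, size=n)
Z = np.zeros((n, K), dtype=int)
Z[np.arange(n), labels] = 1

# M symmetric binary layers with empty diagonals (no self-loops).
A = np.triu(rng.integers(0, 2, size=(M, n, n)), k=1)
A = A + A.transpose(0, 2, 1)

assert (Z.sum(axis=1) == 1).all()        # each node is in exactly one community
assert (A == A.transpose(0, 2, 1)).all() # every layer is symmetric
```

In real data the membership matrix is of course unknown; it is the quantity the methods below aim to recover.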

To model the joint likelihood of formula, we turn to model formula and formula separately. First, each row of formula is independently modeled with a multinomial distribution

1

where formula with formula for each formula and formula. Therefore, each row of formula has exactly one 1, corresponding to a non-overlapping community structure. Thus, the log-likelihood of formula can be written as

Next, we propose the SBIM model to characterize the joint distribution of formula given formula. Specifically, the formula's are mutually independent given formula, and the log-likelihood of formula given formula becomes

2

where formula is an order-4 tensor with formula being a symmetric matrix, formula, and formula is the partition function ensuring formula to be a proper probability density function.

The proposed SBIM model has a number of compelling features. On the one hand, formula depends on nodes i and j only through their community memberships, echoing the popular SBM model (Holland et al., 1983). On the other hand, formula can efficiently capture the dependence among the formula's for each node pair formula, due to the properties of the Ising model (Cheng et al., 2014). Given formula, formula provides a concise quantification of the conditional dependence between formula and formula given formula, where formula. More specifically, formula and formula are independent given all other formula's if and only if formula, and the strength of their conditional dependence changes monotonically with the value of formula. Furthermore, if formula only when formula, the proposed SBIM model reduces to the case with no inter-layer dependence (Han et al., 2015).

Let formula, then the joint log-likelihood of formula is

It can be difficult to directly optimize formula due to the intractable partition function formula in formula. A natural treatment is to approximate formula with a pseudo-likelihood function (Besag, 1974; Ravikumar et al., 2010)

where the second equality derives from the binary logistic regression model of formula over formula. Then, the joint pseudo log-likelihood of formula becomes

3

where formula. The pseudo log-likelihood of formula is then formula.
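Since the pseudo-likelihood replaces each conditional distribution by a binary logistic regression of one layer's edge indicator on the remaining layers, it can be evaluated without the partition function. A minimal sketch for a single node pair, where `theta` (main effects) and `J` (symmetric interaction matrix with zero diagonal) are hypothetical stand-ins for the model's lost symbols:

```python
import numpy as np

def edge_pseudo_loglik(a, theta, J):
    """Pseudo log-likelihood of one edge variable across M layers.

    a     : (M,) binary vector, the edge's indicator in each layer
    theta : (M,) main effects
    J     : (M, M) symmetric interaction matrix with zero diagonal

    Each conditional P(a_m | a_{-m}) is a logistic regression of a_m
    on the remaining layers, avoiding the intractable partition function.
    """
    field = theta + J @ a                 # diag(J) = 0, so a_m drops out of its own field
    p = 1.0 / (1.0 + np.exp(-field))      # P(a_m = 1 | a_{-m})
    return np.sum(a * np.log(p) + (1 - a) * np.log(1 - p))
```

With all parameters zero, each conditional probability is 1/2 and the pseudo log-likelihood is M log(1/2), which gives a quick sanity check on the implementation.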

3. Variational EM

Maximizing formula yields the maximum pseudo-likelihood estimate formula. However, note that formula involves the summation over formula possible values of formula, and thus the traditional EM algorithm (Dempster et al., 1977) cannot be directly applied to optimize formula. As a remedy, we exploit a variational EM approach (Mariadassou et al., 2010), which can greatly facilitate computation by considering a lower bound of formula and restricting the distribution of formula to a factorized form.
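The key fact behind this scheme, that the evidence lower bound E_q[log p(A, Z)] + H(q) never exceeds log p(A) and is tight exactly at the posterior, can be checked numerically on a toy discrete model. All quantities below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy joint p(A, z) for a fixed observation A and latent z in {0, 1, 2}.
p_joint = rng.dirichlet(np.ones(3)) * 0.4    # entries of p(A, z); they sum to p(A) = 0.4
log_evidence = np.log(p_joint.sum())

# Any variational distribution q(z); here a random point on the simplex.
q = rng.dirichlet(np.ones(3))
elbo = np.sum(q * np.log(p_joint)) - np.sum(q * np.log(q))  # E_q[log p] + H(q)
assert elbo <= log_evidence + 1e-12

# The bound is tight exactly at q(z) = p(z | A).
q_star = p_joint / p_joint.sum()
elbo_star = np.sum(q_star * np.log(p_joint)) - np.sum(q_star * np.log(q_star))
assert np.isclose(elbo_star, log_evidence)
```

Restricting q to a factorized form, as done below, trades some tightness of this bound for tractable coordinate-wise updates.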

To proceed, we consider a lower bound of formula that

4

where formula denotes the Kullback–Leibler (KL) divergence, formula, and formula can be any distribution of formula to approximate formula. Particularly, we set formula to be a completely factorized distribution that

where formula denotes the multinomial distribution with parameters formula satisfying formula for each i. Then, formula and formula for any formula. Denote formula as formula with formula for simplicity, then it follows from the definition of the KL divergence that

where formula is the entropy of formula, and

To optimize formula, we update formula and formula alternately until convergence. Specifically, given formula at the tth step, we solve the following two sub-optimization tasks separately:

5
6

In (Equation 5), taking the partial derivative with respect to formula yields that for each formula

Further, note that

With step size γ0, formula is then updated as

The update of formula in (Equation 6) can be solved via the method of Lagrange multipliers. Let

where formula are the Lagrange multipliers. Then, formula is the solution of the following fixed-point problem:

7

The computational details about the derivative of formula and formula are provided in the Supporting information.
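While the exact form of the fixed point in (7) is model-specific, updates of this kind share a common shape: each row of formula is set proportional to an exponentiated score and renormalized onto the probability simplex, and the map is iterated to convergence. A hedged sketch with a generic, user-supplied score function, not the paper's actual Equation (7):

```python
import numpy as np

def fixed_point_tau(tau, score, pi, tol=1e-8, max_iter=500):
    """Generic fixed-point iteration for a row-stochastic tau.

    tau   : (n, K) initial variational probabilities, rows on the simplex
    score : callable tau -> (n, K) log-scores (in SBIM, derived from the
            pseudo-likelihood; here a hypothetical stand-in)
    pi    : (K,) prior community proportions
    """
    for _ in range(max_iter):
        logits = np.log(pi) + score(tau)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        new = np.exp(logits)
        new /= new.sum(axis=1, keepdims=True)         # project rows onto the simplex
        if np.abs(new - tau).max() < tol:
            return new
        tau = new
    return tau
```

When the score does not depend on tau, the iteration converges after one step to a row-wise softmax, which makes the skeleton easy to verify.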

It is important to remark that the variational EM algorithm is guaranteed to converge to a local maximum (Mariadassou et al., 2010). Denote the variational estimate as formula with formula. Then, for each formula, formula can be regarded as the estimated membership of node i. To prevent convergence to a degenerate optimum, it is helpful to warm start the algorithm with a proper initializer. A number of initialization strategies have been developed for the variational EM algorithm, such as the LSE (Lei et al., 2020) and the layer-wise maximum likelihood estimate (MLE; Han et al., 2015). In our numerical experiments, we use the estimates from MLE to initialize formula in the alternating updating algorithm, which has satisfactory performance. Furthermore, to initialize formula, we simply set all of its elements to 0, as little prior knowledge about formula is available.

4. Theory

In this section, we establish some theoretical results to quantify the asymptotic behavior of the proposed method in terms of estimation of the true parameter formula and the posterior distribution of formula.

Let formula with formula indicating whether node i belongs to true community k. Let formula, and we assume the limits of formula exist for each formula, denoted as formula. Then, the multi-layer network formula is generated from the joint density function formula. The estimation error of formula and formula is measured by

where σ can be any permutation on [K], and formula for any formula with formula. Then, formula is a variant of KL divergence. Lemma 1 shows that the pseudo log-likelihood formula attains its maximum at formula, suggesting that it is an appropriate surrogate loss to facilitate estimation of formula and implying that formula. This leads to an interesting analogy to the popular Fisher consistency in the machine learning literature (Lin, 2004).

Lemma 1. For any formula, if formula, then we have

8

for any formula and formula, and the equality holds if and only if formula.

The following technical assumptions are made. We will write formula if formula and formula if formula.

Assumption 1. For any formula and formula, formula for some formula and there exists a positive sequence formula such that formula.

Assumption 2. For any formula, there exists an formula such that formula for any formula.

Assumption 1 assumes a lower bound for formula. It also admits a sparse network structure as well as a flexible dependence structure among different layers by allowing a diverging upper bound on formula. Assumption 2 guarantees that the community structure is distinguishable, in that there exists some formula whose rows are sufficiently different from one another.

Theorem 1. Suppose Assumptions 1 and 2 hold. If formula and formula for some formula, then for any formula, it holds true that

9

Theorem 1 establishes the asymptotic estimation consistency of formula. Note that formula is allowed to diverge at a fast rate, indicating strong dependence among different layers. In fact, Theorem 1 still holds true for unbalanced communities, provided that formula does not tend to 0 too fast. The technical proofs of this theorem and all other theorems are given in the Supporting information.

Next, the asymptotic consistency of the variational estimate formula can also be established.

Theorem 2. Suppose all assumptions in Theorem 1 hold. For any formula, it holds true that

Theorem 2 assures that the estimate from the variational EM algorithm is also asymptotically consistent. This theoretical result is particularly interesting, as the global optimizer of the joint pseudo-likelihood (3) is difficult, if not impossible, to obtain. The variational estimate, on the other hand, provides a computationally feasible and theoretically justifiable estimate of the proposed SBIM model.

We now turn to establish some consistency results of the proposed method in terms of community detection under additional assumptions.

Assumption 3. For any formula, if formula, then

for any formula. Further, there exist formula, such that formula is not permutation invariant.

Note that it follows from Lemma 1 that formula is always positive if formula and formula. Thus, Assumption 3 further assures that it cannot vanish too fast.

Theorem 3. Suppose all assumptions in Theorem 1 and Assumption 3 hold. Let formula, then we have formula. Furthermore, let formula subject to formula and formula, then it also holds true that formula as formula.

Theorem 3 shows that the pseudo posterior distribution formula is consistent in the sense that its maximizer exactly recovers the true community structure as defined by formula. Even though formula is computationally intractable, Theorem 3 further assures that it can be well approximated by formula when formula is given. In the challenging case with unknown formula, it would be necessary to quantify the behavior of formula as in Assumption 3 to obtain similar consistency results as in Theorem 3 under additional assumptions.

5. Numerical Experiments

In this section, we examine the numerical performance of the proposed SBIM model on both synthetic and real networks, and compare it against five popular competitors in the literature, including spectral clustering on the mean adjacency matrix (SCM; Paul & Chen, 2020), LSE of multi-layer block models (Lei et al., 2020), LMF (Paul & Chen, 2020), co-regularized spectral clustering (COR; Paul & Chen, 2020), and layer-wise MLE of SBM (Han et al., 2015). The implementations of LSE, LMF, and COR are available at the authors' personal websites, whereas we implement SCM and MLE ourselves.

The performance of each method is evaluated by the community detection error (Wang, 2010), which is defined as

10

where formula is the indicator function, formula and formula denote the true and estimated community labels, respectively. Clearly, formula measures the disagreement between formula and formula, and is invariant to the permutation of community labels.
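For small K, the permutation-invariant error in (10) can be computed exactly by enumerating all label permutations; a minimal sketch:

```python
import numpy as np
from itertools import permutations

def detection_error(true_labels, est_labels, K):
    """Community detection error: the smallest fraction of disagreeing
    nodes over all relabelings (permutations) of the estimated labels."""
    true_labels = np.asarray(true_labels)
    est_labels = np.asarray(est_labels)
    best = 1.0
    for perm in permutations(range(K)):
        relabeled = np.array([perm[c] for c in est_labels])
        best = min(best, np.mean(relabeled != true_labels))
    return best
```

For instance, `detection_error([0, 0, 1, 1], [1, 1, 0, 0], 2)` returns 0.0, since the estimate is perfect up to a label swap, while `detection_error([0, 0, 1, 1], [0, 1, 0, 1], 2)` returns 0.5. For large K, the enumeration would be replaced by a Hungarian-type matching, but the brute-force version suffices here.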

5.1 Synthetic Examples

We consider three synthetic networks concerning various aspects of the competing methods.

Example 1. In this example, we vary the network size, the layer number, and the dependence strength. We set formula, formula, formula and formula. Then, formula for formula, where formula is the formula identity matrix and formula is the symmetric hollow matrix with formula for formula, and formula for formula. Let formula and formula, where formula can measure the strength of the dependence among different layers. Following the treatment in Ravikumar et al. (2010) and Cheng et al. (2014), Gibbs sampling is exploited to generate formula. Before formula is generated, we add a disturbance formula on formula to slightly blur the community structure, where formula is a symmetric matrix with formula for formula. The averaged community detection errors of the competing methods over 50 independent replications, as well as their standard errors are summarized in Table 1.
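The Gibbs step used to generate one edge's dependent indicators across layers can be sketched as follows; `theta` and `J` are illustrative stand-ins for the Ising parameters of Example 1, not the paper's exact values.

```python
import numpy as np

def gibbs_edge(theta, J, n_sweeps=200, rng=None):
    """Gibbs sampler for one edge's binary indicators across M layers
    under an Ising model: P(a_m = 1 | a_{-m}) = sigmoid(theta_m + (J a)_m).
    J is symmetric with zero diagonal."""
    rng = np.random.default_rng(rng)
    M = len(theta)
    a = rng.integers(0, 2, size=M)          # random binary initialization
    for _ in range(n_sweeps):
        for m in range(M):
            field = theta[m] + J[m] @ a     # diag(J) = 0 excludes a[m] itself
            p = 1.0 / (1.0 + np.exp(-field))
            a[m] = rng.random() < p
    return a
```

Sampling every edge's vector of indicators this way produces layers whose dependence strength is governed by the off-diagonal entries of `J`.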

It is evident from Table 1 that SBIM outperforms the other five methods in Example 1. The performances of SCM, LMF, and COR appear less satisfactory, whereas MLE and LSE are more competitive but still substantially worse than SBIM. Almost all methods perform better as the network size grows, owing to the fact that more information is available as the number of nodes increases. It is worth noting that, compared to the cases of formula, the performances of SBIM, MLE, and LSE in the cases of formula are worse. One possible reason is that the dependence relationship is difficult for LSE to fully capture when the number of layers increases, leading to less accurate initializers for SBIM and MLE. It is also interesting to note that all methods perform worse when μ gets larger, that is, when the dependence becomes stronger. Furthermore, the difference between SBIM and MLE is smaller when μ is small than when μ is large, which demonstrates the importance of incorporating the dependence structure.

Example 2. In this example, we test the performances of the competing methods on networks with unbalanced communities. The data-generating scheme is similar to Example 1, except that formula will vary and other parameters are fixed. We set formula, formula, formula and formula for formula, where formula is the formula diagonal matrix with formula independently for formula, and formula is the symmetric hollow matrix with formula independently for formula and formula for formula. Let formula with formula controlling the unbalance degree. The averaged community detection errors of the competing methods over 50 independent replications, as well as their standard errors are summarized in Table 2.

It is clear from Table 2 that SBIM outperforms the other five competitors in all cases. The community detection errors of almost all methods decrease as α1 increases, that is, as the communities become more balanced. It is also interesting to note that SCM, LMF, and COR are less sensitive to α1 than the other three methods.

Example 3. In this example, we test the performances of the competing methods on sparse networks. The data-generating scheme is the same as in Example 2 except that we set formula and formula for formula, where formula is the formula diagonal matrix with formula for formula. Let formula control the network sparsity. Then, the network densities, defined as the average of the elements in formula over 50 independent replications, are 0.11, 0.06, 0.03, and 0.01, respectively. Clearly, the networks become sparser as μ increases. The averaged community detection errors of the competing methods over 50 independent replications, as well as their standard errors, are summarized in Table 3.

TABLE 1

The averaged community detection errors of competing methods over 50 independent replications with their standard errors in Example 1.

| n | M | μ | SBIM | SCM | MLE | LSE | LMF | COR |
|---|---|---|------|-----|-----|-----|-----|-----|
| 200 | 5 | 1 | formula | 0.0874(0.0121) | 0.0064(0.0040) | 0.0103(0.0049) | 0.1718(0.0079) | 0.1777(0.0082) |
| 200 | 5 | 3 | formula | 0.1177(0.0149) | 0.0646(0.0121) | 0.0712(0.0125) | 0.1810(0.0107) | 0.1912(0.0110) |
| 200 | 5 | 5 | formula | 0.1444(0.0157) | 0.1155(0.0146) | 0.1251(0.0145) | 0.1836(0.0115) | 0.2010(0.0108) |
| 200 | 10 | 1 | formula | 0.0525(0.0114) | 0.0103(0.0052) | 0.0108(0.0055) | 0.1676(0.0083) | 0.1682(0.0090) |
| 200 | 10 | 3 | formula | 0.1141(0.0145) | 0.0969(0.0140) | 0.1102(0.0143) | 0.1674(0.0124) | 0.1934(0.0096) |
| 200 | 10 | 5 | formula | 0.1567(0.0151) | 0.1882(0.0125) | 0.2059(0.0103) | 0.1950(0.0107) | 0.2148(0.0107) |
| 400 | 5 | 1 | formula | 0.0528(0.0104) | 0.0004(0.0003) | 0.0013(0.0012) | 0.1697(0.0090) | 0.1621(0.0105) |
| 400 | 5 | 3 | formula | 0.0892(0.0140) | 0.0425(0.0109) | 0.0490(0.0113) | 0.1588(0.0129) | 0.1753(0.0118) |
| 400 | 5 | 5 | formula | 0.1163(0.0160) | 0.0766(0.0132) | 0.0782(0.0133) | 0.1790(0.0110) | 0.1744(0.0132) |
| 400 | 10 | 1 | formula | 0.0499(0.0118) | 0.0055(0.0038) | 0.0060(0.0038) | 0.1723(0.0087) | 0.1742(0.0103) |
| 400 | 10 | 3 | formula | 0.0828(0.0144) | 0.0456(0.0115) | 0.0546(0.0121) | 0.1395(0.0127) | 0.1687(0.1357) |
| 400 | 10 | 5 | formula | 0.1382(0.0154) | 0.1237(0.0150) | 0.1369(0.0146) | 0.1726(0.0114) | 0.1992(0.0132) |

Note: The best performer in each case is bold-faced.


TABLE 2

The averaged community detection errors of competing methods over 50 independent replications with their standard errors in Example 2.

| α1 | SBIM | SCM | MLE | LSE | LMF | COR |
|----|------|-----|-----|-----|-----|-----|
| 0.10 | formula | 0.2431(0.0125) | 0.0248(0.0098) | 0.0417(0.0124) | 0.3553(0.0069) | 0.3210(0.0095) |
| 0.12 | formula | 0.2570(0.0117) | 0.0095(0.0057) | 0.0142(0.0069) | 0.2971(0.0105) | 0.2824(0.0078) |
| 0.14 | formula | 0.2308(0.0133) | 0.0072(0.0049) | 0.0084(0.0051) | 0.2869(0.0071) | 0.2635(0.0071) |
| 0.16 | formula | 0.2374(0.0118) | 0.0002(0.0001) | 0.0007(0.0004) | 0.2513(0.0099) | 0.2367(0.0085) |

Note: The best performer in each case is bold-faced.


Table 3 shows that SBIM outperforms the competitors in all cases. The community detection errors of all methods increase as μ increases, that is, as the networks become sparser. Yet, SBIM still delivers strong numerical performance compared with all five competitors, among which MLE appears to perform better than the other four.

TABLE 3

The averaged community detection errors of competing methods over 50 independent replications with their standard errors in Example 3.

| μ | SBIM | SCM | MLE | LSE | LMF | COR |
|---|------|-----|-----|-----|-----|-----|
| 2 | formula | 0.2365(0.0129) | formula | formula | 0.2441(0.0069) | 0.3887(0.0033) |
| 3 | formula | 0.2627(0.0126) | 0.0084(0.0049) | 0.0091(0.0052) | 0.2705(0.0164) | 0.3886(0.0032) |
| 4 | formula | 0.3088(0.0116) | 0.0405(0.0085) | 0.0532(0.0104) | 0.3395(0.0111) | 0.3897(0.0032) |
| 5 | formula | 0.3638(0.0081) | 0.1279(0.0170) | 0.2120(0.0177) | 0.3939(0.0065) | 0.3900(0.0032) |

Note: The best performer in each case is bold-faced.


5.2 Real Application

We now apply the proposed SBIM model to analyze a gene co-expression dataset, which describes different developmental stages of the medial prefrontal cortex of rhesus monkeys and helps understand developmental brain disorders. The data were first analyzed in Bakken et al. (2016), and their community structure was also studied in Liu et al. (2018), Lei et al. (2020), and Lei and Lin (2022). We choose the six stages of the prenatal period, labeled E40 to E120 to indicate the number of embryonic days of age. Further, we focus on the neural projection guidance-enriched genes, which have been shown to be related to autism spectrum disorder by Liu et al. (2018). This results in a six-layer network with 171 nodes. The edges are constructed by thresholding the sample correlations among node pairs, a common practice in gene network pre-processing that helps retain the most likely biologically significant relationships (Bellec et al., 2017; Buchanan et al., 2020; Larremore et al., 2013; Perkins & Langston, 2009; Stuart et al., 2003). Specifically, let formula be the mth correlation matrix of the continuous gene co-expression networks with formula. Then, the binary adjacency matrix formula is constructed by setting formula for some formula. We set ξ as the 80th percentile of all formula's, and thus the sparsity of the resulting binary networks is 0.2.
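The thresholding step can be sketched with numpy. The function below is a hypothetical helper that mirrors, but does not reproduce, the paper's pre-processing: it pools absolute correlations over layers and cuts at the 80th percentile, so that roughly 20% of node pairs become edges.

```python
import numpy as np

def threshold_correlations(R_list, density=0.2):
    """Binarize correlation matrices with a common cutoff xi chosen as the
    (1 - density) quantile of |correlations| pooled over all layers."""
    n = R_list[0].shape[0]
    iu = np.triu_indices(n, k=1)                      # off-diagonal upper triangle
    pooled = np.concatenate([np.abs(R)[iu] for R in R_list])
    xi = np.quantile(pooled, 1 - density)
    A_list = []
    for R in R_list:
        A = (np.abs(R) > xi).astype(int)
        np.fill_diagonal(A, 0)                        # no self-loops
        A_list.append(A)
    return A_list, xi
```

The resulting overall edge density matches the target by construction of the quantile cutoff.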

Note that each layer in this six-layer network corresponds to a different developmental stage, so earlier layers have effects on later layers. It is therefore natural that different layers are correlated, especially adjacent ones. To validate this observation, we calculate the correlation coefficient between any two different layers by flattening each 171 × 171 adjacency matrix into a vector. As a comparison, we also generate six independent networks with 171 nodes and the same edge density, and calculate the correlation coefficient between any two of their layers. The results are plotted as heat maps in Figure 1.
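The layer-wise correlations behind Figure 1 can be computed by flattening each adjacency matrix into a vector and correlating the resulting vectors; a sketch:

```python
import numpy as np

def layer_correlations(A):
    """Pairwise Pearson correlations between layers of a multi-layer
    network, treating each n-by-n adjacency matrix as one long vector.
    A: (M, n, n) stack of adjacency matrices."""
    M = A.shape[0]
    flat = A.reshape(M, -1).astype(float)   # each row is one flattened layer
    return np.corrcoef(flat)                # (M, M) matrix fed to the heat map
```

Identical layers yield a correlation of 1 and complementary layers a correlation of -1, which gives a quick check of the computation.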

Figure 1
The heat maps of the layer-wise correlation coefficients for the gene co-expression multi-layer network (left) and for the randomly generated independent multi-layer network (right). This figure appears in color in the electronic version of this paper, and any mention of color refers to that version.

It is clear from Figure 1 that the layer-wise correlation coefficients for the gene co-expression multi-layer network are substantially larger than those for the randomly generated independent multi-layer network, indicating that different layers of the gene co-expression network are indeed correlated. More interestingly, the left panel of Figure 1 shows that the correlation between adjacent layers is also larger than that between non-adjacent layers, which supports our arguments.

Following the suggestion of Lei et al. (2020), we apply all competing methods to this gene co-expression multi-layer network data with formula. Note that the true community structure of this gene network is not available. Thus, we use the averaged internal density (Xu et al., 2022; Yang & Leskovec, 2012) to evaluate the performances of all methods. Specifically, we define

where formula is the estimated community assignment vector, formula is the size of the kth estimated community, formula contains all node pairs in the kth estimated community, and formula is the edge density, introduced as a baseline. The estimated averaged internal densities of all methods are reported in Table 4. The proposed SBIM model is the best performer, with the largest improvement being 8.5%, suggesting that the proposed framework is effective in incorporating the inter-layer dependence to improve the estimated community structure.
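The defining equation is rendered only as "formula" in this version, so the sketch below uses one common form of the metric (Yang & Leskovec, 2012): the internal edge density of each estimated community, averaged over communities, minus the overall edge density as a baseline. The function name and the single-layer simplification are ours; the paper's exact definition, which aggregates over all layers, may differ in normalization.

```python
import numpy as np

def averaged_internal_density(A, z):
    """Averaged internal density of partition z on adjacency matrix A,
    minus the overall edge density as a baseline. One common form of
    the metric; the paper's exact definition may differ."""
    n = A.shape[0]
    rho = A[np.triu_indices(n, k=1)].mean()  # overall edge density (baseline)
    densities = []
    for k in np.unique(z):
        idx = np.where(z == k)[0]
        nk = len(idx)
        if nk < 2:
            continue  # singleton communities have no internal pairs
        sub = A[np.ix_(idx, idx)]
        internal = sub[np.triu_indices(nk, k=1)].sum()
        densities.append(internal / (nk * (nk - 1) / 2))
    return np.mean(densities) - rho

# Toy check: two fully connected communities with no cross edges.
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
z = np.array([0, 0, 1, 1])
aid = averaged_internal_density(A, z)
```

For this block-diagonal toy network, both communities have internal density 1 while the baseline edge density is 1/3, so the score is 2/3.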

TABLE 4

The estimated averaged internal densities of competing methods on the gene co-expression multi-layer network.

SBIM    SC      MMLE    LSE     LMF     COR
0.2328  0.2510  0.2522  0.2412  0.2372

Note: The best performer is bold-faced.

6. Discussion

In this paper, we propose a novel SBIM model to detect communities in multi-layer networks with inter-layer dependence. The community structure shared across layers is modeled via the SBM, while the inter-layer dependence among layers is incorporated via the Ising model. We also develop an efficient variational EM algorithm to tackle the resultant optimization task and establish the asymptotic estimation consistency of the developed model. Its superior numerical performance is demonstrated on various synthetic networks and a real gene co-expression multi-layer network dataset. As a further extension, it would be desirable to establish theoretical results on the exact recovery of formula with the variational EM estimate in more realistic scenarios without assuming knowledge of the true parameters.

Data Availability Statement

The data that support the findings in this paper are available in the website at https://portland-my.sharepoint.com/:u:/g/personal/jingnzhan2-c_my_cityu_edu_hk/EXUx6ii_tOxGq0N6QnL8GtYBsEAXE4Fdnz4DU9w0hub2hQ?e=ybPXOe.

Acknowledgments

The authors are grateful to the reviewers, associate editor, and editor for their insightful comments and suggestions, which have improved the manuscript significantly. Jingnan Zhang's research is supported in part by the "USTC Research Funds of the Double First-Class Initiative" YD2040002020. Junhui Wang's research is supported in part by HK RGC grants GRF-11304520, GRF-11301521, and GRF-11311022, and CUHK Startup Grant 4937091.

References

Bakken, T.E., Miller, J.A., Ding, S.-L., Sunkin, S.M., Smith, K.A., Ng, L., Szafer, A., Dalley, R.A., Royall, J.J. & Lemon, T. (2016) A comprehensive transcriptional map of primate brain development. Nature, 535(7612), 367–375.

Bellec, P., Chu, C., Chouinard-Decorte, F., Benhajali, Y., Margulies, D.S. & Craddock, R.C. (2017) The Neuro Bureau ADHD-200 preprocessed repository. NeuroImage, 144, 275–286.

Besag, J. (1974) Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B (Methodological), 36(2), 192–225.

Bester, C.A., Conley, T.G. & Hansen, C.B. (2011) Inference with dependent data using cluster covariance estimators. Journal of Econometrics, 165(2), 137–151.

Buchanan, C.R., Bastin, M.E., Ritchie, S.J., Liewald, D.C., Madole, J.W., Tucker-Drob, E.M., Deary, I.J. & Cox, S.R. (2020) The effect of network thresholding and weighting on structural brain networks in the UK Biobank. NeuroImage, 211, 116443.

Cheng, J., Levina, E., Wang, P. & Zhu, J. (2014) A sparse Ising model with covariates. Biometrics, 70(4), 943–953.

Dempster, A.P., Laird, N.M. & Rubin, D.B. (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22.

Han, Q., Xu, K. & Airoldi, E. (2015) Consistent estimation of dynamic and multi-layer block models. In: International Conference on Machine Learning, PMLR, pp. 1511–1520.

Holland, P.W., Laskey, K.B. & Leinhardt, S. (1983) Stochastic blockmodels: first steps. Social Networks, 5(2), 109–137.

Kumar, A., Rai, P. & Daume, H. (2011) Co-regularized multi-view spectral clustering. Advances in Neural Information Processing Systems, 24, 1413–1421.

Lang, S.H., Swift, S.L., White, H., Misso, K., Kleijnen, J. & Quek, R.G. (2019) A systematic review of the prevalence of DNA damage response gene mutations in prostate cancer. International Journal of Oncology, 55(3), 597–616.

Larremore, D.B., Clauset, A. & Buckee, C.O. (2013) A network approach to analyzing highly recombinant malaria parasite genes. PLoS Computational Biology, 9(10), e1003268.

Lei, J. & Lin, K.Z. (2022) Bias-adjusted spectral clustering in multi-layer stochastic block models. Journal of the American Statistical Association, 1–13.

Lei, J., Chen, K. & Lynch, B. (2020) Consistent community detection in multi-layer network data. Biometrika, 107(1), 61–73.

Lei, J. & Rinaldo, A. (2015) Consistency of spectral clustering in stochastic block models. Annals of Statistics, 43(1), 215–237.

Lin, Y. (2004) A note on margin-based loss functions in classification. Statistics & Probability Letters, 68(1), 73–82.

Liu, F., Choi, D., Xie, L. & Roeder, K. (2018) Global spectral clustering in dynamic networks. Proceedings of the National Academy of Sciences, 115(5), 927–932.

Mariadassou, M., Robin, S. & Vacher, C. (2010) Uncovering latent structure in valued graphs: a variational approach. Annals of Applied Statistics, 4(2), 715–742.

Newman, M.E. & Girvan, M. (2004) Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.

Newman, M.E.J. (2006) Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582.

Panagiotelis, A., Czado, C. & Joe, H. (2012) Pair copula constructions for multivariate discrete data. Journal of the American Statistical Association, 107(499), 1063–1072.

Paul, S. & Chen, Y. (2016) Consistent community detection in multi-relational data through restricted multi-layer stochastic blockmodel. Electronic Journal of Statistics, 10(2), 3807–3870.

Paul, S. & Chen, Y. (2020) Spectral and matrix factorization methods for consistent community detection in multi-layer networks. The Annals of Statistics, 48(1), 230–250.

Perkins, A.D. & Langston, M.A. (2009) Threshold selection in gene co-expression networks using spectral graph theory techniques. BMC Bioinformatics, 10, 1–11.

Ravikumar, P., Wainwright, M.J. & Lafferty, J.D. (2010) High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), 1287–1319.

Rohe, K., Chatterjee, S. & Yu, B. (2011) Spectral clustering and the high-dimensional stochastic blockmodel. The Annals of Statistics, 39(4), 1878–1915.

Sklar, M. (1959) Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8, 229–231.

Stuart, J.M., Segal, E., Koller, D. & Kim, S.K. (2003) A gene-coexpression network for global discovery of conserved genetic modules. Science, 302(5643), 249–255.

Tang, W., Lu, Z. & Dhillon, I.S. (2009) Clustering with multiple graphs. In: 2009 Ninth IEEE International Conference on Data Mining, IEEE, pp. 1016–1021.

Wang, J. (2010) Consistent selection of the number of clusters via cross-validation. Biometrika, 97(4), 893–904.

Xu, S., Zhen, Y. & Wang, J. (2022) Covariate-assisted community detection in multi-layer networks. Journal of Business & Economic Statistics, 1–31.

Yang, J. & Leskovec, J. (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, pp. 1–8.

Yuan, Y. & Qu, A. (2021) Community detection with dependent connectivity. The Annals of Statistics, 49(4), 2378–2428.

Zhang, J., He, X. & Wang, J. (2022) Directed community detection with network embedding. Journal of the American Statistical Association, 117(540), 1809–1819.

Zhang, J., Sun, W.W. & Li, L. (2020) Mixed-effect time-varying network model and application in brain connectivity analysis. Journal of the American Statistical Association, 115(532), 2022–2036.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
