Predicting miRNA-disease associations based on lncRNA–miRNA interactions and graph convolution networks

Bipartite network construction

The lncRNA–miRNA bipartite network G_l-m is defined by the adjacency matrix A_l-m and its transpose |${A}_{l-m}^T$| as:

$$\begin{equation} {G}_{l-m}=\left[\begin{array}{cc}0& {A}_{l-m}\\{}{A}_{l-m}^T& 0\end{array}\right]\in{\mathbb{R}}^{\left( Nl+ Nm\right)\times \left( Nl+ Nm\right)} \end{equation}$$

(1)

Similarly, the miRNA-disease bipartite network G_m-d is denoted by the adjacency matrix A_m-d and its transpose |${A}_{m-d}^T$| as:

$$\begin{equation} {G}_{m-d}=\left[\begin{array}{cc}0& {A}_{m-d}\\{}{A}_{m-d}^T& 0\end{array}\right]\in{\mathbb{R}}^{\left( Nm+ Nd\right)\times \left( Nm+ Nd\right)} \end{equation}$$

(2)

GCN encoder

We use GCN as encoders to obtain multilayer structural information of lncRNAs, miRNAs and diseases based on G_l-m and G_m-d defined above. We randomly initialize the features of lncRNAs as |$X\in{\mathbb{R}}^{Nl\times f0}$| (⁠|${X}_i=\big\{{x}_{i1},{x}_{i2},\cdots, x{}_{if0}\big\}$|⁠), the features of miRNAs as |$Y\in{\mathbb{R}}^{Nm\times f0}$| and the features of diseases as |$Z\in{\mathbb{R}}^{Nd\times f0}$|⁠, respectively. We then receive the node embedding on G_m-d according to the following equation:

$$\begin{equation} {H}_{m-d}^{(k)}=f\left({H}_{m-d}^{\left(k-1\right)},{G}_{m-d}\right)=\sigma \left({D}^{-\frac{1}{2}}{G}_{m-d}{D}^{-\frac{1}{2}}{H}_{m-d}^{\left(k-1\right)}{W}_{m-d}^{\left(k-1\right)}\right) \end{equation}$$

(3)

where |${H}_{m-d}^{(k)}$| is the k-layer embedding of GCN, |$k$| = 1, …, K, |$D=\operatorname{diag}\Big(\sum \limits_{j=1}^{Nm+ Nd}{G}_{m-d}\big(i,j\big)\Big)$| is the diagonal node degree matrix of G_m-d, |${W}_{m-d}^{(k-1)}\in{\mathbb{R}}^{\big( Nm+ Nd\big)\times p}$| is a weight matrix for the kth neural network, p is the dimension of the GCN embedding and |$\sigma$| (·) is the non-linear activation function ReLU.

Similarly, we also use GCN as encoders to obtain the embedding |${H}_{l-m}^{(k)}$| of the nodes on G_l-m according to the following equation:

$$\begin{equation} {H}_{l-m}^{(k)}=f\left({H}_{l-m}^{\left(k-1\right)},{\mathrm{G}}_{l-m}\right)=\sigma \left({D}^{-\frac{1}{2}}{G}_{l-m}{D}^{-\frac{1}{2}}{H}_{l-m}^{\left(k-1\right)}{W}_{l-m}^{\left(k-1\right)}\right) \end{equation}$$

(4)

In our study, we construct the initial embeddings |${H}_{l-m}^{(0)}$| and |${H}_{m-d}^{(0)}$| for the initial layers of the two bipartite networks G_l-m and G_m-d as follows:

$$\begin{equation} {H}_{l-m}^{(0)}=\left[\begin{array}{l}X\\{}Y\end{array}\right]\in{\mathbb{R}}^{\left( Nl+ Nm\right)\times f0} \end{equation}$$

(5)

$$\begin{equation} {H}_{m-d}^{(0)}=\left[\begin{array}{l}Y\\{}Z\end{array}\right]\in{\mathbb{R}}^{\left( Nm+ Nd\right)\times f0} \end{equation}$$

(6)

We obtain the embeddings of lncRNAs, miRNAs and diseases from the two bipartite graphs by using the embeddings of each layer as different feature vectors, respectively. The embedding of each layer on G_l-m is denoted as

$${H}_{l-m}^k\in \big[\begin{array}{l}{H}_{{(l-m)}_l}^k\\{}{H}_{{(l-m)}_m}^k\end{array}\big]$$

⁠, where |${H}_{{(l-m)}_l}^k\in{\mathbb{R}}^{Nl\times p}$| represents the embedding of lncRNAs at the kth layer, and |${H}_{{(l-m)}_m}^k\in{\mathbb{R}}^{Nm\times p}$| denotes the embedding of miRNAs at layer k. The embedding of each layer on G_m-d is denoted as

$${H}_{m-d}^k\in \Big[\begin{array}{l}{H}_{{(m-d)}_m}^k\\{}{H}_{{(m-d)}_d}^k\end{array}\Big]$$

⁠, where |${H}_{{(m-d)}_m}^k\in{\mathbb{R}}^{Nm\times p}$| represents the embedding of miRNAs at layer k, and |${H}_{{(m-d)}_d}^k\in{\mathbb{R}}^{Nd\times p}$| denotes the embedding of diseases at layer k. Finally, we integrate the node features of the two bipartite graphs to obtain the combined feature spaces of lncRNAs, miRNAs and diseases as follows:

$$\begin{equation} {S}_l=\left\{{H}_{{\left(l-m\right)}_l}^1,{H}_{{\left(l-m\right)}_l}^2,\dots, {H}_{{\left(l-m\right)}_l}^k\right\} \end{equation}$$

(7)

$$\begin{equation} {S}_m=\left\{{H}_{{\left(l-m\right)}_m}^1,{H}_{{\left(l-m\right)}_m}^2,\dots, {H}_{{\left(l-m\right)}_m}^k,{H}_{{\left(m-d\right)}_m}^1,{H}_{{\left(m-d\right)}_m}^2,\dots, {H}_{{\left(m-d\right)}_m}^k\right\} \end{equation}$$

(8)

$$\begin{equation} {S}_d=\left\{{H}_{{\left(m-d\right)}_d}^1,{H}_{{\left(m-d\right)}_d}^2,\dots, {H}_{{\left(m-d\right)}_d}^k\right\} \end{equation}$$

(9)

where |${S}_l$|⁠, |${S}_m$| and |${S}_d$| represent the feature spaces of lncRNAs, miRNAs and diseases, respectively.

Multichannel attention mechanism

The feature spaces of lncRNAs, miRNAs and diseases contain structural information from different layers on the two different bipartite graphs, and different structural information have different contributions to embedding learning. We therefore use attention mechanism to extract the different features to improve the prediction performance of our model. Inspired by the SENet model [38] proposed by Hu et al. in computer vision, we here use the channel attention mechanism to calculate the contribution of different structural information in each space to the final embedding.

To obtain the importance of the different feature matrices, we first use a global average pooling to obtain the representation of each channel. For the miRNA feature space |${S}_m\in{\mathbb{R}}^{Nm\times p\times 2k}$| with 2 k channels, the representation vector |${E}_m\in{\mathbb{R}}^{1\times 1\times 2k}$| is generated by the squeeze operation and |${E}_m=\big\{{e}_m^1,{e}_m^2,...,{e}_m^{2k}\big\}$|⁠. More specifically, given the c-th feature matrix |${S}_m^c$| in the miRNA feature space |${S}_m$|⁠, the corresponding channel representation |${e}_m^c$| can be calculated as follows:

$$\begin{equation} {e}_m^c={\varPsi}_{\mathrm{sq}}\left({S}_m^c\right)=\frac{1}{N_m\times p}\sum \limits_{i=1}^{N_m}\sum \limits_{j=1}^p{S}_m^c\left(i,j\right) \end{equation}$$

(10)

Correspondingly, the channel attention factor |${a}_m$| for the miRNA feature space can be computed as follows:

$$\begin{equation} {a}_m={\varPsi}_{\mathrm{atten}}\left({E}_m,{W}_m\right)=\delta \left({W}_2\sigma \left({W}_1{E}_m\right)\right) \end{equation}$$

(11)

where |${a}_m\in{\mathbb{R}}^{1\times 1\times 2k}$| is the attention factor for miRNAs, |${W}_m=\big\{W1,W2\big\}$| is trainable parameters, |$\delta (\cdot )$| is Sigmoid activation function, and |$\sigma (\cdot )$| is ReLU activation function.

After obtaining the attention coefficients for each channel, the attention coefficients are multiplied with the original features to give different weights for each channel, with the following equation:

$$\begin{equation} {y}_c^{\prime }={\varPsi}_{\mathrm{scale}}\left({S}_m^c,{a}_m^c\right)={a}_m^c\cdotp{S}_m^c \end{equation}$$

(12)

Based on the channel attention mechanism, we obtain lncRNA, miRNA and disease channel information as follows:

$$\begin{equation} {X}_l^{\prime }=\left[{x}_1^{\prime },{x}_2^{\prime },\dots, {x}_k^{\prime}\right] \end{equation}$$

(13)

$$\begin{equation} {Y}_m^{\prime }=\left[{y}_1^{\prime },{y}_2^{\prime },\dots, {y}_{2k}^{\prime}\right] \end{equation}$$

(14)

$$\begin{equation} {Z}_d^{\prime }=\left[{z}_1^{\prime },{z}_2^{\prime },\dots, {z}_k^{\prime}\right] \end{equation}$$

(15)

CNN combiner

Inspired by the successful applications of CNNs in computer vision, we use multiple convolutional kernels of CNNs to extract and integrate the final node features from the channel information computed above. Given miRNA channel information |${Y}_m^{\prime }$|⁠, the final miRNA features |$Ym$| can be defined as follows:

$$\begin{equation} {Y}_m=g\left( conv,{Y}_m^{\prime}\right)= conv\left({y}_1^{\prime },{y}_2^{\prime },\dots, {y}_{2k}^{\prime}\right) \end{equation}$$

(16)

Similarly, we can obtain the final lncRNA embedding |${X}_l$| and disease embedding |${Z}_d$| as

$$\begin{equation} {X}_l=g\left( conv,{X}_l^{\prime}\right)= conv\left({x}_1^{\prime },{x}_2^{\prime },\dots, {x}_k^{\prime}\right) \end{equation}$$

(17)

$$\begin{equation} {Z}_d=g\left( conv,{Z}_d^{\prime}\right)= conv\left({z}_1^{\prime },{z}_2^{\prime },\dots, {z}_k^{\prime}\right) \end{equation}$$

(18)

where conv means convolution operation, the number of convolution kernels is p and kernel size is p × 1.

Bilinear decoder

We use the decoder |${A}_{m-d}^{\prime }=f\big({Y}_m,{Z}_d\big)$| for reconstructing the miRNA-disease adjacency matrix from the final miRNA features and disease features, which is defined as follows:

$$\begin{equation} {A}_{m-d}^{\prime }=f\left({Y}_m,{Z}_d\right)=\mathrm{sigmoid}\left({Y}_m{W_1}^{\prime }{Z}_d^T\right) \end{equation}$$

(19)

where |${W_1}$| is a trainable parameter.

Similarly, the decoder |${A}_{l-m}^{\prime }=f\big({X}_l,{Y}_m\big)$| is for reconstructing the lncRNA–miRNA adjacency matrix from the final lncRNA features and miRNA features, which is calculated as follows:

$$\begin{equation} {A}_{l-m}^{\prime }=f\left({X}_l,{Y}_m\right)=\mathrm{sigmoid}\left({X}_l{W_2}^{\prime }{Y}_m^T\right) \end{equation}$$

(20)

Optimization

In training, we minimize the following logistic loss function to make the miRNA-disease prediction results accurately close to the true values as follows:

$$\begin{equation} {\ell}_1=\frac{1}{N_{m}\times N_{d}}\sum \limits_{i,j}\left(\begin{array}{l}{A}_{m-d}^{train}\left(i,j\right)\log{A}_{m-d}^{\prime}\left(i,j\right)+\\{}\left(1-{A}_{m-d}^{train}\right)\log \left(1-{A}_{m-d}^{\prime}\left(i,j\right)\right)\end{array}\right) \end{equation}$$

(21)

where |${A}_{m-d}^{train}$| is the adjacency matrix of MDAs in the training set, and |${A}_{m-d}^{\prime }$| is the predicted values of MDAs.

Meanwhile, as we use the known LMIs to better train the model, the second loss function is defined as follows:

$$\begin{equation} {\ell}_2={\left\Vert{A}_{l-m}-{A}_{l-m}^{\prime}\right\Vert}_F^2 \end{equation}$$

(22)

where |$ \Vert{\cdot} \Vert_{F} $| is the Frobenius norm, |${A}_{l-m}$| is the original LMIs, and |${A}_{l-m}^{\prime }$| is the predicted lncRNA–miRNA values.

The final loss function can be defined as follows:

$$\begin{equation} {\ell}_{total}={\ell}_1+\alpha \ast{\ell}_2+\lambda{\left\Vert \varTheta \right\Vert}^2 \end{equation}$$

(23)

where |$\alpha$| and |$\lambda$| are hyperparameters, |$\alpha$| is used to balance the weight of the |${\ell}_2$| loss function, |$\lambda$| is for controlling the strength of L2 regularization and |$\varTheta$| represents the trainable parameters in the model.

In our model, we use the Adam optimizer [39] to minimize the loss function. The Adam optimizer iteratively updates the weights of the neural networks based on the training data. To prevent overfitting, we add L2 regularization to the loss function. In addition, the learning rate is adjusted according to the number of epochs trained during the optimization procedure, and we set the learning rate to decrease gradually as the epochs increase to achieve better training results.

Results

We first analyze the effects of the hyperparameters in our model MAGCN using 5-fold cross-validation (5-CV) based on the benchmark datasets in our study. Then, we perform ablation experiments on different components of our model to test the prediction performance. We further use three cross-validation strategies (2-CV, 5-CV and 10-CV) to comprehensively evaluate the performance of MAGCN, and compare it with other existing approaches under 5-CV experimental conditions. Finally, we use MAGCN to conduct case studies on three diseases to test its applications in reality.

Experimental setting and evaluation metrics

We used k-fold cross-validation (k = 2, 5, 10) to evaluate the performance of our method MAGCN by randomly dividing all MDAs into k approximately equal parts, with k–1 being used in turn for training and the remaining one for testing. To analyze the performance of our method, we use evaluation metrics including the area under the receiver operating characteristic (ROC) curve (AUROC) and the area under the precision/recall (PR) curve (AUPRC). We also calculate recall (also known as sensitivity), specificity, accuracy, precision and F1-measure (F1-score) for comprehensive comparison.

Parameter sensitivity analysis

There are four important hyperparameters (GCN layer k, initial feature embedding size f0, embedding size p and learning rate lr) in our method MAGCN. In this section, we empirically set values to the hyperparameters, and analyze their impacts on inference performance by conducting 5-fold cross-validation experiments on known MDAs. By changing the value of only one parameter, with the others remaining fixed, the following results are received.

GCN layer

Our model uses multiple GCN layers to extract structural information from different layers of lncRNAs, miRNAs and diseases to obtain their features. We set the number of GCN layers to be 2, 3 and 4 for analysis. The received AUC values are shown in Figure 2. We discover that the numbers of GCN layers have little effect on prediction performance. In the following experiments, the number of GCN layers k is therefore set to 2.

Figure 2

Sensitivity analysis on GCN layer k.

Initial feature embedding size

The node features of the model MAGCN are initialized randomly, and the size of the node features f0 is a hyperparameter. We choose the embedding size for features in the range of {64, 128, 256, 512, 1024} in our experiments, and the results are shown in Figure 3. From Figure 3, we can find that the best result is received when the initial feature embedding size is 512. In this study, we set the embedding size f0 to be 512.

Figure 3

Sensitivity analysis on initial feature embedding size f0.

Embedding size

In our model MAGCN, we use multiple layers of GCN to obtain the potential embeddings of lncRNAs, miRNAs and diseases, and finally use channel attention and CNN modules to calculate the final embeddings. We analyze the effect of the size of the potential embedding on the model. Specifically, we set the dimensions of the potential embedding to 64, 128, 256, 512 and 1024 for experimental comparison. From Figure 4, we can see that when the potential embedding size is 128, the model achieves the optimal AUC value. Therefore, we set the potential embedding size p to 128 in this study.

Figure 4

Sensitivity analysis on embedding size p.

Learning rate

The learning rate is a hyperparameter that is used in the loss function to update the network weights. We vary the value of the learning rate in {0.1, 0.01, 0.001, 0.0001} in our experiments, and from Figure 5 we can see that the AUC value is optimal when the learning rate is 0.001. Therefore, we set the learning rate lr to 0.001.

Figure 5

Sensitivity analysis on learning rate lr.

Finally, the hyperparameters in our model MAGCN are set as follows: the number of training epoch is set to 200, the learning rate is set to 0.001, the loss function scale α is set to 0.0001, the L2 regularization weight loss λ is 0.00005, the number of GCN layers is 2, the initialized embedding size of node features is 512 and the potential embedding size for lncRNAs, miRNAs and diseases is 128.

Effects of different model components on prediction performance

In MAGCN, we use multichannel attention mechanism and CNN combiner for feature extraction. We conduct ablation experiments on the channel attention mechanism and CNN combiner. We use MAGCN_noatte to represent only using the CNN combiner to combine different channel information to get the final feature information. MAGCN_nocnn means simply adding up the different feature information obtained through the channel attention mechanism to obtain the final result, and not using the CNN combiner to learn the complex non-linear relationships. MAGCN_noatten_nocnn indicates that the final features are obtained by adding up the embeddings of the different layers, while neither the channel attention mechanism nor the CNN combiner is used. Table 1 shows the evaluation metrics on the datasets using MAGCN and its variant models under 5-fold cross-validation experiments. We also plot the corresponding AUC curves and PR curves in Figures 6 and 7, from which we can find that using the multichannel attention mechanism and the CNN combiner can extract more important features in the feature space and learn complex non-linear relationships, making the model’s prediction performance improved.

Table 1

Performance of MAGCN and its variants based on 5-fold cross-validations

Method	AUROC	AUPRC	F1-score	ACC	RECALL	SPEC	PRE
MAGCN_noatte	0.9012	0.5188	0.5054	0.9470	0.5150	0.9710	0.4975
MAGCN_nocnn	0.9009	0.5247	0.5064	0.9477	0.5104	0.9719	0.5036
MAGCN_noatte_nocnn	0.8988	0.5124	0.5015	0.9480	0.4976	0.9729	0.5065
MAGCN	0.9032	0.5252	0.5066	0.9471	0.5162	0.9710	0.4981

Method	AUROC	AUPRC	F1-score	ACC	RECALL	SPEC	PRE
MAGCN_noatte	0.9012	0.5188	0.5054	0.9470	0.5150	0.9710	0.4975
MAGCN_nocnn	0.9009	0.5247	0.5064	0.9477	0.5104	0.9719	0.5036
MAGCN_noatte_nocnn	0.8988	0.5124	0.5015	0.9480	0.4976	0.9729	0.5065
MAGCN	0.9032	0.5252	0.5066	0.9471	0.5162	0.9710	0.4981

The bold value indicates the highest one in each column.

Table 1

Performance of MAGCN and its variants based on 5-fold cross-validations

Method	AUROC	AUPRC	F1-score	ACC	RECALL	SPEC	PRE
MAGCN_noatte	0.9012	0.5188	0.5054	0.9470	0.5150	0.9710	0.4975
MAGCN_nocnn	0.9009	0.5247	0.5064	0.9477	0.5104	0.9719	0.5036
MAGCN_noatte_nocnn	0.8988	0.5124	0.5015	0.9480	0.4976	0.9729	0.5065
MAGCN	0.9032	0.5252	0.5066	0.9471	0.5162	0.9710	0.4981

Method	AUROC	AUPRC	F1-score	ACC	RECALL	SPEC	PRE
MAGCN_noatte	0.9012	0.5188	0.5054	0.9470	0.5150	0.9710	0.4975
MAGCN_nocnn	0.9009	0.5247	0.5064	0.9477	0.5104	0.9719	0.5036
MAGCN_noatte_nocnn	0.8988	0.5124	0.5015	0.9480	0.4976	0.9729	0.5065
MAGCN	0.9032	0.5252	0.5066	0.9471	0.5162	0.9710	0.4981

The bold value indicates the highest one in each column.

Figure 6

ROC curves of ablation tests in MAGCN.

Figure 7

PR curves of ablation tests in MAGCN.

Performance evaluation

In this section, we further evaluate the prediction performance of our model MAGCN based on cross-validations. Since our model can predict both the LMIs and MDAs. We first use LMIs as auxiliary information to predict the potential associations between miRNAs and diseases, with average AUROC values of 0.8984, 0.9032 and 0.9044 under 2-fold, 5-fold and 10-fold cross-validations, respectively. We also use MAGCN to predict potential LMIs based on known MDAs, and the average AUROC values are 0.8973, 0.9605 and 0.9699 under 2-fold, 5-fold and 10-fold cross-validations, respectively. The experimental results demonstrate the reliability of our method in MDAs and LMIs predictions.

Comparison with other methods

In this section, we compare MAGCN with the latest methods that were proposed for MDA predictions. We here select five methods (i.e. MVMTMDA [34], MDA-SKF [17], Zeng et al.’s work [15], MDHGI [16] and IMCMDA [14]) for performance comparison. The methods are tested based on 5-fold cross-validations, and the comparative results are shown in Table 2. MAGCN obtains the highest AUROC value of 0.9032 in 5-fold cross-validations and its AUROC value is higher than that of other methods by 5.20% (MVMTMDA), 8.4% (MDA-SKF), 11.49% (Zeng et al.’s work), 21% (MDHGI) and 27.99% (IMCMDA), respectively. The experiments indicate the excellent performance of our method.

Table 2

Performance comparison with other methods based on 5-fold cross-validations

Prediction task	Method	Average AUROC
Prediction of miRNA-disease associations	IMCMDA [14]	0.6233
	MDHGI [16]	0.6932
	Zeng et al.’s work [15]	0.7883
	MDA-SKF [17]	0.8192
	MVMTMDA [34]	0.8512
	MAGCN	0.9032

Prediction task	Method	Average AUROC
Prediction of miRNA-disease associations	IMCMDA [14]	0.6233
	MDHGI [16]	0.6932
	Zeng et al.’s work [15]	0.7883
	MDA-SKF [17]	0.8192
	MVMTMDA [34]	0.8512
	MAGCN	0.9032

Table 2

Performance comparison with other methods based on 5-fold cross-validations

Prediction task	Method	Average AUROC
Prediction of miRNA-disease associations	IMCMDA [14]	0.6233
	MDHGI [16]	0.6932
	Zeng et al.’s work [15]	0.7883
	MDA-SKF [17]	0.8192
	MVMTMDA [34]	0.8512
	MAGCN	0.9032

Prediction task	Method	Average AUROC
Prediction of miRNA-disease associations	IMCMDA [14]	0.6233
	MDHGI [16]	0.6932
	Zeng et al.’s work [15]	0.7883
	MDA-SKF [17]	0.8192
	MVMTMDA [34]	0.8512
	MAGCN	0.9032

Meanwhile, both MAGCN and MVMTMDA can be applied to LMI predictions. We therefore use both methods to test their performance in LMI predictions under 2-fold, 5-fold and 10-fold cross-validations. The received AUROC values are available at Table 3. It can be observed from Table 3 that our method MAGCN performs better than MVMTMDA, which further demonstrates the superiority of our method MAGCN.

Table 3

Comparison of average AUROC values of LMI predictions based on 2-fold, 5-fold and 10-fold cross-validations

Method	2-fold	5-fold	10-fold
MVMTMDA	0.8747	0.9014	0.9037
MAGCN	0.8973	0.9605	0.9699

Table 3

Comparison of average AUROC values of LMI predictions based on 2-fold, 5-fold and 10-fold cross-validations

Method	2-fold	5-fold	10-fold
MVMTMDA	0.8747	0.9014	0.9037
MAGCN	0.8973	0.9605	0.9699

Table 4

The top 50 predicted miRNAs associated with colon neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-21-5p	dbDEMC, HMDD	26	hsa-miR-31-5p	dbDEMC, HMDD
2	hsa-miR-146a-5p	dbDEMC, HMDD	27	hsa-miR-1	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-214-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC
5	hsa-miR-34a-5p	dbDEMC, HMDD	30	hsa-miR-96-5p	dbDEMC, HMDD
6	hsa-miR-126-3p	dbDEMC, HMDD	31	hsa-miR-17-5p	dbDEMC, HMDD
7	hsa-miR-145-5p	dbDEMC, HMDD	32	hsa-miR-125b-5p	dbDEMC, HMDD
8	hsa-miR-122-5p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC, HMDD
9	hsa-miR-221-3p	dbDEMC, HMDD	34	hsa-miR-19a-3p	dbDEMC, HMDD
10	hsa-miR-132-3p	dbDEMC, HMDD	35	hsa-miR-27a-3p	dbDEMC, HMDD
11	hsa-miR-150-5p	dbDEMC, HMDD	36	hsa-miR-124-3p	dbDEMC
12	hsa-miR-143-3p	dbDEMC, HMDD	37	hsa-miR-200b-3p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC	38	hsa-miR-34c-5p	dbDEMC
14	hsa-miR-206	dbDEMC	39	hsa-miR-200c-3p	dbDEMC, HMDD
15	hsa-miR-142-3p	dbDEMC, HMDD	40	hsa-miR-30a-5p	dbDEMC, HMDD
16	hsa-miR-29a-3p	dbDEMC, HMDD	41	hsa-miR-15b-5p	dbDEMC, HMDD
17	hsa-miR-210-3p	dbDEMC, HMDD	42	hsa-miR-486-5p	dbDEMC, HMDD
18	hsa-miR-16-5p	dbDEMC	43	hsa-miR-106b-5p	dbDEMC, HMDD
19	hsa-miR-15a-5p	dbDEMC, HMDD	44	hsa-miR-205-5p	dbDEMC, HMDD
20	hsa-miR-182-5p	dbDEMC	45	hsa-miR-93-5p	dbDEMC, HMDD
21	hsa-miR-222-3p	dbDEMC, HMDD	46	hsa-miR-29b-3p	dbDEMC, HMDD
22	hsa-miR-24-3p	dbDEMC, HMDD	47	hsa-miR-192-5p	dbDEMC, HMDD
23	hsa-miR-133a-3p	dbDEMC, HMDD	48	hsa-miR-141-3p	dbDEMC, HMDD
24	hsa-miR-146b-5p	dbDEMC	49	hsa-miR-195-5p	dbDEMC, HMDD
25	hsa-miR-20a-5p	dbDEMC, HMDD	50	hsa-miR-181a-5p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-21-5p	dbDEMC, HMDD	26	hsa-miR-31-5p	dbDEMC, HMDD
2	hsa-miR-146a-5p	dbDEMC, HMDD	27	hsa-miR-1	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-214-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC
5	hsa-miR-34a-5p	dbDEMC, HMDD	30	hsa-miR-96-5p	dbDEMC, HMDD
6	hsa-miR-126-3p	dbDEMC, HMDD	31	hsa-miR-17-5p	dbDEMC, HMDD
7	hsa-miR-145-5p	dbDEMC, HMDD	32	hsa-miR-125b-5p	dbDEMC, HMDD
8	hsa-miR-122-5p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC, HMDD
9	hsa-miR-221-3p	dbDEMC, HMDD	34	hsa-miR-19a-3p	dbDEMC, HMDD
10	hsa-miR-132-3p	dbDEMC, HMDD	35	hsa-miR-27a-3p	dbDEMC, HMDD
11	hsa-miR-150-5p	dbDEMC, HMDD	36	hsa-miR-124-3p	dbDEMC
12	hsa-miR-143-3p	dbDEMC, HMDD	37	hsa-miR-200b-3p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC	38	hsa-miR-34c-5p	dbDEMC
14	hsa-miR-206	dbDEMC	39	hsa-miR-200c-3p	dbDEMC, HMDD
15	hsa-miR-142-3p	dbDEMC, HMDD	40	hsa-miR-30a-5p	dbDEMC, HMDD
16	hsa-miR-29a-3p	dbDEMC, HMDD	41	hsa-miR-15b-5p	dbDEMC, HMDD
17	hsa-miR-210-3p	dbDEMC, HMDD	42	hsa-miR-486-5p	dbDEMC, HMDD
18	hsa-miR-16-5p	dbDEMC	43	hsa-miR-106b-5p	dbDEMC, HMDD
19	hsa-miR-15a-5p	dbDEMC, HMDD	44	hsa-miR-205-5p	dbDEMC, HMDD
20	hsa-miR-182-5p	dbDEMC	45	hsa-miR-93-5p	dbDEMC, HMDD
21	hsa-miR-222-3p	dbDEMC, HMDD	46	hsa-miR-29b-3p	dbDEMC, HMDD
22	hsa-miR-24-3p	dbDEMC, HMDD	47	hsa-miR-192-5p	dbDEMC, HMDD
23	hsa-miR-133a-3p	dbDEMC, HMDD	48	hsa-miR-141-3p	dbDEMC, HMDD
24	hsa-miR-146b-5p	dbDEMC	49	hsa-miR-195-5p	dbDEMC, HMDD
25	hsa-miR-20a-5p	dbDEMC, HMDD	50	hsa-miR-181a-5p	dbDEMC, HMDD

Table 4

The top 50 predicted miRNAs associated with colon neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-21-5p	dbDEMC, HMDD	26	hsa-miR-31-5p	dbDEMC, HMDD
2	hsa-miR-146a-5p	dbDEMC, HMDD	27	hsa-miR-1	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-214-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC
5	hsa-miR-34a-5p	dbDEMC, HMDD	30	hsa-miR-96-5p	dbDEMC, HMDD
6	hsa-miR-126-3p	dbDEMC, HMDD	31	hsa-miR-17-5p	dbDEMC, HMDD
7	hsa-miR-145-5p	dbDEMC, HMDD	32	hsa-miR-125b-5p	dbDEMC, HMDD
8	hsa-miR-122-5p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC, HMDD
9	hsa-miR-221-3p	dbDEMC, HMDD	34	hsa-miR-19a-3p	dbDEMC, HMDD
10	hsa-miR-132-3p	dbDEMC, HMDD	35	hsa-miR-27a-3p	dbDEMC, HMDD
11	hsa-miR-150-5p	dbDEMC, HMDD	36	hsa-miR-124-3p	dbDEMC
12	hsa-miR-143-3p	dbDEMC, HMDD	37	hsa-miR-200b-3p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC	38	hsa-miR-34c-5p	dbDEMC
14	hsa-miR-206	dbDEMC	39	hsa-miR-200c-3p	dbDEMC, HMDD
15	hsa-miR-142-3p	dbDEMC, HMDD	40	hsa-miR-30a-5p	dbDEMC, HMDD
16	hsa-miR-29a-3p	dbDEMC, HMDD	41	hsa-miR-15b-5p	dbDEMC, HMDD
17	hsa-miR-210-3p	dbDEMC, HMDD	42	hsa-miR-486-5p	dbDEMC, HMDD
18	hsa-miR-16-5p	dbDEMC	43	hsa-miR-106b-5p	dbDEMC, HMDD
19	hsa-miR-15a-5p	dbDEMC, HMDD	44	hsa-miR-205-5p	dbDEMC, HMDD
20	hsa-miR-182-5p	dbDEMC	45	hsa-miR-93-5p	dbDEMC, HMDD
21	hsa-miR-222-3p	dbDEMC, HMDD	46	hsa-miR-29b-3p	dbDEMC, HMDD
22	hsa-miR-24-3p	dbDEMC, HMDD	47	hsa-miR-192-5p	dbDEMC, HMDD
23	hsa-miR-133a-3p	dbDEMC, HMDD	48	hsa-miR-141-3p	dbDEMC, HMDD
24	hsa-miR-146b-5p	dbDEMC	49	hsa-miR-195-5p	dbDEMC, HMDD
25	hsa-miR-20a-5p	dbDEMC, HMDD	50	hsa-miR-181a-5p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-21-5p	dbDEMC, HMDD	26	hsa-miR-31-5p	dbDEMC, HMDD
2	hsa-miR-146a-5p	dbDEMC, HMDD	27	hsa-miR-1	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-214-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC
5	hsa-miR-34a-5p	dbDEMC, HMDD	30	hsa-miR-96-5p	dbDEMC, HMDD
6	hsa-miR-126-3p	dbDEMC, HMDD	31	hsa-miR-17-5p	dbDEMC, HMDD
7	hsa-miR-145-5p	dbDEMC, HMDD	32	hsa-miR-125b-5p	dbDEMC, HMDD
8	hsa-miR-122-5p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC, HMDD
9	hsa-miR-221-3p	dbDEMC, HMDD	34	hsa-miR-19a-3p	dbDEMC, HMDD
10	hsa-miR-132-3p	dbDEMC, HMDD	35	hsa-miR-27a-3p	dbDEMC, HMDD
11	hsa-miR-150-5p	dbDEMC, HMDD	36	hsa-miR-124-3p	dbDEMC
12	hsa-miR-143-3p	dbDEMC, HMDD	37	hsa-miR-200b-3p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC	38	hsa-miR-34c-5p	dbDEMC
14	hsa-miR-206	dbDEMC	39	hsa-miR-200c-3p	dbDEMC, HMDD
15	hsa-miR-142-3p	dbDEMC, HMDD	40	hsa-miR-30a-5p	dbDEMC, HMDD
16	hsa-miR-29a-3p	dbDEMC, HMDD	41	hsa-miR-15b-5p	dbDEMC, HMDD
17	hsa-miR-210-3p	dbDEMC, HMDD	42	hsa-miR-486-5p	dbDEMC, HMDD
18	hsa-miR-16-5p	dbDEMC	43	hsa-miR-106b-5p	dbDEMC, HMDD
19	hsa-miR-15a-5p	dbDEMC, HMDD	44	hsa-miR-205-5p	dbDEMC, HMDD
20	hsa-miR-182-5p	dbDEMC	45	hsa-miR-93-5p	dbDEMC, HMDD
21	hsa-miR-222-3p	dbDEMC, HMDD	46	hsa-miR-29b-3p	dbDEMC, HMDD
22	hsa-miR-24-3p	dbDEMC, HMDD	47	hsa-miR-192-5p	dbDEMC, HMDD
23	hsa-miR-133a-3p	dbDEMC, HMDD	48	hsa-miR-141-3p	dbDEMC, HMDD
24	hsa-miR-146b-5p	dbDEMC	49	hsa-miR-195-5p	dbDEMC, HMDD
25	hsa-miR-20a-5p	dbDEMC, HMDD	50	hsa-miR-181a-5p	dbDEMC, HMDD

Case studies

In this section, we conduct case studies to further validate the predictive performance of the model MAGCN in real situations. As tumors are serious illnesses that cause many deaths each year, predicting their related miRNAs is of great interest. We therefore choose to predict disease-related miRNAs with three common cancers (colon tumor, breast cancer and kidney cancer). Specifically, we remove the association information with a specific disease from the known MDA dataset, and train MAGCN with the other information to obtain the prediction results. Since biologists are more interested in the top predictions, we finally choose the top 50 associated miRNAs from the prediction results and validate them on the latest databases, such as HMDD v3.0 [36] and dbDEMC [40]. The validation results are listed in Tables 4,5 and 6, respectively. We can find from the three tables that all the top 50 predictions in the three diseases have been supported by existing databases. The results from the three case studies suggest that MAGCN is an effective tool in detecting new MDAs.

Table 5

The top 50 predicted miRNAs associated with breast neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC, HMDD	26	hsa-miR-222-3p	dbDEMC, HMDD
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-214-3p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-320a	dbDEMC, HMDD
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC, HMDD
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-200c-3p	dbDEMC, HMDD
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-20a-5p	dbDEMC, HMDD
7	hsa-miR-132-3p	dbDEMC, HMDD	32	hsa-miR-92a-3p	dbDEMC, HMDD
8	hsa-miR-34a-5p	dbDEMC, HMDD	33	hsa-miR-34c-5p	dbDEMC, HMDD
9	hsa-miR-122-5p	dbDEMC, HMDD	34	hsa-miR-143-3p	dbDEMC, HMDD
10	hsa-miR-145-5p	dbDEMC, HMDD	35	hsa-miR-29a-3p	dbDEMC, HMDD
11	hsa-miR-206	dbDEMC, HMDD	36	hsa-miR-125a-5p	dbDEMC, HMDD
12	hsa-miR-221-3p	dbDEMC, HMDD	37	hsa-miR-182-5p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC, HMDD	38	hsa-miR-124-3p	dbDEMC, HMDD
14	hsa-miR-142-3p	dbDEMC, HMDD	39	hsa-miR-30a-5p	dbDEMC, HMDD
15	hsa-miR-96-5p	dbDEMC, HMDD	40	hsa-miR-19a-3p	dbDEMC, HMDD
16	hsa-miR-17-5p	dbDEMC, HMDD	41	hsa-miR-205-5p	dbDEMC, HMDD
17	hsa-miR-133a-3p	dbDEMC, HMDD	42	hsa-miR-140-5p	dbDEMC, HMDD
18	hsa-miR-150-5p	dbDEMC, HMDD	43	hsa-miR-486-5p	dbDEMC, HMDD
19	hsa-miR-146b-5p	dbDEMC, HMDD	44	hsa-miR-212-3p	dbDEMC, HMDD
20	hsa-miR-16-5p	dbDEMC, HMDD	45	hsa-miR-15b-5p	dbDEMC, HMDD
21	hsa-miR-15a-5p	dbDEMC, HMDD	46	hsa-miR-192-5p	dbDEMC, HMDD
22	hsa-miR-125b-5p	dbDEMC, HMDD	47	hsa-miR-144-3p	dbDEMC, HMDD
23	hsa-miR-24-3p	dbDEMC, HMDD	48	hsa-miR-106b-5p	dbDEMC, HMDD
24	hsa-miR-1	dbDEMC, HMDD	49	hsa-let-7b-5p	dbDEMC, HMDD
25	hsa-miR-31-5p	dbDEMC, HMDD	50	hsa-miR-200b-3p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC, HMDD	26	hsa-miR-222-3p	dbDEMC, HMDD
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-214-3p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-320a	dbDEMC, HMDD
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC, HMDD
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-200c-3p	dbDEMC, HMDD
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-20a-5p	dbDEMC, HMDD
7	hsa-miR-132-3p	dbDEMC, HMDD	32	hsa-miR-92a-3p	dbDEMC, HMDD
8	hsa-miR-34a-5p	dbDEMC, HMDD	33	hsa-miR-34c-5p	dbDEMC, HMDD
9	hsa-miR-122-5p	dbDEMC, HMDD	34	hsa-miR-143-3p	dbDEMC, HMDD
10	hsa-miR-145-5p	dbDEMC, HMDD	35	hsa-miR-29a-3p	dbDEMC, HMDD
11	hsa-miR-206	dbDEMC, HMDD	36	hsa-miR-125a-5p	dbDEMC, HMDD
12	hsa-miR-221-3p	dbDEMC, HMDD	37	hsa-miR-182-5p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC, HMDD	38	hsa-miR-124-3p	dbDEMC, HMDD
14	hsa-miR-142-3p	dbDEMC, HMDD	39	hsa-miR-30a-5p	dbDEMC, HMDD
15	hsa-miR-96-5p	dbDEMC, HMDD	40	hsa-miR-19a-3p	dbDEMC, HMDD
16	hsa-miR-17-5p	dbDEMC, HMDD	41	hsa-miR-205-5p	dbDEMC, HMDD
17	hsa-miR-133a-3p	dbDEMC, HMDD	42	hsa-miR-140-5p	dbDEMC, HMDD
18	hsa-miR-150-5p	dbDEMC, HMDD	43	hsa-miR-486-5p	dbDEMC, HMDD
19	hsa-miR-146b-5p	dbDEMC, HMDD	44	hsa-miR-212-3p	dbDEMC, HMDD
20	hsa-miR-16-5p	dbDEMC, HMDD	45	hsa-miR-15b-5p	dbDEMC, HMDD
21	hsa-miR-15a-5p	dbDEMC, HMDD	46	hsa-miR-192-5p	dbDEMC, HMDD
22	hsa-miR-125b-5p	dbDEMC, HMDD	47	hsa-miR-144-3p	dbDEMC, HMDD
23	hsa-miR-24-3p	dbDEMC, HMDD	48	hsa-miR-106b-5p	dbDEMC, HMDD
24	hsa-miR-1	dbDEMC, HMDD	49	hsa-let-7b-5p	dbDEMC, HMDD
25	hsa-miR-31-5p	dbDEMC, HMDD	50	hsa-miR-200b-3p	dbDEMC, HMDD

Table 5

The top 50 predicted miRNAs associated with breast neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC, HMDD	26	hsa-miR-222-3p	dbDEMC, HMDD
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-214-3p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-320a	dbDEMC, HMDD
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC, HMDD
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-200c-3p	dbDEMC, HMDD
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-20a-5p	dbDEMC, HMDD
7	hsa-miR-132-3p	dbDEMC, HMDD	32	hsa-miR-92a-3p	dbDEMC, HMDD
8	hsa-miR-34a-5p	dbDEMC, HMDD	33	hsa-miR-34c-5p	dbDEMC, HMDD
9	hsa-miR-122-5p	dbDEMC, HMDD	34	hsa-miR-143-3p	dbDEMC, HMDD
10	hsa-miR-145-5p	dbDEMC, HMDD	35	hsa-miR-29a-3p	dbDEMC, HMDD
11	hsa-miR-206	dbDEMC, HMDD	36	hsa-miR-125a-5p	dbDEMC, HMDD
12	hsa-miR-221-3p	dbDEMC, HMDD	37	hsa-miR-182-5p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC, HMDD	38	hsa-miR-124-3p	dbDEMC, HMDD
14	hsa-miR-142-3p	dbDEMC, HMDD	39	hsa-miR-30a-5p	dbDEMC, HMDD
15	hsa-miR-96-5p	dbDEMC, HMDD	40	hsa-miR-19a-3p	dbDEMC, HMDD
16	hsa-miR-17-5p	dbDEMC, HMDD	41	hsa-miR-205-5p	dbDEMC, HMDD
17	hsa-miR-133a-3p	dbDEMC, HMDD	42	hsa-miR-140-5p	dbDEMC, HMDD
18	hsa-miR-150-5p	dbDEMC, HMDD	43	hsa-miR-486-5p	dbDEMC, HMDD
19	hsa-miR-146b-5p	dbDEMC, HMDD	44	hsa-miR-212-3p	dbDEMC, HMDD
20	hsa-miR-16-5p	dbDEMC, HMDD	45	hsa-miR-15b-5p	dbDEMC, HMDD
21	hsa-miR-15a-5p	dbDEMC, HMDD	46	hsa-miR-192-5p	dbDEMC, HMDD
22	hsa-miR-125b-5p	dbDEMC, HMDD	47	hsa-miR-144-3p	dbDEMC, HMDD
23	hsa-miR-24-3p	dbDEMC, HMDD	48	hsa-miR-106b-5p	dbDEMC, HMDD
24	hsa-miR-1	dbDEMC, HMDD	49	hsa-let-7b-5p	dbDEMC, HMDD
25	hsa-miR-31-5p	dbDEMC, HMDD	50	hsa-miR-200b-3p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC, HMDD	26	hsa-miR-222-3p	dbDEMC, HMDD
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-214-3p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-320a	dbDEMC, HMDD
4	hsa-miR-223-3p	dbDEMC, HMDD	29	hsa-miR-9-5p	dbDEMC, HMDD
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-200c-3p	dbDEMC, HMDD
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-20a-5p	dbDEMC, HMDD
7	hsa-miR-132-3p	dbDEMC, HMDD	32	hsa-miR-92a-3p	dbDEMC, HMDD
8	hsa-miR-34a-5p	dbDEMC, HMDD	33	hsa-miR-34c-5p	dbDEMC, HMDD
9	hsa-miR-122-5p	dbDEMC, HMDD	34	hsa-miR-143-3p	dbDEMC, HMDD
10	hsa-miR-145-5p	dbDEMC, HMDD	35	hsa-miR-29a-3p	dbDEMC, HMDD
11	hsa-miR-206	dbDEMC, HMDD	36	hsa-miR-125a-5p	dbDEMC, HMDD
12	hsa-miR-221-3p	dbDEMC, HMDD	37	hsa-miR-182-5p	dbDEMC, HMDD
13	hsa-miR-183-5p	dbDEMC, HMDD	38	hsa-miR-124-3p	dbDEMC, HMDD
14	hsa-miR-142-3p	dbDEMC, HMDD	39	hsa-miR-30a-5p	dbDEMC, HMDD
15	hsa-miR-96-5p	dbDEMC, HMDD	40	hsa-miR-19a-3p	dbDEMC, HMDD
16	hsa-miR-17-5p	dbDEMC, HMDD	41	hsa-miR-205-5p	dbDEMC, HMDD
17	hsa-miR-133a-3p	dbDEMC, HMDD	42	hsa-miR-140-5p	dbDEMC, HMDD
18	hsa-miR-150-5p	dbDEMC, HMDD	43	hsa-miR-486-5p	dbDEMC, HMDD
19	hsa-miR-146b-5p	dbDEMC, HMDD	44	hsa-miR-212-3p	dbDEMC, HMDD
20	hsa-miR-16-5p	dbDEMC, HMDD	45	hsa-miR-15b-5p	dbDEMC, HMDD
21	hsa-miR-15a-5p	dbDEMC, HMDD	46	hsa-miR-192-5p	dbDEMC, HMDD
22	hsa-miR-125b-5p	dbDEMC, HMDD	47	hsa-miR-144-3p	dbDEMC, HMDD
23	hsa-miR-24-3p	dbDEMC, HMDD	48	hsa-miR-106b-5p	dbDEMC, HMDD
24	hsa-miR-1	dbDEMC, HMDD	49	hsa-let-7b-5p	dbDEMC, HMDD
25	hsa-miR-31-5p	dbDEMC, HMDD	50	hsa-miR-200b-3p	dbDEMC, HMDD

Table 6

The top 50 predicted miRNAs associated with kidney neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC	26	hsa-miR-96-5p	dbDEMC
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-15a-5p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-24-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC	29	hsa-miR-320a	dbDEMC
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-125b-5p	dbDEMC
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-19a-3p	dbDEMC
7	hsa-miR-122-5p	dbDEMC	32	hsa-miR-20a-5p	dbDEMC
8	hsa-miR-221-3p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC
9	hsa-miR-34a-5p	dbDEMC, HMDD	34	hsa-miR-33a-5p	dbDEMC
10	hsa-miR-206	dbDEMC	35	hsa-miR-486-5p	dbDEMC
11	hsa-miR-1	dbDEMC	36	hsa-miR-16-5p	dbDEMC
12	hsa-miR-222-3p	dbDEMC	37	hsa-miR-192-5p	dbDEMC, HMDD
13	hsa-miR-145-5p	dbDEMC	38	hsa-miR-29a-3p	dbDEMC
14	hsa-miR-142-3p	dbDEMC	39	hsa-miR-34c-5p	dbDEMC
15	hsa-miR-183-5p	dbDEMC, HMDD	40	hsa-miR-124-3p	dbDEMC
16	hsa-miR-132-3p	dbDEMC, HMDD	41	hsa-miR-194-5p	dbDEMC
17	hsa-miR-143-3p	dbDEMC	42	hsa-miR-15b-5p	dbDEMC
18	hsa-miR-9-5p	dbDEMC	43	hsa-miR-144-3p	dbDEMC
19	hsa-miR-214-3p	dbDEMC	44	hsa-miR-205-5p	dbDEMC
20	hsa-miR-133a-3p	dbDEMC	45	hsa-let-7b-5p	dbDEMC
21	hsa-miR-182-5p	dbDEMC	46	hsa-miR-30a-5p	dbDEMC
22	hsa-miR-146b-5p	dbDEMC	47	hsa-miR-200c-3p	dbDEMC
23	hsa-miR-150-5p	dbDEMC	48	hsa-miR-204-5p	dbDEMC
24	hsa-miR-31-5p	dbDEMC	49	hsa-miR-22-3p	dbDEMC
25	hsa-miR-17-5p	dbDEMC, HMDD	50	hsa-miR-27a-3p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC	26	hsa-miR-96-5p	dbDEMC
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-15a-5p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-24-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC	29	hsa-miR-320a	dbDEMC
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-125b-5p	dbDEMC
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-19a-3p	dbDEMC
7	hsa-miR-122-5p	dbDEMC	32	hsa-miR-20a-5p	dbDEMC
8	hsa-miR-221-3p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC
9	hsa-miR-34a-5p	dbDEMC, HMDD	34	hsa-miR-33a-5p	dbDEMC
10	hsa-miR-206	dbDEMC	35	hsa-miR-486-5p	dbDEMC
11	hsa-miR-1	dbDEMC	36	hsa-miR-16-5p	dbDEMC
12	hsa-miR-222-3p	dbDEMC	37	hsa-miR-192-5p	dbDEMC, HMDD
13	hsa-miR-145-5p	dbDEMC	38	hsa-miR-29a-3p	dbDEMC
14	hsa-miR-142-3p	dbDEMC	39	hsa-miR-34c-5p	dbDEMC
15	hsa-miR-183-5p	dbDEMC, HMDD	40	hsa-miR-124-3p	dbDEMC
16	hsa-miR-132-3p	dbDEMC, HMDD	41	hsa-miR-194-5p	dbDEMC
17	hsa-miR-143-3p	dbDEMC	42	hsa-miR-15b-5p	dbDEMC
18	hsa-miR-9-5p	dbDEMC	43	hsa-miR-144-3p	dbDEMC
19	hsa-miR-214-3p	dbDEMC	44	hsa-miR-205-5p	dbDEMC
20	hsa-miR-133a-3p	dbDEMC	45	hsa-let-7b-5p	dbDEMC
21	hsa-miR-182-5p	dbDEMC	46	hsa-miR-30a-5p	dbDEMC
22	hsa-miR-146b-5p	dbDEMC	47	hsa-miR-200c-3p	dbDEMC
23	hsa-miR-150-5p	dbDEMC	48	hsa-miR-204-5p	dbDEMC
24	hsa-miR-31-5p	dbDEMC	49	hsa-miR-22-3p	dbDEMC
25	hsa-miR-17-5p	dbDEMC, HMDD	50	hsa-miR-27a-3p	dbDEMC, HMDD

Table 6

The top 50 predicted miRNAs associated with kidney neoplasms

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC	26	hsa-miR-96-5p	dbDEMC
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-15a-5p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-24-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC	29	hsa-miR-320a	dbDEMC
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-125b-5p	dbDEMC
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-19a-3p	dbDEMC
7	hsa-miR-122-5p	dbDEMC	32	hsa-miR-20a-5p	dbDEMC
8	hsa-miR-221-3p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC
9	hsa-miR-34a-5p	dbDEMC, HMDD	34	hsa-miR-33a-5p	dbDEMC
10	hsa-miR-206	dbDEMC	35	hsa-miR-486-5p	dbDEMC
11	hsa-miR-1	dbDEMC	36	hsa-miR-16-5p	dbDEMC
12	hsa-miR-222-3p	dbDEMC	37	hsa-miR-192-5p	dbDEMC, HMDD
13	hsa-miR-145-5p	dbDEMC	38	hsa-miR-29a-3p	dbDEMC
14	hsa-miR-142-3p	dbDEMC	39	hsa-miR-34c-5p	dbDEMC
15	hsa-miR-183-5p	dbDEMC, HMDD	40	hsa-miR-124-3p	dbDEMC
16	hsa-miR-132-3p	dbDEMC, HMDD	41	hsa-miR-194-5p	dbDEMC
17	hsa-miR-143-3p	dbDEMC	42	hsa-miR-15b-5p	dbDEMC
18	hsa-miR-9-5p	dbDEMC	43	hsa-miR-144-3p	dbDEMC
19	hsa-miR-214-3p	dbDEMC	44	hsa-miR-205-5p	dbDEMC
20	hsa-miR-133a-3p	dbDEMC	45	hsa-let-7b-5p	dbDEMC
21	hsa-miR-182-5p	dbDEMC	46	hsa-miR-30a-5p	dbDEMC
22	hsa-miR-146b-5p	dbDEMC	47	hsa-miR-200c-3p	dbDEMC
23	hsa-miR-150-5p	dbDEMC	48	hsa-miR-204-5p	dbDEMC
24	hsa-miR-31-5p	dbDEMC	49	hsa-miR-22-3p	dbDEMC
25	hsa-miR-17-5p	dbDEMC, HMDD	50	hsa-miR-27a-3p	dbDEMC, HMDD

Ranking	miRNA	Evidence	Ranking	miRNA	Evidence
1	hsa-miR-146a-5p	dbDEMC	26	hsa-miR-96-5p	dbDEMC
2	hsa-miR-21-5p	dbDEMC, HMDD	27	hsa-miR-15a-5p	dbDEMC, HMDD
3	hsa-miR-155-5p	dbDEMC, HMDD	28	hsa-miR-24-3p	dbDEMC
4	hsa-miR-223-3p	dbDEMC	29	hsa-miR-320a	dbDEMC
5	hsa-miR-126-3p	dbDEMC, HMDD	30	hsa-miR-125b-5p	dbDEMC
6	hsa-miR-210-3p	dbDEMC, HMDD	31	hsa-miR-19a-3p	dbDEMC
7	hsa-miR-122-5p	dbDEMC	32	hsa-miR-20a-5p	dbDEMC
8	hsa-miR-221-3p	dbDEMC	33	hsa-miR-92a-3p	dbDEMC
9	hsa-miR-34a-5p	dbDEMC, HMDD	34	hsa-miR-33a-5p	dbDEMC
10	hsa-miR-206	dbDEMC	35	hsa-miR-486-5p	dbDEMC
11	hsa-miR-1	dbDEMC	36	hsa-miR-16-5p	dbDEMC
12	hsa-miR-222-3p	dbDEMC	37	hsa-miR-192-5p	dbDEMC, HMDD
13	hsa-miR-145-5p	dbDEMC	38	hsa-miR-29a-3p	dbDEMC
14	hsa-miR-142-3p	dbDEMC	39	hsa-miR-34c-5p	dbDEMC
15	hsa-miR-183-5p	dbDEMC, HMDD	40	hsa-miR-124-3p	dbDEMC
16	hsa-miR-132-3p	dbDEMC, HMDD	41	hsa-miR-194-5p	dbDEMC
17	hsa-miR-143-3p	dbDEMC	42	hsa-miR-15b-5p	dbDEMC
18	hsa-miR-9-5p	dbDEMC	43	hsa-miR-144-3p	dbDEMC
19	hsa-miR-214-3p	dbDEMC	44	hsa-miR-205-5p	dbDEMC
20	hsa-miR-133a-3p	dbDEMC	45	hsa-let-7b-5p	dbDEMC
21	hsa-miR-182-5p	dbDEMC	46	hsa-miR-30a-5p	dbDEMC
22	hsa-miR-146b-5p	dbDEMC	47	hsa-miR-200c-3p	dbDEMC
23	hsa-miR-150-5p	dbDEMC	48	hsa-miR-204-5p	dbDEMC
24	hsa-miR-31-5p	dbDEMC	49	hsa-miR-22-3p	dbDEMC
25	hsa-miR-17-5p	dbDEMC, HMDD	50	hsa-miR-27a-3p	dbDEMC, HMDD

Conclusion

In this study, we develop an end-to-end GCN-based computational approach MAGCN to predict novel MDAs. Different from previous research, our method MAGCN uses LMIs, instead of similarity measurements, to infer associations between miRNAs and diseases. We apply GCN with multichannel attention mechanism and a CNN combiner as encoders for feature learning. A bilinear decoder is used for association inference. Our method can predict not only MDAs but also LMIs. Extensive experiments including cross-validations and case studies demonstrate the effectiveness and superiority of our method.

It should be noted that the LMIs used in our study are limited and incomplete. The predicted results received by our method may therefore be biased. Integrating more experimentally validated LMIs would provide more reliable predictions. Meanwhile, the expression of lncRNAs and miRNAs are always tissue- or disease-specific. The lncRNA–miRNA associations may not be functionally activated under some conditions. Moreover, setting proper values to the hyperparameters in our method to obtain optimal prediction results is a challenging task. Besides pathogenic lncRNA–miRNA co-regulations in disease development, miRNAs have also been discovered to lead to translational inhibition or degradation of their target mRNAs. Incorporating more related biological information would further improve our understanding of the roles of miRNAs in the pathogenesis of human diseases, thus improving the accuracy of MDA predictions.

Key Points

We propose a GCN-based method MAGCN to predict novel MDAs, in which LMIs instead of similarity measurements are used as initial input features.
Using multichannel attention mechanism and CNN combiner, our method can learn complex relationships between graph nodes.
Comprehensive experiments, such as cross-validations and case studies, demonstrate the effectiveness of our method in detecting new MDAs.
Compared with existing well-known approaches, our method MAGCN shows improvement in prediction accuracy.

Data availability

The datasets and source codes used in this study are freely available at https://github.com/shine-lucky/ MAGCN.

Authors’ contribution

H.C. conceived and designed this study. W.W. implemented the experiments. W.W. and H.C. analyzed the results. W.W. and H.C. wrote the manuscript. Both authors read and approved the final manuscript.

Funding

National Natural Science Foundation of China (61862026).

Wengang Wang is a graduate student at School of Software, East China Jiaotong University. His research interest includes deep learning and bioinformatics.

Hailin Chen, PhD, is an associate professor at School of Software, East China Jiaotong University. His research interest includes data mining and bioinformatics.

References

1.

Bartel

DP

.

MicroRNAs: genomics, biogenesis, mechanism, and function

.

Cell

2004

;

116

:

281

–

97

.

2.

Ambros

V

.

The functions of animal microRNAs

.

Nature

2004

;

431

:

350

–

5

.

3.

Carthew

RW

,

Sontheimer

EJ

.

Origins and mechanisms of miRNAs and siRNAs

.

Cell

2009

;

136

:

642

–

55

.

4.

Croce

CM

,

Calin

GA

.

miRNAs, cancer, and stem cell division

.

Cell

2005

;

122

:

6

–

7

.

5.

Lu

M

,

Zhang

Q

,

Deng

M

, et al.

An analysis of human microRNA and disease associations

.

PLoS One

2008

;

3

:

e3420

.

6.

Machová Poláková

K

,

Lopotová

T

,

Klamová

H

, et al.

Expression patterns of microRNAs associated with CML phases and their disease related targets

.

Mol Cancer

2011

;

10

:

1

–

13

.

7.

Le

H-B

,

Zhu

W-Y

,

Chen

D-D

, et al.

Evaluation of dynamic change of serum miR-21 and miR-24 in pre-and post-operative lung carcinoma patients

.

Med Oncol

2012

;

29

:

3190

–

7

.

8.

Pescador

N

,

Pérez-Barba

M

,

Ibarra

JM

, et al.

Serum circulating microRNA profiling for identification of potential type 2 diabetes and obesity biomarkers

.

PLoS One

2013

;

8

:

e77251

.

9.

Chen

H

,

Zhang

Z

.

Similarity-based methods for potential human microRNA-disease association prediction

.

BMC Med Genomics

2013

;

6

:

1

–

9

.

PubMed

10.

Chen

X

,

Yan

G-Y

.

Semi-supervised learning for potential human microRNA-disease associations inference

.

Sci Rep

2014

;

4

:

1

–

10

.

11.

Xuan

P

,

Han

K

,

Guo

Y

, et al.

Prediction of potential disease-associated microRNAs based on random walk

.

Bioinformatics

2015

;

31

:

1805

–

15

.

12.

Luo

J

,

Ding

P

,

Liang

C

, et al.

Collective prediction of disease-associated miRNAs based on transduction learning

.

IEEE/ACM Trans Comput Biol Bioinform

2017

;

14

:

1468

–

75

.

13.

Chen

X

,

Niu

YW

,

Wang

GH

, et al.

HAMDA: hybrid approach for MiRNA-disease association prediction

.

J Biomed Inform

2017

;

76

:

50

–

8

.

14.

Chen

X

,

Wang

L

,

Qu

J

, et al.

Predicting miRNA-disease association based on inductive matrix completion

.

Bioinformatics

2018

;

34

:

4256

–

65

.

PubMed

15.

Zeng

X

,

Liu

L

,

Lu

L

, et al.

Prediction of potential disease-associated microRNAs using structural perturbation method

.

Bioinformatics

2018

;

34

:

2425

–

32

.

16.

Chen

X

,

Yin

J

,

Qu

J

, et al.

MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction

.

PLoS Comput Biol

2018

;

14

:

e1006418

.

17.

Jiang

L

,

Ding

Y

,

Tang

J

, et al.

MDA-SKF: similarity kernel fusion for accurately discovering miRNA-disease association

.

Front Genet

2018

;

9

:

618

.

18.

Zhang

W

,

Li

Z

,

Guo

W

, et al.

A fast linear neighborhood similarity-based network link inference method to predict microRNA-disease associations

.

IEEE/ACM Trans Comput Biol Bioinform

2019

;

18

:

405

–

15

.

Crossref

19.

Xu

J

,

Zhu

W

,

Cai

L

, et al.

LRMCMDA: predicting miRNA-disease association by integrating low-rank matrix completion with miRNA and disease similarity information

.

IEEE Access

2020

;

8

:

80728

–

38

.

Crossref

20.

Chen

X

,

Sun

L-G

,

Zhao

Y

.

NCMCMDA: miRNA–disease association prediction through neighborhood constraint matrix completion

.

Brief Bioinform

2021

;

22

:

485

–

96

.

21.

Chen

H

,

Guo

R

,

Li

G

, et al.

Comparative analysis of similarity measurements in miRNAs with applications to miRNA-disease association predictions

.

BMC Bioinformatics

2020

;

21

:

176

.

22.

Chen

X

,

Huang

L

,

Xie

D

, et al.

EGBMMDA: extreme gradient boosting machine for MiRNA-disease association prediction

.

Cell Death Dis

2018

;

9

:

3

.

23.

Zeng

X

,

Wang

W

,

Deng

G

, et al.

Prediction of potential disease-associated microRNAs by using neural networks

.

Mol Ther Nucleic Acids

2019

;

16

:

566

–

75

.

24.

Chen

X

,

Zhu

CC

,

Yin

J

.

Ensemble of decision tree reveals potential miRNA-disease associations

.

PLoS Comput Biol

2019

;

15

:

e1007209

.

25.

Ji

BY

,

You

ZH

,

Cheng

L

, et al.

Predicting miRNA-disease association from heterogeneous information network with GraRep embedding model

.

Sci Rep

2020

;

10

:

6658

.

26.

Liu

B

,

Zhu

X

,

Zhang

L

, et al.

Combined embedding model for MiRNA-disease association prediction

.

BMC Bioinformatics

2021

;

22

:

161

.

27.

Liu

D

,

Huang

Y

,

Nie

W

, et al.

SMALF: miRNA-disease associations prediction based on stacked autoencoder and XGBoost

.

BMC Bioinformatics

2021

;

22

:

219

.

28.

Tang

X

,

Luo

J

,

Shen

C

, et al.

Multi-view multichannel attention graph convolutional network for miRNA–disease association prediction

.

Brief Bioinform

2021

;

22

:

bbab174

.

29.

Liu

W

,

Lin

H

,

Huang

L

, et al.

Identification of miRNA-disease associations via deep forest ensemble learning based on autoencoder

.

Brief Bioinform

2022

;

23

:bbac104. https://doi.org/10.1093/bib/bbac104.

30.

Yan

C

,

Duan

G

,

Li

N

, et al.

PDMDA: predicting deep-level miRNA–disease associations with graph neural networks and sequence features

.

Bioinformatics

2022

;

38

:

2226

–

34

.

Crossref

31.

Wang

W

,

Chen

H

.

Predicting miRNA-disease associations based on graph attention networks and dual Laplacian regularized least squares

.

Brief Bioinform

2022

;

23

:bbac292.