Prediction of potential miRNA–disease associations based on stacked autoencoder

Table 1

Performance comparison between SAEMDA and other nine models under 5-fold cross-validation

Prediction model	AUC	Standard deviation
SAEMDA	0.9102	0.0029
PBMDA	0.9172	0.0007
EGBMMDA	0.9048	0.0012
MDHGI	0.8794	0.0021
TLHNMDA	0.8795	0.0010
MCMDA	0.8767	0.0011
MaxFlow	0.8579	0.001
RLSMDA	0.8569	0.0020
HDMP	0.8342	0.0010
WBSMDA	0.8185	0.0009
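The evaluation protocol behind Table 1 can be illustrated roughly as follows. This is a minimal sketch, not the authors' code: the function names `five_fold_split` and `summarize` are hypothetical, the toy pairs stand in for the known associations, and the use of the sample standard deviation is an assumption. Scoring a test fold with a model is omitted.

```python
import random
import statistics

def five_fold_split(known_pairs, n_folds=5, seed=0):
    """Shuffle known (miRNA, disease) pairs and deal them into n_folds folds."""
    pairs = list(known_pairs)
    random.Random(seed).shuffle(pairs)
    return [pairs[k::n_folds] for k in range(n_folds)]

def summarize(aucs):
    """Mean and sample standard deviation of AUCs from repeated runs."""
    return statistics.mean(aucs), statistics.stdev(aucs)

# 30 toy pairs standing in for the known associations (hypothetical data).
pairs = [(i, j) for i in range(6) for j in range(5)]
folds = five_fold_split(pairs)

for test_fold in folds:
    # The other four folds form the training set for this round.
    train = [p for f in folds if f is not test_fold for p in f]
    # A model would be trained on `train` and evaluated on `test_fold` here.
```

Repeating the split with different seeds and passing the resulting AUCs to `summarize` would give a mean and standard deviation of the kind reported in the table.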


Case studies

We carried out three types of case studies to further illustrate the predictive power of SAEMDA. In the first case study, we obtained known associations from the HMDD v2.0 database and then verified the predicted results against the dbDEMC [43] and miR2Disease [44] databases. We chose BN, the most common malignant disease in women, as the investigated disease. BN begins as a local disease and can spread to lymph nodes and other organs [45]. Clinical breast examination is one of the main methods for detecting BN, and early diagnosis can greatly improve the cure rate [46]. Studies have found that most BN patients have abnormal miRNA expression [47], implying that miRNAs could be potential biomarkers for the diagnosis of BN. For example, Heneghan et al. [48] found that the expression of miR-195 was significantly increased in BN patients. We used SAEMDA to reveal more BN-related miRNAs. As a result, 8 of the top 10 and 41 of the top 50 candidate miRNAs were confirmed by dbDEMC or miR2Disease (Table 2).

Table 2

Validation of the top 50 miRNAs predicted to be associated with BN by SAEMDA based on the known associations in HMDD v2.0. The first column records the top 1–25 predicted miRNAs and the third column records the 26–50 predicted miRNAs

miRNA	Evidence	miRNA	Evidence
hsa-mir-196a	dbDEMC; miR2Disease	hsa-mir-210	dbDEMC; miR2Disease
hsa-mir-1246	unconfirmed	hsa-mir-101	dbDEMC; miR2Disease
hsa-mir-198	dbDEMC	hsa-mir-125a	dbDEMC; miR2Disease
hsa-mir-29a	dbDEMC	hsa-mir-99b	dbDEMC
hsa-mir-205	dbDEMC; miR2Disease	hsa-let-7f	dbDEMC; miR2Disease
hsa-mir-200b	dbDEMC; miR2Disease	hsa-mir-590	dbDEMC
hsa-mir-200c	dbDEMC; miR2Disease	hsa-mir-7	dbDEMC; miR2Disease
hsa-mir-635	unconfirmed	hsa-mir-144	dbDEMC
hsa-mir-27b	dbDEMC	hsa-mir-499a	unconfirmed
hsa-mir-143	dbDEMC; miR2Disease	hsa-mir-141	dbDEMC; miR2Disease
hsa-mir-103a	unconfirmed	hsa-mir-195	dbDEMC; miR2Disease
hsa-mir-19b	dbDEMC	hsa-mir-191	dbDEMC; miR2Disease
hsa-mir-93	dbDEMC	hsa-mir-204	dbDEMC; miR2Disease
hsa-mir-363	dbDEMC	hsa-mir-200a	dbDEMC; miR2Disease
hsa-mir-133a	dbDEMC	hsa-mir-650	dbDEMC
hsa-let-7a	dbDEMC; miR2Disease	hsa-mir-10b	dbDEMC; miR2Disease
hsa-mir-124	dbDEMC	hsa-mir-125b	miR2Disease
hsa-mir-29b	dbDEMC; miR2Disease	hsa-mir-30e	unconfirmed
hsa-mir-30a	miR2Disease	hsa-mir-449a	unconfirmed
hsa-mir-20a	miR2Disease	hsa-mir-1972	unconfirmed
hsa-mir-1273a	unconfirmed	hsa-mir-23b	dbDEMC
hsa-mir-433	dbDEMC	hsa-mir-34b	dbDEMC
hsa-mir-31	dbDEMC; miR2Disease	hsa-mir-95	dbDEMC
hsa-mir-221	dbDEMC; miR2Disease	hsa-mir-1302	unconfirmed
hsa-mir-223	dbDEMC	hsa-mir-505	dbDEMC
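The confirmation counts quoted for this case study can be reproduced from the evidence column of Table 2. The sketch below uses the ten highest-ranked rows of the table; the helper name `confirmed_in_top_k` is hypothetical.

```python
def confirmed_in_top_k(evidence, k):
    """Count predictions among the first k whose evidence is not 'unconfirmed'."""
    return sum(1 for e in evidence[:k] if e != "unconfirmed")

# Evidence for the ten highest-ranked BN predictions, copied from Table 2.
top10_evidence = [
    "dbDEMC; miR2Disease",  # hsa-mir-196a
    "unconfirmed",          # hsa-mir-1246
    "dbDEMC",               # hsa-mir-198
    "dbDEMC",               # hsa-mir-29a
    "dbDEMC; miR2Disease",  # hsa-mir-205
    "dbDEMC; miR2Disease",  # hsa-mir-200b
    "dbDEMC; miR2Disease",  # hsa-mir-200c
    "unconfirmed",          # hsa-mir-635
    "dbDEMC",               # hsa-mir-27b
    "dbDEMC; miR2Disease",  # hsa-mir-143
]

n_confirmed = confirmed_in_top_k(top10_evidence, 10)
print(n_confirmed)  # 8, matching the "8 of the top 10" reported in the text
```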


In the second case study, we sought to verify the performance of SAEMDA when applied to a disease without any known associated miRNAs, taking LN as the investigated disease. The training data were again collected from the HMDD v2.0 database. We removed all association information for LN from the training data to simulate LN as a new disease. LN is one of the malignant tumors with the fastest increase in morbidity and mortality [49]. About 230 000 new cases of LN will be diagnosed in the United States in 2021 [49]. Although imaging diagnostic techniques have progressed greatly, no method yet significantly improves the early detection rate of LN, so most patients miss the optimal treatment period [50]. It is therefore very important to find an effective method of early screening and diagnosis for LN. Some studies have found that the occurrence of LN is closely related to miRNAs [21]. For example, the miR-17-92 cluster was found to be overexpressed in human LN [51]. In addition, the expression level of miR-224 is higher in non–small cell lung cancer (NSCLC) than in normal lung tissue, and it can promote tumor progression in NSCLC [52]. We trained SAEMDA to infer potential LN-related miRNAs. The validation results showed that all of the top 50 predicted miRNAs were confirmed by at least one of HMDD v2.0, dbDEMC and miR2Disease (Table 3).
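Simulating a new disease amounts to clearing one column of the miRNA–disease adjacency matrix before training. The following is a minimal sketch under the assumption that rows index miRNAs and columns index diseases; the function names, the toy matrix and the score vector are hypothetical.

```python
import numpy as np

def mask_disease(A, disease_idx):
    """Return a copy of A with every known association of one disease removed."""
    A_masked = A.copy()
    A_masked[:, disease_idx] = 0
    return A_masked

def rank_candidates(scores, k):
    """Indices of the k highest-scoring miRNAs, best first."""
    return np.argsort(-scores, kind="stable")[:k]

# Toy adjacency matrix: 4 miRNAs (rows) x 3 diseases (columns).
A = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0],
              [0, 0, 1]])

A_new = mask_disease(A, disease_idx=2)       # disease 2 now looks "new"
scores = np.array([0.2, 0.9, 0.1, 0.7])      # hypothetical model output for disease 2
top2 = rank_candidates(scores, k=2)
```

Because the removed associations are held out, they can afterwards serve as validation evidence, which is why HMDD appears in the evidence column for this case study.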

Table 3

Validation of the top 50 miRNAs predicted to be associated with LN by SAEMDA based on the known associations in HMDD v2.0. Here, LN was treated as a new disease by removing all of its association information from HMDD v2.0. The first column records the top 1–25 predicted miRNAs and the third column records the 26–50 predicted miRNAs

miRNA	Evidence	miRNA	Evidence
hsa-mir-21	dbDEMC; miR2Disease; HMDD	hsa-mir-223	HMDD
hsa-mir-155	dbDEMC; miR2Disease; HMDD	hsa-mir-146b	miR2Disease; HMDD
hsa-mir-92a	HMDD	hsa-mir-19a	dbDEMC; miR2Disease; HMDD
hsa-mir-30a	miR2Disease; HMDD	hsa-mir-24	miR2Disease; HMDD
hsa-mir-19b	dbDEMC; HMDD	hsa-mir-125b	miR2Disease; HMDD
hsa-mir-195	dbDEMC; miR2Disease	hsa-mir-181a	dbDEMC; HMDD
hsa-mir-17	miR2Disease; HMDD	hsa-mir-125a	dbDEMC; miR2Disease; HMDD
hsa-mir-29c	dbDEMC; miR2Disease; HMDD	hsa-mir-34a	dbDEMC; HMDD
hsa-mir-210	dbDEMC; miR2Disease; HMDD	hsa-mir-145	dbDEMC; miR2Disease; HMDD
hsa-mir-29a	dbDEMC; miR2Disease; HMDD	hsa-let-7c	dbDEMC; miR2Disease; HMDD
hsa-mir-16	dbDEMC; miR2Disease	hsa-mir-27a	dbDEMC; HMDD
hsa-mir-126	dbDEMC; miR2Disease; HMDD	hsa-mir-15b	dbDEMC
hsa-mir-26a	dbDEMC; miR2Disease; HMDD	hsa-mir-1	dbDEMC; miR2Disease; HMDD
hsa-mir-142	HMDD	hsa-mir-199b	dbDEMC; miR2Disease; HMDD
hsa-mir-29b	dbDEMC; miR2Disease; HMDD	hsa-mir-9	miR2Disease; HMDD
hsa-mir-200c	dbDEMC; miR2Disease; HMDD	hsa-let-7e	miR2Disease; HMDD
hsa-mir-146a	dbDEMC; miR2Disease; HMDD	hsa-mir-22	miR2Disease; HMDD
hsa-mir-150	dbDEMC; miR2Disease; HMDD	hsa-let-7b	miR2Disease; HMDD
hsa-mir-7	miR2Disease; HMDD	hsa-mir-30e	miR2Disease; HMDD
hsa-mir-15a	dbDEMC	hsa-mir-148a	dbDEMC; HMDD
hsa-mir-106b	dbDEMC	hsa-let-7d	dbDEMC; miR2Disease; HMDD
hsa-mir-18a	dbDEMC; miR2Disease; HMDD	hsa-mir-221	dbDEMC; HMDD
hsa-mir-23b	dbDEMC	hsa-mir-192	dbDEMC; miR2Disease; HMDD
hsa-let-7a	dbDEMC; miR2Disease; HMDD	hsa-mir-196a	dbDEMC; HMDD
hsa-mir-20a	dbDEMC; miR2Disease; HMDD	hsa-mir-20b	dbDEMC


In the third case study, to demonstrate the generalization ability of SAEMDA on a different dataset, we obtained the training data from HMDD v1.0, which contains 1395 known associations between 271 miRNAs and 137 diseases. EN was selected for the case study. EN is one of the highest-risk cancers in the world, and its mortality rate ranks sixth among all cancers [53]. In recent years, the incidence of EN in Asia has gradually increased [54]. Although chemotherapy, radiotherapy and other technologies are developing rapidly, they cannot provide satisfactory treatment for advanced EN patients [54]. Identifying biomarkers of EN for early diagnosis would therefore have a significant impact on its diagnosis and treatment. Current studies show that the occurrence, development and prognosis of EN are related to the abnormal regulation of miRNAs [55]. For example, miR-377 can suppress the initiation and progression of EN by inhibiting CD133 and VEGF [56]. In addition, miR-296 was overexpressed in esophageal squamous cell cancer tissues, and downregulation of miR-296 can suppress the growth of EN cells [57]. Here, we employed SAEMDA to predict EN-associated miRNAs based on the known associations in HMDD v1.0. As a result, 45 of the top 50 predicted miRNAs were verified by HMDD v2.0, dbDEMC or miR2Disease (Table 4).

Table 4

Validation of the top 50 miRNAs predicted to be associated with EN by SAEMDA based on the known associations in HMDD v1.0. The first column records the top 1–25 predicted miRNAs and the third column records the 26–50 predicted miRNAs

miRNA	Evidence	miRNA	Evidence
hsa-mir-155	dbDEMC; HMDD	hsa-mir-208b	unconfirmed
hsa-mir-365	unconfirmed	hsa-mir-92b	dbDEMC
hsa-mir-448	dbDEMC	hsa-mir-200b	dbDEMC
hsa-mir-221	dbDEMC	hsa-let-7d	dbDEMC
hsa-mir-146a	dbDEMC; HMDD	hsa-let-7i	dbDEMC
hsa-let-7c	dbDEMC; HMDD	hsa-mir-29a	dbDEMC
hsa-mir-222	dbDEMC	hsa-mir-181b	dbDEMC
hsa-mir-20a	dbDEMC; HMDD	hsa-mir-181a	dbDEMC
hsa-mir-92a	HMDD	hsa-let-7g	dbDEMC
hsa-mir-514	unconfirmed	hsa-mir-125b	dbDEMC
hsa-mir-338	dbDEMC	hsa-mir-210	dbDEMC; HMDD
hsa-mir-137	dbDEMC	hsa-mir-141	dbDEMC; HMDD
hsa-mir-18a	dbDEMC	hsa-mir-300	unconfirmed
hsa-mir-145	dbDEMC; HMDD	hsa-mir-383	dbDEMC
hsa-mir-423	dbDEMC	hsa-mir-515	unconfirmed
hsa-mir-19a	dbDEMC; HMDD	hsa-mir-602	dbDEMC
hsa-mir-29c	dbDEMC; HMDD	hsa-mir-196b	dbDEMC; HMDD
hsa-mir-199b	dbDEMC	hsa-mir-135b	dbDEMC; HMDD
hsa-mir-23b	dbDEMC	hsa-mir-206	dbDEMC
hsa-let-7b	dbDEMC; HMDD	hsa-mir-127	dbDEMC
hsa-mir-520b	dbDEMC	hsa-mir-98	dbDEMC; HMDD
hsa-mir-335	dbDEMC	hsa-mir-9	dbDEMC
hsa-mir-330	dbDEMC	hsa-mir-373	dbDEMC; miR2Disease
hsa-mir-223	dbDEMC; miR2Disease; HMDD	hsa-mir-132	dbDEMC
hsa-mir-34a	dbDEMC; HMDD	hsa-mir-134	dbDEMC


Discussion

Predicting potential miRNA–disease associations enables researchers to better understand disease mechanisms and promotes the diagnosis, treatment and prognosis of complex diseases. In this study, we developed SAEMDA, which can serve as an effective supplement to traditional biological experiments. In SAEMDA, all miRNA–disease samples were used to pretrain an SAE. The SAE was then fine-tuned with the positive samples and an equal number of negative samples. SAEMDA obtained better performance than other models in three types of cross validation. We attribute this mainly to the fact that it makes full use of the information in all unlabeled samples during training. In addition, the results of three kinds of case studies further illustrated the reliable prediction performance of SAEMDA. Beyond miRNA–disease association prediction, bioinformatics poses many other important link prediction problems, such as lncRNA–disease association prediction [58], circular RNA (circRNA)–disease association prediction [59] and protein–protein interaction prediction [60]. Given its good performance on miRNA–disease association prediction, the SAEMDA framework could be applied to these link prediction problems as well.

The reliable performance of SAEMDA can be attributed to the following aspects. First, the data used in our study contain 189 585 miRNA–disease pairs for 495 miRNAs and 383 diseases, with only 5430 known associations. SAEMDA is especially suitable for datasets that combine a large amount of unlabeled data with a small amount of labeled data, because it combines unsupervised pretraining with supervised fine-tuning. Pretraining enabled the model to learn the features of all miRNA–disease pairs, overcoming the limitation that traditional supervised learning models can be trained only on labeled samples. Fine-tuning then enabled the model to learn the label information of the small amount of labeled data for further performance improvement. Second, SAEMDA integrated diverse similarity networks so that the features could better capture the information of all miRNA–disease pairs. Finally, we selected the Adam optimizer for training SAEMDA, as it is more efficient than the traditional stochastic gradient descent (SGD) optimizer.
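For readers unfamiliar with Adam, the update rule can be sketched in a few lines. This is an illustration only: the learning rate default of 0.0001 is the value reported in the Methods, whereas `beta1`, `beta2` and `eps` are the commonly used Adam defaults and are assumptions here, as is the toy quadratic objective.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = sum(theta**2), whose gradient is 2 * theta.
theta0 = np.array([1.0, -2.0])
theta = theta0.copy()
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
```

Unlike plain SGD, the step for each parameter is scaled by a running estimate of its gradient magnitude, so the first update moves every coordinate by roughly `lr` regardless of how large its gradient is.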

However, SAEMDA still has some limitations. First, the hyperparameters of the neural network (such as the number of hidden layers and the number of neurons per layer) were not well determined. Second, SAEMDA had a larger standard deviation than the comparison models over 100 repetitions of 5-fold cross validation. SAEMDA was therefore slightly inferior to the other models in terms of stability, which is a common problem in deep learning. Third, fine-tuning requires both positive and negative samples, but randomly selecting unlabeled samples as negatives introduces inaccurate information. Finally, there is room for improvement in concatenating disease and miRNA similarities as the features of a disease–miRNA pair. How to construct and extract reliable features of miRNA–disease pairs is therefore a future research direction for prediction method design. It is also necessary to design better methods of negative sample selection; clustering algorithms could be considered for this purpose [61–63]. In addition, designing effective methods that introduce other biological information to help predict potential miRNA–disease associations may be an important direction.

Materials and methods

Materials

First, the human miRNA–disease associations were obtained from the HMDD v2.0 database [42]. Specifically, there were 495 miRNAs, 383 diseases and 5430 experimentally verified miRNA–disease associations. We used nd and nm to represent the number of diseases and miRNAs, respectively. The adjacency matrix A of size nm×nd was used to represent all miRNA–disease pairs. A(i,j) is equal to 1 if miRNA m(i) is related to disease d(j); otherwise, it is 0. The miRNA functional similarity scores were calculated in a previous study [64] and can be downloaded from http://www.cuilab.cn/files/images/cuilab/misim.zip. The matrix FS was used to denote the miRNA functional similarity matrix. In addition, we described the relationships between diseases through a Directed Acyclic Graph (DAG) and used two different methods to calculate disease semantic similarity according to a previous study [32]. Based on the assumption that the larger the shared part of the DAGs of two diseases, the greater their semantic similarity, we calculated the first type of disease semantic similarity matrix SS1 [32]. Because different disease terms in the same layer of a DAG should contribute differently to the semantic value of the investigated disease, we redefined the semantic contribution of each disease term and calculated the second type of disease semantic similarity matrix SS2 [32]. Furthermore, based on the assumption that similar diseases (miRNAs) have similar patterns of interaction and noninteraction with miRNAs (diseases), we calculated the Gaussian interaction profile kernel similarity matrix KD (KM) of diseases (miRNAs) according to a previous method [65]. Note that in each round of LOOCV and 5-fold cross validation, KD and KM were recalculated from all known association information except the test samples.
Finally, we combined the Gaussian interaction profile kernel similarity of miRNAs with the miRNA functional similarity to obtain the integrated miRNA similarity matrix SM according to the method in a previous study [22], as follows:

$$\begin{equation} \mathrm{SM}\left(m(i),m(j)\right)=\begin{cases}\mathrm{FS}\left(m(i),m(j)\right), & m(i)\text{ and }m(j)\text{ have functional similarity}\\ \mathrm{KM}\left(m(i),m(j)\right), & \text{otherwise}\end{cases} \end{equation}$$

(1)

Similarly, we also calculated the integrated disease similarity matrix SD by integrating Gaussian interaction profile kernel similarity of disease and two kinds of disease semantic similarity.

$$\begin{equation} \mathrm{SD}\left(d(i),d(j)\right)=\begin{cases}\dfrac{\mathrm{SS1}\left(d(i),d(j)\right)+\mathrm{SS2}\left(d(i),d(j)\right)}{2}, & d(i)\text{ and }d(j)\text{ have semantic similarity}\\ \mathrm{KD}\left(d(i),d(j)\right), & \text{otherwise}\end{cases} \end{equation}$$

(2)
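Equations (1) and (2) can be sketched with NumPy as below. Two assumptions are made that the text does not spell out: the Gaussian interaction profile kernel is written in its commonly used form, exp(−γ‖IP(i) − IP(j)‖²) with γ equal to 1 divided by the mean squared norm of the profiles, and a pair is taken to "have" functional or semantic similarity when the corresponding matrix entry is nonzero. The function names and toy matrices are hypothetical.

```python
import numpy as np

def gip_kernel(profiles):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = np.sum(profiles ** 2, axis=1)
    gamma = 1.0 / np.mean(sq_norms)          # bandwidth, assuming gamma' = 1
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def integrate(primary, kernel):
    """Use `primary` where it is nonzero, otherwise fall back to `kernel`."""
    return np.where(primary != 0, primary, kernel)

# Toy adjacency matrix: 4 miRNAs (rows) x 3 diseases (columns).
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 1]], dtype=float)
KM = gip_kernel(A)        # miRNA kernel from the rows of A
KD = gip_kernel(A.T)      # disease kernel from the columns of A

# Toy functional and semantic similarities (hypothetical values).
FS = np.zeros((4, 4))
FS[0, 1] = FS[1, 0] = 0.8
SS1 = np.zeros((3, 3))
SS2 = np.zeros((3, 3))
SS1[0, 2] = SS1[2, 0] = 0.6
SS2[0, 2] = SS2[2, 0] = 0.4

SM = integrate(FS, KM)                  # Eq. (1)
SD = integrate((SS1 + SS2) / 2, KD)     # Eq. (2)
```

During cross validation, `gip_kernel` would be called again on the adjacency matrix with the test associations removed, in line with the recalculation of KD and KM described above.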

Figure 2

Flowchart of SAEMDA to predict potential miRNA–disease associations.

SAEMDA

In this study, we proposed a new model named SAEMDA to predict potential miRNA–disease associations. The flowchart of SAEMDA is depicted in Figure 2. The first step of SAEMDA is data preparation, which is to denote the miRNA–disease pairs as feature vectors. As presented in previous sections, we constructed the adjacency matrix A of miRNA–disease pairs (nm × nd), the integrated miRNA similarity matrix SM (nm × nm) and the integrated disease similarity matrix SD (nd × nd). From them, nm and nd features were extracted for each miRNA and disease, respectively. Concatenating the feature vectors of the investigated disease and miRNA yielded nm + nd features for each miRNA–disease pair. Among all miRNA–disease pairs, a total of 5430 pairs were known associations and the remaining miRNA–disease pairs were unlabeled.
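The feature construction can be sketched as follows, assuming that the nm features of a miRNA are its row of SM and the nd features of a disease are its row of SD. The function names and the random toy matrices are hypothetical.

```python
import numpy as np

def pair_features(SM, SD, mirna_idx, disease_idx):
    """Concatenate a miRNA's similarity row with a disease's similarity row."""
    return np.concatenate([SM[mirna_idx], SD[disease_idx]])

def all_pair_features(SM, SD):
    """Feature matrix with one row per miRNA-disease pair, in miRNA-major order."""
    nm, nd = SM.shape[0], SD.shape[0]
    left = np.repeat(SM, nd, axis=0)     # each miRNA row repeated nd times
    right = np.tile(SD, (nm, 1))         # disease rows cycled nm times
    return np.hstack([left, right])

rng = np.random.default_rng(0)
SM = rng.random((4, 4))                  # toy integrated miRNA similarity
SD = rng.random((3, 3))                  # toy integrated disease similarity
X = all_pair_features(SM, SD)            # shape (nm * nd, nm + nd) = (12, 7)
```

With the dimensions reported in this study (nm = 495, nd = 383), the same construction gives 189 585 rows of 878 features each.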

The second step of SAEMDA is the unsupervised pretraining of SAE based on all miRNA–disease pairs. The deep learning model of SAE can be constructed by stacking several autoencoders (AEs) [66]. An AE is composed of an encoder and a decoder. The encoder learns new representation by mapping input features from the input layer to the hidden layer, while the decoder reconstructs the original inputs from the hidden layer to the output layer. In addition, the input layer and the output layer have the same number of neurons. The AE can reduce the dimensionality of the original data. After inputting the feature vector X of training sample to AE, the representation of the hidden layer was defined as follows:

$$\begin{equation} Y=\sigma \left( WX+b\right) \end{equation}$$

(3)

where |$\sigma$|, W and b represent the activation function (tanh in our study), the weight matrix and the bias vector of the encoder, respectively. Then, the output |${X}^{\prime}$| with the same shape as X was reconstructed from the hidden-layer representation as follows:

$$\begin{equation} {X}^{\prime}=\sigma \left({W}^{\prime}Y+{b}^{\prime}\right) \end{equation}$$

(4)

where |${W}^{\prime }$| and |${b}^{\prime }$| denote the weight matrix and the bias vector of the decoder. Next, the AE was trained with the Adam optimizer to minimize the reconstruction cost:

$$\begin{equation} L\left(X,{X}^{\prime}\right)={\left\Vert X-{X}^{\prime}\right\Vert}^2={\left\Vert X-\sigma \left({W}^{\prime}\left(\sigma \left( WX+b\right)\right)+{b}^{\prime}\right)\right\Vert}^2 \tag{5}\end{equation}$$
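Equations (3)–(5) can be written out directly for a single sample. The following NumPy sketch uses tanh as the activation, as stated above; the layer sizes and the random initial weights are toy assumptions chosen only to make the example runnable.

```python
import numpy as np

def encode(X, W, b):
    """Eq. (3): hidden representation Y = tanh(W X + b)."""
    return np.tanh(W @ X + b)

def decode(Y, W_dec, b_dec):
    """Eq. (4): reconstruction X' = tanh(W' Y + b')."""
    return np.tanh(W_dec @ Y + b_dec)

def reconstruction_loss(X, W, b, W_dec, b_dec):
    """Eq. (5): squared Euclidean distance between X and its reconstruction."""
    X_rec = decode(encode(X, W, b), W_dec, b_dec)
    return float(np.sum((X - X_rec) ** 2))

# Toy sizes (assumed): 7 input features compressed to 3 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden = 7, 3
X = rng.uniform(-1, 1, size=n_in)
W = rng.normal(scale=0.1, size=(n_hidden, n_in))
b = np.zeros(n_hidden)
W_dec = rng.normal(scale=0.1, size=(n_in, n_hidden))
b_dec = np.zeros(n_in)

loss = reconstruction_loss(X, W, b, W_dec, b_dec)
```

Because W maps n_in inputs to n_hidden units and W' maps them back, the reconstruction always has the same shape as X.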

In this study, SAE was constructed by stacking three AEs according to previous research [41]. The unsupervised pretraining of SAE was carried out as follows:

1. An AE was trained using the feature vectors of all miRNA–disease pairs.
2. The decoder layer was removed from the AE. Then, a new AE was constructed that takes the feature vectors generated by the trained AE as input.
3. The new AE was trained, while the weights and biases of the previously trained AEs remained unchanged.
4. Steps 2 and 3 were repeated until three AEs were stacked.
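The greedy layer-wise procedure above can be sketched in NumPy as follows. This is not the authors' implementation: plain full-batch gradient descent is swapped in for the Adam optimizer to keep the example short, samples are stored as rows (so the encoder computes tanh(XW + b)), and the function names, toy data and hidden sizes are assumptions.

```python
import numpy as np

def train_autoencoder(X, n_hidden, epochs=200, lr=0.05, seed=0):
    """Train one tanh AE on X (n_samples x n_features).

    Returns the encoder parameters (W, b) and the loss history.
    Plain gradient descent is used here instead of Adam.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, n_hidden))
    b = np.zeros(n_hidden)
    W_dec = rng.normal(scale=0.1, size=(n_hidden, d))
    b_dec = np.zeros(d)
    history = []
    for _ in range(epochs):
        H = np.tanh(X @ W + b)              # encoder
        R = np.tanh(H @ W_dec + b_dec)      # decoder
        history.append(float(np.sum((X - R) ** 2) / n))
        # Backpropagation of the mean squared reconstruction error.
        dZ2 = 2.0 * (R - X) / n * (1.0 - R ** 2)
        dZ1 = (dZ2 @ W_dec.T) * (1.0 - H ** 2)
        W_dec -= lr * (H.T @ dZ2)
        b_dec -= lr * dZ2.sum(axis=0)
        W -= lr * (X.T @ dZ1)
        b -= lr * dZ1.sum(axis=0)
    return W, b, history

def pretrain_stack(X, hidden_sizes):
    """Steps 1-4: train each AE on the codes produced by the frozen AEs below it."""
    layers, inputs = [], X
    for k, n_hidden in enumerate(hidden_sizes):
        W, b, _ = train_autoencoder(inputs, n_hidden, seed=k)
        layers.append((W, b))               # decoder is discarded, encoder is kept fixed
        inputs = np.tanh(inputs @ W + b)    # input of the next AE
    return layers

# Toy data (assumed): 60 samples with 12 features, three stacked AEs.
rng = np.random.default_rng(42)
X = rng.uniform(-0.5, 0.5, size=(60, 12))
layers = pretrain_stack(X, hidden_sizes=(8, 6, 4))
```

Each earlier encoder is only used to produce inputs for the next AE and is never updated again, which corresponds to keeping its weights and biases unchanged in step 3.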

After the unsupervised pretraining, we obtained the weight matrices W1, W2 and W3 as well as the bias vectors b1, b2 and b3 of the SAE. The third step of SAEMDA is the supervised fine-tuning of the SAE based on positive and negative samples. Here, the 5430 known miRNA–disease associations were taken as positive samples, and 5430 negative samples were randomly selected from the unlabeled miRNA–disease pairs. The fine-tuning process consisted of the following steps:

1. An output layer was added to the SAE obtained in the pretraining process. The weight matrix W4 and bias vector b4 between the output layer and the previous layer were randomly initialized.
2. The positive samples and the same number of selected negative samples were used to train the SAE.
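The construction of a balanced training set can be sketched as below, assuming that the known associations are the entries of A equal to 1 and that negatives are drawn uniformly without replacement from the entries equal to 0. The helper name `sample_balanced_pairs` and the toy adjacency matrix are hypothetical.

```python
import numpy as np

def sample_balanced_pairs(A, seed=0):
    """Return (miRNA, disease) index pairs and labels with equal class sizes.

    Positives: entries of A equal to 1.
    Negatives: the same number of entries drawn at random from the unlabeled (0) entries.
    """
    rng = np.random.default_rng(seed)
    pos = np.argwhere(A == 1)
    unlabeled = np.argwhere(A == 0)
    chosen = rng.choice(len(unlabeled), size=len(pos), replace=False)
    neg = unlabeled[chosen]
    pairs = np.vstack([pos, neg])
    labels = np.concatenate([np.ones(len(pos), dtype=int),
                             np.zeros(len(neg), dtype=int)])
    return pairs, labels

# Toy adjacency matrix (assumed): 6 miRNAs, 5 diseases, 4 known associations.
A = np.zeros((6, 5), dtype=int)
A[0, 1] = A[2, 3] = A[4, 0] = A[5, 4] = 1
pairs, labels = sample_balanced_pairs(A)
```

Randomly selected unlabeled pairs may include true but undiscovered associations, so the negative set is noisy by construction.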

Finally, the trained SAE can be used to predict potential miRNA–disease associations. It is worth noting that SAEMDA used the tanh activation function in each hidden layer and the softmax classifier in the output layer. Cross entropy was used as the loss function in the fine-tuning process and the Adam optimizer was used to optimize the SAE. In addition, we set the numbers of neurons in the hidden layers of the three AEs to 512, 256 and 128, respectively. After setting the hyperparameters of the model, we trained SAEMDA with a learning rate of 0.0001 to obtain the final miRNA–disease association scores.
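A forward pass through the fine-tuned network can be sketched as follows. The hidden sizes 512, 256 and 128, the tanh activations, the softmax output and the cross-entropy loss follow the description above; the two-unit output layer, the use of the second softmax output as the association score, the input width and the random weights are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, hidden_layers, W_out, b_out):
    """tanh hidden layers followed by a softmax output layer."""
    H = X
    for W, b in hidden_layers:
        H = np.tanh(H @ W + b)
    return softmax(H @ W_out + b_out)

def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

rng = np.random.default_rng(0)
n_in = 64                                  # hypothetical input width (nm + nd in the text)
sizes = [n_in, 512, 256, 128]              # hidden sizes as stated in the text
hidden_layers = [(rng.normal(scale=0.05, size=(a, c)), np.zeros(c))
                 for a, c in zip(sizes[:-1], sizes[1:])]
W_out = rng.normal(scale=0.05, size=(128, 2))   # plays the role of W4 (assumed two classes)
b_out = np.zeros(2)                             # plays the role of b4

X = rng.uniform(0, 1, size=(10, n_in))     # 10 toy miRNA-disease pairs
labels = np.array([1, 0] * 5)              # 1 = known association, 0 = sampled negative
probs = forward(X, hidden_layers, W_out, b_out)
scores = probs[:, 1]                       # assumed association score per pair
loss = cross_entropy(probs, labels)
```

In an actual run the hidden-layer weights would come from the pretraining step rather than from random initialization, and all layers would then be updated jointly during fine-tuning.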

Key Points

SAEMDA is especially suitable for datasets composed of a large amount of unlabeled data and a small amount of labeled data.
SAEMDA integrated diverse similarity networks. Therefore, the features could better capture the information of all miRNA–disease pairs.
We selected the Adam optimizer for the training process of SAEMDA, as it is more efficient than the traditional stochastic gradient descent (SGD) optimizer.
Leave-one-out cross validation and case studies were implemented to evaluate the prediction performance of SAEMDA.

Data availability

The Python code and data for SAEMDA are available at https://github.com/xpnbs/SAEMDA.

Funding

This work was supported by the Fundamental Research Funds for the Central Universities (2019ZDPY01).

Chun-Chun Wang is a PhD student at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include bioinformatics, complex network algorithms and machine learning.

Tian-Hao Li is a master’s student at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include bioinformatics and machine learning.

Li Huang is a PhD student at the Academy of Arts and Design, Tsinghua University. His research interests include bioinformatics, complex network algorithms and machine learning.

Xing Chen, PhD, is a professor at China University of Mining and Technology. He is the associate dean of the Artificial Intelligence Research Institute, China University of Mining and Technology. He is also the founding director of the Institute of Bioinformatics and of the Big Data Research Center, China University of Mining and Technology. His research interests include complex disease-related noncoding RNA biomarker prediction, computational models for drug discovery and early detection of human complex disease based on big data and artificial intelligence algorithms.

References

1. Ambros V. microRNAs: tiny regulators with great potential. Cell 2001;107:823–6.
2. Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 2004;116:281–97.
3. Xiao C, Calado DP, Galler G, et al. MiR-150 controls B cell differentiation by targeting the transcription factor c-Myb. Cell 2007;131:146–59.
4. Johnnidis JB, Harris MH, Wheeler RT, et al. Regulation of progenitor cell proliferation and granulocyte function by microRNA-223. Nature 2008;451:1125–9.
5. Kim JH, Woo HR, Kim J, et al. Trifurcate feed-forward regulation of age-dependent cell death involving miR164 in Arabidopsis. Science 2009;323:1053–7.
6. Mendell JT, Olson EN. MicroRNAs in stress signaling and human disease. Cell 2012;148:1172–87.
7. Lu J, Getz G, Miska EA, et al. MicroRNA expression profiles classify human cancers. Nature 2005;435:834–8.
8. Esquela-Kerscher A, Slack FJ. Oncomirs - microRNAs with a role in cancer. Nat Rev Cancer 2006;6:259–69.
9. Latronico MV, Catalucci D, Condorelli G. Emerging role of microRNAs in cardiovascular biology. Circ Res 2007;101:1225–36.
10. Krutzfeldt J, Stoffel M. MicroRNAs: a new class of regulatory genes affecting metabolism. Cell Metab 2006;4:9–12.
11. Barwari T, Joshi A, Mayr M. MicroRNAs in cardiovascular disease. J Am Coll Cardiol 2016;68:2577–84.
12. Szabo G, Bala S. MicroRNAs in liver disease. Nat Rev Gastroenterol Hepatol 2013;10:542–52.
13. He L, Thomson JM, Hemann MT, et al. A microRNA polycistron as a potential human oncogene. Nature 2005;435:828–33.
14. Shimono Y, Zabala M, Cho RW, et al. Downregulation of miRNA-200c links breast cancer stem cells with normal stem cells. Cell 2009;138:592–603.
15. Ng EK, Chong WW, Jin H, et al. Differential expression of microRNAs in plasma of patients with colorectal cancer: a potential marker for colorectal cancer screening. Gut 2009;58:1375–81.
16. Hu Z, Chen X, Zhao Y, et al. Serum microRNA signatures identified in a genome-wide serum microRNA expression profiling predict survival of non-small-cell lung cancer. J Clin Oncol 2010;28:1721–6.
17. Calin GA, Croce CM. MicroRNA signatures in human cancers. Nat Rev Cancer 2006;6:857–66.
18. Slack FJ, Weidhaas JB. MicroRNA in cancer prognosis. N Engl J Med 2008;359:2720–2.
19. Bouchie A. First microRNA mimic enters clinic. Nat Biotechnol 2013;31:577–7.
20. Jiang Q, Hao Y, Wang G, et al. Prioritization of disease microRNAs through a human phenome-microRNAome network. BMC Syst Biol 2010;4:S2.
21. Chen X, Xie D, Zhao Q, et al. MicroRNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2019;20:515–39.
22. Chen X, Yan CC, Zhang X, et al. WBSMDA: within and between score for MiRNA-disease association prediction. Sci Rep 2016;6:21106.
23. Mork S, Pletscher-Frankild S, Palleja Caro A, et al. Protein-driven inference of miRNA-disease associations. Bioinformatics 2014;30:392–7.
24. Chen X, Liu MX, Yan GY. RWRMDA: predicting novel human microRNA-disease associations. Mol Biosyst 2012;8:2792–8.
25. Shi H, Xu J, Zhang G, et al. Walking the interactome to identify human miRNA-disease associations through the functional link between miRNA targets and disease genes. BMC Syst Biol 2013;7:101.
26. Xuan P, Han K, Guo Y, et al. Prediction of potential disease-associated microRNAs based on random walk. Bioinformatics 2015;31:1805–15.
27. Yu H, Chen X, Lu L. Large-scale prediction of microRNA-disease associations by combinatorial prioritization algorithm. Sci Rep 2017;7:43792.
28. Chen X, Yan CC, Zhang X, et al. HGIMDA: heterogeneous graph inference for miRNA-disease association prediction. Oncotarget 2016;7:65257–69.
29. You ZH, Huang ZA, Zhu Z, et al. PBMDA: a novel and effective path-based computational model for miRNA-disease association prediction. PLoS Comput Biol 2017;13:e1005455.
30. Chen X, Qu J, Yin J. TLHNMDA: triple layer heterogeneous network based inference for MiRNA-disease association prediction. Front Genet 2018;9:234.
31. Chen X, Yin J, Qu J, et al. MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction. PLoS Comput Biol 2018;14:e1006418.
32. Xuan P, Han K, Guo M, et al. Prediction of microRNAs associated with human diseases based on weighted k most similar neighbors. PLoS One 2013;8:e70204.
33. Chen X, Yan GY. Semi-supervised learning for potential human microRNA-disease associations inference. Sci Rep 2014;4:5501.
34. Chen X, Yan CC, Zhang X, et al. RBMMMDA: predicting multiple types of disease-microRNA associations. Sci Rep 2015;5:13877.
35. Pasquier C, Gardes J. Prediction of miRNA-disease associations with a vector space model. Sci Rep 2016;6:27036.
36. Li JQ, Rong ZH, Chen X, et al. MCMDA: matrix completion for MiRNA-disease association prediction. Oncotarget 2017;8:21187–99.
37. Chen X, Wu QF, Yan GY. RKNNMDA: ranking-based KNN for MiRNA-disease association prediction. RNA Biol 2017;14:952–62.
38. Chen X, Huang L, Xie D, et al. EGBMMDA: extreme gradient boosting machine for MiRNA-disease association prediction. Cell Death Dis 2018;9:3.
39. Zhu C-C, Wang C-C, Zhao Y, et al. Identification of miRNA–disease associations via multiple information integration with Bayesian ranking. Brief Bioinform 2021;22:bbab302.
40. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44.
41. Bahi M, Batouche M. Deep semi-supervised learning for DTI prediction using large datasets and H2O-spark platform. In: 2018 International Conference on Intelligent Systems and Computer Vision (ISCV). Fez, Morocco, 2018, p. 1–7. IEEE, New York, NY, USA.
42. Li Y, Qiu C, Tu J, et al. HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucleic Acids Res 2014;42:D1070–4.
43. Yang Z, Ren F, Liu C, et al. dbDEMC: a database of differentially expressed miRNAs in human cancers. BMC Genomics 2010;11:S5.
44. Jiang Q, Wang Y, Hao Y, et al. miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 2009;37:D98–104.
45. Ma L. Determinants of breast cancer progression. Sci Transl Med 2014;6:243fs225.
46. Elmore JG, Armstrong K, Lehman CD, et al. Screening for breast cancer. JAMA 2005;293:1245–56.
47. Mulrane L, McGee SF, Gallagher WM, et al. miRNA dysregulation in breast cancer. Cancer Res 2013;73:6554–62.
48. Heneghan HM, Miller N, Lowery AJ, et al. Circulating microRNAs as novel minimally invasive biomarkers for breast cancer. Ann Surg 2010;251:499–505.
49. Siegel RL, Miller KD, Fuchs HE, et al. Cancer statistics, 2021. CA Cancer J Clin 2021;71:7–33.
50. Hirsch FR, Scagliotti GV, Mulshine JL, et al. Lung cancer: current therapies and new targeted treatments. Lancet 2017;389:299–311.
51. Hayashita Y, Osada H, Tatematsu Y, et al. A polycistronic microRNA cluster, miR-17-92, is overexpressed in human lung cancers and enhances cell proliferation. Cancer Res 2005;65:9628–32.
52. Cui R, Meng W, Sun HL, et al. MicroRNA-224 promotes tumor progression in nonsmall cell lung cancer. Proc Natl Acad Sci U S A 2015;112:E4288–97.
53. Pennathur A, Gibson MK, Jobe BA, et al. Oesophageal carcinoma. Lancet 2013;381:400–12.
54. El-Serag HB, Sweet S, Winchester CC, et al. Update on the epidemiology of gastro-oesophageal reflux disease: a systematic review. Gut 2014;63:871–80.
55. Sakai NS, Samia-Aly E, Barbera M, et al. A review of the current understanding and clinical utility of miRNAs in esophageal cancer. Semin Cancer Biol 2013;23:512–21.
56. Li B, Xu WW, Han L, et al. MicroRNA-377 suppresses initiation and progression of esophageal cancer by inhibiting CD133 and VEGF. Oncogene 2017;36:3986–4000.
57. Hong L, Han Y, Zhang H, et al. The prognostic and chemotherapeutic value of miR-296 in esophageal squamous cell carcinoma. Ann Surg 2010;251:1056–63.
58. Chen X, Yan CC, Zhang X, et al. Long non-coding RNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2016;18:558–76.
59. Wang C-C, Han C-D, Zhao Q, et al. Circular RNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2021;22:bbab286.
60. Hu L, Wang X, Huang Y-A, et al. A survey on computational models for predicting protein–protein interactions. Brief Bioinform 2021;22:bbab036.
61. Zhao Y, Chen X, Yin J. Adaptive boosting-based computational model for predicting potential miRNA-disease associations. Bioinformatics 2019;35:4730–8.
62. Hu L, Zhang J, Pan X, et al. HiSCF: leveraging higher-order structures for clustering analysis in biological networks. Bioinformatics 2020;37:542–50.
63. Hu L, Chan KCC, Yuan X, et al. A variational Bayesian framework for cluster analysis in a complex network. IEEE Trans Knowl Data Eng 2020;32:2115–28.