Abstract

Motivation

Identifying microRNAs that are associated with different diseases as biomarkers is a problem of great medical significance. Existing computational methods for uncovering such microRNA-disease associations (MDAs) are mostly developed under the assumption that similar microRNAs tend to associate with similar diseases. Since this assumption is not always valid, these methods may not be applicable to all kinds of MDAs. Considering that the relationship between long noncoding RNA (lncRNA) and different diseases and the co-regulation relationships between the biological functions of lncRNA and microRNA have been established, we propose here a multiview multitask method that makes use of known lncRNA–microRNA interactions to predict MDAs on a large scale. The investigation is performed in the absence of complete information about microRNAs and without any similarity measurement for them. To the best of our knowledge, this work represents the first attempt to discover MDAs based on lncRNA–microRNA interactions.

Results

In this paper, we propose a deep learning model called MVMTMDA that can create a multiview representation of microRNAs. The model is trained with an end-to-end multitask learning approach so that missing data in the side information can be determined automatically. Experimental results show that the proposed model yields an average area under the ROC curve of 0.8410+/−0.018, 0.8512+/−0.012 and 0.8521+/−0.008 in k-fold cross-validation with k set to 2, 5 and 10, respectively. In addition, we also propose a statistical approach to predicting lncRNA-disease associations based on known lncRNA–microRNA interactions and the MDAs discovered using MVMTMDA.

Availability

Python code and the datasets used in our studies are made available at https://github.com/yahuang1991polyu/MVMTMDA/.

Introduction

MicroRNAs and long noncoding RNAs (lncRNAs) have been found to be involved in transcriptional and post-transcriptional processes, forming a gene expression program that all eukaryotic cells rely on [1]. MicroRNAs are ~22 nt ncRNAs and they generally bind imperfectly to the three prime untranslated region (3'UTR) of the mRNA. In most cases, this can lead to translational inhibition or degradation of the target mRNA. Although much effort has focused on the functions and biogenesis of microRNAs, lncRNAs are gaining prominence as they make up the largest portion of the mammalian noncoding transcriptome. They have recently been found to serve as critical epigenetic regulators of gene expression [2]. Most diseases are associated with alteration of the transcriptome, and such an altered transcription pattern has recently been found to involve not only aberrantly expressed protein-coding RNAs but also dysregulated expression of microRNAs and lncRNAs. As a result, much effort is currently being made to characterize those lncRNAs and microRNAs that interfere with gene expression and signaling pathways at various stages of disease development.

Recently, there has been an increasing body of experimental evidence showing that, through a sophisticated and multilayered mode of regulation, noncoding RNA, including lncRNA and microRNA, can influence every aspect of normal tissue physiology [1]. In particular, the competitive endogenous RNA (ceRNA) hypothesis [3] has gained substantial attention as it unifies the hypotheses about the general mechanism of the intricate interplay among diverse RNA species. Specifically, it proposes that lncRNAs that share specific microRNA response elements (MREs) communicate with and co-regulate each other by competing for binding to the shared microRNAs. Considering that both lncRNA and microRNA are key regulators that control cellular processes, and that they interact with each other to fine-tune gene expression, knowledge of the mechanisms by which they cooperate is the first step toward understanding the functions that they exert in disease processes. Unfortunately, in spite of its importance, little is known about the co-regulation between lncRNA and microRNA in disease processes.

However, with the advent of high-throughput sequencing techniques, more and more lncRNAs and microRNAs have been identified to be involved in the development of diverse diseases, which include cancers, acting as oncogenes or as tumor suppressors [4]. Both lncRNA and microRNAs are now routinely used as biomarkers in disease diagnosis and treatment. Much progress has also been made toward their use as molecular targets for new drugs. This promising trend depends largely on our understanding of the associations between lncRNA or microRNA and a diversity of different diseases.

Recently, with the advances in analytical methods including circulation, genetic, epigenetic, miRNA-target and tissue-expression assays, several databases, such as HMDD and miR2Disease, have been established to allow data related to the relationship between lncRNA/microRNA and different diseases to be publicly accessible. Unfortunately, as the assays are time consuming and tedious, the data that have been collected so far are still relatively limited in number, focusing only on a few key noncoding RNAs rather than their contextual regulation network. In addition, it can be difficult to integrate the data in the databases together to form a complete regulation network due to their sparsity in number and their being from different bioassay-based research.

In recent years, there has been increasing research interest in exploiting lncRNA–miRNA interaction (LMI) in studies related to various complex human diseases [5] such as colorectal cancer [6, 7], cervical squamous cell carcinoma (CESC) [8] and heart failure [9]. Rather than investigating the signaling pathways of a few types of noncoding RNAs, these studies consider transcriptome-wide regulation that involves both microRNAs and lncRNAs cooperating together. However, it should be noted that, as information about the lncRNA–microRNA regulation network is not available from existing databases, current research in this area is mainly based on sequence-based microRNA target-prediction algorithms, such as miRWalk, Cytoscape and TAM [10–12]. These algorithms are used to construct a predicted LMI network, which is then used to predict pathogenic lncRNA–microRNA co-regulations. However, as pointed out by some studies, most existing microRNA target-prediction algorithms predict too many false positives, and the LMI networks constructed from their predictions would therefore be unreliable [13]. Although ground-truth data of LMI may help us understand the important regulation functions of ncRNA, thereby deciphering the complex ncRNA regulation network in the pathology of diseases, finding out the relationship between LMIs and the diseases that they are associated with is difficult.

As it is slow and tedious to perform laboratory experiments, computational approaches can allow potential candidates for experimental confirmation to be identified much faster and at much lower cost by better integrating prior information from different relevant studies. Toward this goal, a number of computational tools have been developed for computer-aided ncRNA biomarker discovery. As reviewed in [14, 15], most existing methods in this domain rely on the basic assumption that microRNAs that are similar tend to be involved in diseases that have similar pathological characteristics. While this assumption seems to be very reasonable, it should be noted that how microRNA similarity should be defined is a complex and open problem. Apart from known microRNA-disease associations that can be used for training a prediction model, there are different kinds of supplementary information relevant to microRNA and disease. This kind of knowledge is often called side information, which can be introduced into a model to boost prediction performance. Different metrics for microRNA similarity have been proposed using different side information and statistical measures such as Pearson correlation coefficients, cosine similarity and Euclidean distance [15]. However, as the features in the feature vectors of the side information may not be linearly dependent, these metrics may not be able to capture the complex relationship between two lncRNAs/microRNAs. In addition to this problem, the data of side information, such as the LMIs, are quite limited in amount and are incomplete. Due to the missing data, microRNA similarity cannot be determined accurately and, for this reason, microRNA-disease associations (MDAs) cannot be predicted accurately. Hence, in order to improve prediction accuracy, there is a need to learn an effective feature representation for microRNA and lncRNA.

There has recently been an increasing number of computational tools proposed to predict MDAs. Many of these tools do not take into consideration the incompleteness of information about what raw features of microRNAs can best be used for prediction. Also, these tools determine functional similarity for microRNAs based on data sources that are not reliable [14, 15]. For example, the functional similarity score matrix (http://www.cuilab.cn/files/images/cuilab/misim.zip) released by Wang et al. [16] was obtained using a computational model developed with a dataset that has not been continually updated, and as such, predictions of the association relationships are not reliable. One other limitation of existing tools is related to the statistical methods that they use to compute the microRNA similarity scores. As explained above, these methods are too simple to capture the complex correlation relationships among microRNAs. For example, the widely used Gaussian kernel measure and linear Euclidean distance do not capture the dependence of features in microRNA feature vectors [14]. In this work, we compared four state-of-the-art existing methods with the proposed MVMTMDA model. All of them are based on similarity measurement and use different techniques with various advantages. Specifically, the IMCMDA and MDHGI models use two matrix completion techniques to consider a space mapping from the similarity space to the space of the input network [17, 18]; the method proposed in Zeng et al.’s [19] work adopts a structural perturbation method, considering the structural consistency property of the input network; the MDA similarity kernel fusion (SKF) model uses a similarity fusion method to consider the correlation between different similarities when performing link prediction [20]. In summary, the choice of data sources and the way side information is integrated prevent current computational methods from predicting MDAs as accurately as possible.

To develop better methods for this purpose, we take the co-regulation between lncRNA and microRNA into account when predicting new microRNA biomarkers. Based on the assumption that the patterns of lncRNA–microRNA co-regulation can be inferred from the network of LMIs identified by large-scale CLIP-seq experiments, we developed a computational model to predict MDAs on a transcriptome-wide scale by introducing known LMIs. A method called DCSMDA [21] has also been proposed to use LMIs for predicting MDAs. It is based on similarity matrix construction and unsupervised learning. Different from DCSMDA, the proposed model can handle the missing-data problem in the LMI network. In addition, it is a supervised approach that is able to consider known MDAs for prediction.

To evaluate the performance of the proposed model, we implement 2-fold, 5-fold and 10-fold cross-validation to predict associations between microRNA and disease using the ground-truth data from HMDD v3.0 and lncRNASNP v2.0. A number of additional experiments are also carried out to compare the performance of the proposed method with state-of-the-art methods for MDA prediction. We use multiple criteria to evaluate the prediction performance, including the area under the ROC curve (AUC), hit ratio (HR) and normalized discounted cumulative gain (NDCG) [22]. As a result, MVMTMDA yielded the best performance with an average AUC of 0.8512+/−0.012, HR of 0.7553 and NDCG of 0.4895 in 5-fold cross-validation. The experimental results demonstrate the effectiveness of MVMTMDA for predicting large-scale associations between microRNA and disease by introducing the LMI network. We publicly release our predicted results, including predicted LMIs, MDAs, lncRNA-disease associations and the graph embeddings for microRNAs, which are expected to be useful for future research in the noncoding RNA domain.

Materials

The data we used in this work include experimentally validated lncRNA–microRNA interactions and MDAs. Several public databases provide these two types of data. In order to obtain up-to-date data, we collect the datasets from lncRNASNP v2.0 [23] and HMDD v3.0 [4], both of which have been updated within the past year.

The lncRNASNP v2.0 database (http://bioinfo.life.hust.edu.cn/lncRNASNP) integrates the data from the starBase v3.0 [24] database (http://starbase.sysu.edu.cn/), providing comprehensive knowledge on lncRNAs. It records 45 329 LMIs between 3521 types of lncRNAs and 276 types of microRNAs. The HMDD v3.0 database (http://www.cuilab.cn/hmdd) provides 18 732 MDAs between 874 types of diseases and 1207 types of microRNAs.

We discard the redundant data and manually match the ids of microRNA in lncRNASNP v2.0 database and those in HMDD v3.0 database. As a result, the LMI dataset we collected has 10 465 LMIs between 541 lncRNAs and 268 microRNAs. Based on the 268 types of microRNAs whose ids are successfully matched to HMDD database, we collect 11 253 MDAs covering 799 types of diseases.

Methods

Problem statement

In this work, we propose MVMTMDA to predict MDAs considering the co-regulation of lncRNA and microRNA. As mentioned in the Introduction section, one challenge in our work is to address the incompleteness and sparsity of LMI networks. To this end, we adopt multitask learning in the design of our model, so that LMIs and MDAs are predicted simultaneously. Considering that the known LMI and MDA networks are both far from complete and that the information contained in these networks is complementary, we believe that new links in one network can be predicted based on the other. These predictions are also mutually beneficial. An accurate link prediction in the LMI network, for example, can provide useful information for MDA predictions to be made more accurately and vice versa.

Another challenge of our prediction task lies in the development of a similarity measure between lncRNA/microRNA in an lncRNA–microRNA-disease network. Due to the complexity of their synergistic effects, such a measure would be highly complex as well. To tackle the problem, we propose to learn embedding features for lncRNA and microRNA from the LMI and MDA networks, which can be defined as a multiview learning problem. We consider the functional roles of a given microRNA to have two heterogeneous representations on the LMI and MDA networks, respectively, with each network providing a different view. The key to tackling the multiview learning problem is to effectively exploit the diversity and consistency of the multiview data of the LMI and MDA networks, thereby identifying the feature dimensions in which the characteristics of the original data can be retained.

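Suppose there are Nm types of microRNAs |$\mathcal{M}=\big\{{\mathcal{m}}_1,\dots, {\mathcal{m}}_{N_m}\big\}$|, Nd types of diseases |$\mathcal{D}=\big\{{\mathcal{d}}_1,\dots, {\mathcal{d}}_{N_d}\big\}$| and Nl types of lncRNAs |$\mathcal{L}=\big\{\mathcal{l}_1,\dots, \mathcal{l}_{N_l}\big\}$|. Let |$\mathcal{X}\in{\mathbb{R}}^{N_d\times{N}_m}$| and |$\mathcal{S}\in{\mathbb{R}}^{N_l\times{N}_m}$| denote the adjacency matrices of the known MDA and LMI networks, respectively. Based on the datasets we collected, |$\mathcal{X}$| and |$\mathcal{S}$| are constructed as
|${\mathcal{X}}_{ij}=\begin{cases}1, & \text{if disease } {\mathcal{d}}_i \text{ is known to be associated with microRNA } {\mathcal{m}}_j\\ 0, & \text{otherwise}\end{cases}$| (1)
|${\mathcal{S}}_{ij}=\begin{cases}1, & \text{if lncRNA } \mathcal{l}_i \text{ is known to interact with microRNA } {\mathcal{m}}_j\\ 0, & \text{otherwise}\end{cases}$| (2)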
We formulate prediction task for MDAs as the problem of simultaneously estimating the value of each unobserved entry in |$\mathcal{X}$| and |$\mathcal{S}$|⁠. It is assumed that there is an underlying model that can be constructed to generate all interaction possibility for each pair of MDA/LMI as follows:
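|${\hat{\mathcal{X}}}_{ij}={F}_x\big({\mathcal{d}}_i,{\mathcal{m}}_j\mid{\Theta}_{\mathcal{X}}\big)$| (3)
|${\hat{\mathcal{S}}}_{ij}={F}_{\mathcal{S}}\big({\ell}_i,{\mathcal{m}}_j\mid{\Theta}_{\mathcal{S}}\big)$| (4)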
where |${\hat{\mathcal{X}}}_{ij}$| and |${\hat{\mathcal{S}}}_{ij}$| denote the predicted score of association |${\mathcal{X}}_{ij}$| between disease |${\mathcal{d}}_i$| and microRNA |${\mathcal{m}}_j$| and interaction |${\mathcal{S}}_{ij}$| between lncRNA |${\ell}_i$| and microRNA |${\mathcal{m}}_j$|⁠, respectively; |${\Theta}_{\mathcal{X}}$| and |${\Theta}_{\mathcal{S}}$| denote the model parameters; |${F}_x$| and |${F}_{\mathcal{S}}$| denote the functions that map the model parameters to predicted scores. As the outputs of |${F}_x$| and |${F}_{\mathcal{S}}$| are also the inputs for each other, we adopt a co-training optimization method to train the models. We introduce latent factor model (LFM) to build functions |${F}_x$| and |${F}_{\mathcal{S}}$| applying dot product as,
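|${\hat{\mathcal{X}}}_{ij}={F}_x\big({\mathcal{d}}_i,{\mathcal{m}}_j\mid{\Theta}_{\mathcal{X}}\big)={p}_i^T{q}_j$| (5)
|${\hat{\mathcal{S}}}_{ij}={F}_{\mathcal{S}}\big({\ell}_i,{\mathcal{m}}_j\mid{\Theta}_{\mathcal{S}}\big)={r}_i^T{q}_j$| (6)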
where p, q and r denote the latent features for disease, microRNA and lncRNA, respectively. For the purpose of learning the nonlinear connection among lncRNA, microRNA and disease, in this work, we propose a method to simultaneously learn the function |${F}_x$| and |${F}_{\mathcal{S}}$| with three multilayer neural networks.
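
As a minimal illustration of the notation above, the following sketch builds the two adjacency matrices from lists of known pairs and scores every pair with a latent factor dot product. The identifiers, the helper names (build_adjacency, dot_score) and the two-dimensional latent vectors are hypothetical toy values chosen for readability; they are not taken from HMDD, lncRNASNP or the released MVMTMDA code.

```python
def build_adjacency(pairs, row_ids, col_ids):
    """Return a 0/1 matrix (list of rows) with 1 where (row, col) is a known pair."""
    row_index = {r: i for i, r in enumerate(row_ids)}
    col_index = {c: j for j, c in enumerate(col_ids)}
    matrix = [[0] * len(col_ids) for _ in row_ids]
    for r, c in pairs:
        matrix[row_index[r]][col_index[c]] = 1
    return matrix


def dot_score(u, v):
    """Latent factor score: dot product of two latent vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy identifiers (assumption, for illustration only).
mirnas = ["miR-a", "miR-b", "miR-c"]
diseases = ["disease-1", "disease-2"]
lncrnas = ["lnc-1", "lnc-2"]

# X: diseases x microRNAs, S: lncRNAs x microRNAs.
X = build_adjacency([("disease-1", "miR-a"), ("disease-2", "miR-c")], diseases, mirnas)
S = build_adjacency([("lnc-1", "miR-a"), ("lnc-2", "miR-b")], lncrnas, mirnas)

# Made-up latent vectors p (diseases), q (microRNAs) and r (lncRNAs).
p = [[1.0, 0.0], [0.0, 1.0]]
q = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
r = [[1.0, 1.0], [2.0, 0.0]]

# The same microRNA vectors q are shared by both score matrices.
X_hat = [[dot_score(p_i, q_j) for q_j in q] for p_i in p]
S_hat = [[dot_score(r_i, q_j) for q_j in q] for r_i in r]
```

Sharing q between the two score matrices is what lets information from one network influence predictions on the other.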

In this paper, we present a deep neural network to exploit association/interaction relationship of microRNAs to diseases and lncRNAs using a graph embedding technique. Given a kind of microRNA, m, with a set of its known associated diseases, D, the proposed method can predict new diseases associated with m, learn feature representation for m’s function, and predict lncRNAs that are associated with D. The learned features synthesize the information in the networks of LMI and MDA and thus are anticipated to comprehensively describe functional role and correlations of lncRNA and microRNAs.

Figure 1. Schematic diagram of multiview multitask learning for microRNA-disease association prediction. There are three neural networks included in the MVMTMDA model (marked as α, β and γ), aiming to learn the embeddings for diseases, microRNAs and lncRNAs, respectively. The process for training the model contains two steps, which proceed alternately. p and R are optimized in steps 1 and 2, respectively, and q is optimized in both steps.

Multiview multitask learning for predicting microRNA-disease associations

The proposed model, MVMTMDA, is designed with a deep structure composed of three neural networks. Different from conventional prediction models for MDAs that separate similarity measurement and value prediction, it provides an end-to-end solution that handles graph-based raw data and yields the final results without any statistical assumption. Specifically, it learns the hidden features for diseases, microRNAs and lncRNAs via multiview learning and yields the prediction via multitask learning (Figure 1).

To the best of our knowledge, this work is the first attempt to consider the topological information of the lncRNA–microRNA interaction network to predict MDAs. Apart from the improvement in prediction over previous models, the contribution of our work lies in the following characteristics of our method: (i) MVMTMDA is able to integrate data from different types of relevant biological networks for prediction even if the data are incomplete; (ii) it realizes end-to-end training for feature representation from multiple biological networks; and (iii) it provides a solution that combines the problems of MDA prediction and lncRNA-disease association prediction.

Multiple graph embeddings via multiview learning

As mentioned in the Problem statement section, we form two matrices X and S according to Equations 1 and 2. With matrices X and S as inputs, we propose an architecture of three deep neural networks to project each type of disease, microRNA and lncRNA into a latent structured space. From the matrix X, each disease di is first represented as the i-th row vector Xi·, which represents the i-th disease’s relationship across all microRNAs. Each microRNA mj is first represented as the j-th column vector X·j, which represents the j-th microRNA’s relationship across all diseases. As shown in Figure 1, the input feature of each type of element on the lncRNA–microRNA-disease network is processed by a single neural network. In each layer of each network, the input vector is mapped into another vector of a different dimension in a new space. Denoting a given input vector of a neural network by x, the output vector by y, the intermediate hidden layers by li, i = 1,…,N-1, and the weight matrix and bias term of li by Wi and bi, we have
(7)
where the activation function f is chosen here as the ReLU function, f(x) = max(0,x). In the neural network γ, which is built to consider the side information of LMI, the whole adjacency matrix S of the LMI network is used as the input. Given a training sample xij, its aim is to learn the features of all lncRNAs according to their known interactions with the j-th microRNA. According to Equation 7, the outputs of neural networks α, β and γ can be formulated, respectively, as follows:
(8)
(9)
(10)
Here, Wα1, Wβ1 and Wγ1 are the weight matrices of the first layer in networks α, β and γ, respectively, and bα1, bβ1 and Bγ1 are the corresponding bias terms. Wα2, Wβ2, Wγ2, bα2, bβ2 and Bγ2 are for the second layer, and so on. It should be noted that the row dimension of D’ is Nl, the same as that of the input matrix S. |$D^{\prime}={\big[{r_1}^T,\dots, {{\mathrm{r}}_{N_l}}^T\big]}^T$| is a matrix stacking the embedding features of all lncRNAs. Based on the embedding features learned from neural networks α, β and γ, we formulate the outputs of our models as follows:
(11)
(12)

It should be noted that, because of the dot product operation in Equations 11 and 12, the weight matrices in the last layers of the neural networks should have the same column dimension, ensuring that the dimensions of pi and qj and the column dimension of R are the same. The microRNA embedding feature qj connects the results of neural networks β and γ, and therefore it retains and combines the information of their inputs (i.e. known MDAs and LMIs). As qj is used to yield the scores for each MDA and LMI pair, it can effectively represent the biological role of a given microRNA on both networks while the model is trained by recovering X and S. We consider X and S to be strongly related data that provide two different views of the function of microRNAs, and the embedding features yielded by the proposed model are therefore based on multiview learning.
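
The forward pass described in this subsection can be sketched as follows. This is only an illustrative reading of the text, not the released implementation: the layer sizes, the random initialization, the zero biases and the choice to feed each row of S through network γ to obtain one lncRNA embedding per row are all assumptions, and the helper names (dense, mlp, init_layers) are hypothetical.

```python
import random


def relu(v):
    """Element-wise ReLU, f(x) = max(0, x)."""
    return [max(0.0, a) for a in v]


def dense(x, W, b):
    """One layer: relu(x W + b), with W stored as a list of rows."""
    out = [sum(x[k] * W[k][j] for k in range(len(x))) + b[j] for j in range(len(b))]
    return relu(out)


def mlp(x, layers):
    """Apply a stack of (W, b) layers to the input vector x."""
    for W, b in layers:
        x = dense(x, W, b)
    return x


def init_layers(sizes, rng):
    """Small random weights and zero biases (an assumption, for illustration)."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        layers.append((W, [0.0] * n_out))
    return layers


rng = random.Random(0)
X = [[1, 0, 0], [0, 0, 1]]  # toy MDA matrix: 2 diseases x 3 microRNAs
S = [[1, 0, 0], [0, 1, 0]]  # toy LMI matrix: 2 lncRNAs x 3 microRNAs
k = 4                       # shared embedding dimension of p, q and the rows of R

alpha = init_layers([3, 8, k], rng)  # input: a row of X (one disease)
beta = init_layers([2, 8, k], rng)   # input: a column of X (one microRNA)
gamma = init_layers([3, 8, k], rng)  # input: rows of S (all lncRNAs)

i, j = 0, 2
p_i = mlp(X[i], alpha)
q_j = mlp([row[j] for row in X], beta)
R = [mlp(s_row, gamma) for s_row in S]

# Dot products give one MDA score and one column of LMI scores.
x_hat_ij = sum(a * b for a, b in zip(p_i, q_j))
s_hat_col_j = [sum(a * b for a, b in zip(r_i, q_j)) for r_i in R]
```

Because all three networks end in layers of width k, the dot products are well defined, which is the dimensional constraint noted above.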

Model training via multitask learning

Based on the outputs yielded by Equations 11 and 12, we define two objective functions for model optimization according to the observed data and unobserved feedback. Each of the objective functions corresponds to one prediction task. As shown in Figure 1, when training the model using xij, the embeddings of the i-th disease and the j-th microRNA are learned for the reconstruction of xij. Meanwhile, the column vector of the j-th microRNA in the LMI matrix is the reconstruction target for the neural network γ. Considering that the prediction problem is a semi-supervised learning problem in which all the training samples are positive, the objective function is generalized as follows:
(13)
where l(·) denotes a loss function; Ω(Θ) is the regularizer for model parameters; Y+ is the set of positive samples and Y− is the set of negative samples, which we obtain by negative sampling on the unlabeled microRNA-disease pairs. To train the neural networks α and β on the MDA dataset, the first loss function, L1, is defined as a binary cross-entropy loss as follows:
(14)
where w and b denote the parameters in neural networks α and β. For training the model on known interactions between lncRNA and microRNA, the loss function for the second step, L2, is defined as follows:
(15)
where MSE(·) denotes the mean-square error function. The optimization for model training contains two steps, based on L1 and L2, that are executed alternately. Optimization of L1 is basically a point-wise matrix factorization on the MDA network, while optimization of L2 is a column-wise matrix factorization on the LMI network. As the first step of optimization is to predict the scores for the MDA pairs and the second step is to predict the interaction possibility for lncRNA–microRNA pairs, the proposed model is basically optimized via multitask learning.
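
The two loss terms and the negative sampling step can be sketched in plain Python as follows. The sketch uses the standard textbook definitions of binary cross-entropy and mean-square error, averaged over samples; the exact form of Equations 14 and 15 (for example, any regularization term or weighting) is not reproduced here, and the function names and toy values are hypothetical.

```python
import math
import random


def bce(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def mse(y_true, y_pred):
    """Mean-square error between a target column and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def sample_negatives(X, n, rng):
    """Draw n unlabeled (disease, microRNA) index pairs to act as negatives."""
    unlabeled = [(i, j) for i, row in enumerate(X) for j, v in enumerate(row) if v == 0]
    return rng.sample(unlabeled, n)


X = [[1, 0, 0], [0, 0, 1]]  # toy MDA matrix
negatives = sample_negatives(X, 2, random.Random(0))

# Step 1 (MDA task): point-wise loss on positives plus sampled negatives.
step1_loss = bce([1, 0], [0.5, 0.5])

# Step 2 (LMI task): column-wise loss on the j-th column of S.
step2_loss = mse([1, 0, 0], [1.0, 0.0, 1.0])
```

In an alternating scheme of this kind, one step would update the parameters that feed the first loss and the next step those that feed the second, with the microRNA network receiving gradients from both.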

Prediction of lncRNA-disease association with MDA and LMI

Computational tools for predicting disease-associated noncoding RNAs can be mainly categorized into two types: lncRNA-disease association prediction and MDA prediction. For example, Fu et al. [25] proposed an approach called MFLDA based on matrix factorization to predict lncRNA-disease associations using 11 data sources related to lncRNA and disease. Despite the close intrinsic relation between the functional mechanisms of lncRNA and microRNA, little effort has been devoted to combining these two important fields. Here, we consider the lncRNA–microRNA interaction as a useful bridge to connect these two prediction problems and propose a statistical method to predict lncRNA-disease associations based on the results yielded by MVMTMDA. Based on the MDA score matrix |$\hat{X}$| predicted by MVMTMDA and the adjacency matrix of LMI, we calculate the P-value for each lncRNA-disease pair. Given an lncRNA-disease pair (lp, dp), we denote by Lm the number of microRNAs associated with lp in the LMI dataset, by Dm the number of microRNAs associated with dp in the MDA dataset and by Mld the number of microRNAs that are simultaneously associated with lncRNA lp and disease dp. The P-value for the association between lp and dp is defined as follows:
(16)

In the datasets that we collected, each type of lncRNA and disease is related to at least one microRNA, such that the P-value for each lncRNA-disease pair can be calculated using Equation 16. By setting P-value < 0.05, we identify 15 945 lncRNA-disease associations from a total of 432 259 lncRNA-disease pairs. To further control the false positive rate (FPR) of our prediction, we additionally conduct false discovery rate (FDR) correction on the computed P-values. The lncRNA-disease pairs with FDR < 0.05 are considered to have strong positive or negative correlation. As a result, we identify 25 076 potential lncRNA-disease associations. The predicted lists are available in Supplementary Table S1.
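
For readers unfamiliar with FDR correction, the sketch below shows one common procedure, the Benjamini-Hochberg adjustment. The text does not state which FDR method was used, so the choice of Benjamini-Hochberg is an assumption, and the P-values in the example are made-up numbers rather than values computed with Equation 16.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values for four lncRNA-disease pairs (illustration only).
p_values = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(p_values)
significant = [idx for idx, value in enumerate(adjusted) if value < 0.05]
```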

Functional clustering of microRNAs based on multiview embedding features

In recent years, the similarity measurement for function of microRNAs has been attracting increasing attention due to its significance in the domain of noncoding RNA research [26, 27]. In this section, we propose a new type of functional similarity measure for microRNAs based on MDA and LMI.

As the connection joint of the networks of MDA and LMI, in this work, microRNA is considered to have two views to represent its biological functions. Motivated by this, the proposed MVMTMDA learns graph embedding features for each type of microRNA, comprehensively considering their relationship with diseases and target lncRNAs. The microRNA features learned from the proposed model can thus imply the functional similarity among microRNAs. In this section, we implement K-means clustering in the feature space of the microRNA graph embedding learned by the MVMTMDA model.

Specifically, we first use all data in the MDA dataset as the training set to train the two-layer MVMTMDA model until the results converge. Second, in order to visualize the clustering results in 3 × 3 subfigures, we reduce the dimension of the embedding features to 3 by applying principal component analysis (PCA) to the microRNA features. Based on the first three principal components, k-means clustering is performed. We set the number of clusters to 6, and the corresponding scatter diagram is shown in Figure 2. In addition, we calculate the Pearson correlation coefficient (PCC) of the microRNA features as the functional similarity score.
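
The PCC-based similarity score can be computed directly from the learned embedding vectors, as in the following sketch. The embedding values are made up for illustration, and returning 0.0 for a constant vector is a convention assumed here, not something specified in the text.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for constant vectors
    return cov / (norm_u * norm_v)


# Hypothetical microRNA embeddings (illustration only).
embeddings = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 2.0, 1.0],
]

# Symmetric functional similarity matrix over all microRNA pairs.
similarity = [[pearson(u, v) for v in embeddings] for u in embeddings]
```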

Figure 2. Scatter diagram of functional clustering for 268 types of microRNAs. Six different colors represent the distribution of six clusters, based on three dimensions of microRNA nodes. The subfigure in the i-th row and j-th column presents the node distribution based on the i-th and j-th dimensions.

We publicly release the embedding features of microRNAs along with the clustering results and the microRNA functional similarity (available in Supplementary Table S2). The prediction scores for each microRNA-disease pair are also available in Supplementary Table S3. It is anticipated that the microRNA-disease pairs with high predicted scores will be confirmed by biological experiments in the future.

Results

Performance evaluation for MVMTMDA

To evaluate the prediction performance of the proposed model, we used a real dataset involving experimentally confirmed MDAs and LMIs and tested accuracy using 2-fold, 5-fold and 10-fold cross-validation. Specifically, in k-fold (k = 2, 5 and 10) cross-validation, we randomly separate the MDA samples into k roughly equal parts. Each part is used in turn for testing, while the remaining k-1 parts are used as training samples. To quantify the performance in k-fold cross-validation, we adopt three criteria, i.e. AUC, HR and NDCG.
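
The splitting procedure can be sketched as follows. The function name, the fixed seed and the toy list of known pairs are hypothetical; the sketch only illustrates how the known MDA pairs are partitioned so that each part serves once as the test set.

```python
import random


def k_fold_splits(samples, k, seed=0):
    """Yield (train, test) lists; each sample is tested exactly once."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k roughly equal parts
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


# Hypothetical known (disease index, microRNA index) pairs.
known_pairs = [(d, m) for d in range(2) for m in range(5)]
splits = list(k_fold_splits(known_pairs, k=5))
```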

In each fold of prediction, we calculate the ranks of the testing samples among the unlabeled samples. Testing samples that obtain a rank higher than a given threshold are considered positive. By setting different thresholds, we compute the corresponding true positive rates (TPRs, sensitivity) and FPRs (1-specificity), where sensitivity is the percentage of testing samples ranked above the threshold and specificity is the percentage of unlabeled samples ranked below it. The corresponding receiver operating characteristic (ROC) curves are obtained by plotting TPR versus FPR, and the AUC is computed. AUC = 0.5 implies a purely random guess and AUC = 1 indicates perfect prediction. In addition, we adopt the metrics of HR and NDCG. We used the testing samples and 50 times as many random unlabeled samples to construct the ground-truth item set (GT) and truncated the ranked list at 10 for both metrics. As such, the HR intuitively measures the percentage of testing samples in the top-10 list, while the NDCG measures the ranking quality by assigning higher scores to hits at top ranks. For both metrics, larger values indicate better performance.
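
As a rough sketch of the two ranking metrics, the code below scores a single held-out testing sample against a set of sampled unlabeled candidates, following the formulation commonly used in the recommendation literature (a hit if the sample is in the top 10, and a gain of 1/log2(rank + 2) for a 0-based rank). The text does not spell out the exact formulas used, so this formulation, the tie-handling rule and the example scores are assumptions.

```python
import math


def rank_of_positive(pos_score, candidate_scores):
    """0-based rank of the testing sample; ties are counted against it."""
    return sum(1 for s in candidate_scores if s >= pos_score)


def hit_at_k(rank, k=10):
    """1.0 if the testing sample appears in the top-k list, else 0.0."""
    return 1.0 if rank < k else 0.0


def ndcg_at_k(rank, k=10):
    """Discounted gain of a single relevant item at the given 0-based rank."""
    return 1.0 / math.log2(rank + 2) if rank < k else 0.0


# Made-up predicted scores: one testing sample and three unlabeled candidates.
rank = rank_of_positive(0.9, [0.95, 0.1, 0.5])
hr = hit_at_k(rank)
ndcg = ndcg_at_k(rank)
```

Averaging these per-sample values over all testing samples gives the reported HR and NDCG under this formulation.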

To avoid any bias caused by the random sample partitioning in cross-validation, we repeat the random sampling and prediction 20 times. The average AUC, best HR and best NDCG yielded by MVMTMDA are listed in Table 1. As a larger training set leads to more accurate prediction, the prediction accuracy yielded by the proposed model increases with the number of folds in k-fold cross-validation. The corresponding ROC curves are shown in Figure 3a, and Figure 3b and c show that the HR and NDCG yielded by the proposed model increase rapidly within the first 10 epochs and tend to stabilize after the 20th training epoch.

Table 1

Prediction performance w.r.t. AUC, HR and NDCG using MVMTMDA in k-fold cross-validation

| CV method | 2-fold CV | 5-fold CV | 10-fold CV |
| --- | --- | --- | --- |
| Average AUC | 0.8410+/−0.018 | 0.8512+/−0.012 | 0.8521+/−0.008 |
| Best HR | 0.7196 | 0.7553 | 0.7603 |
| Best NDCG | 0.4429 | 0.4895 | 0.5030 |

Figure 3

Prediction performance of MVMTMDA: (a) ROC curves yielded by MVMTMDA with two, three and four layers; (b) hit ratio yielded by MVMTMDA with increasing training epochs; (c) NDCG yielded by MVMTMDA with increasing training epochs; (d) the training loss in Equation 14 for step 1 with increasing training epochs; (e) the training loss in Equation 15 for step 2 with increasing training epochs.

Performance evaluation on LMI prediction using MVMTMDA

As mentioned, we consider the prediction of MDA and that of LMI to be mutually beneficial. For a given microRNA, its involvement in different diseases offers useful information for predicting its target lncRNAs. In this section, we use the MDA data to predict LMIs with MVMTMDA and evaluate the resulting prediction performance. Specifically, we exchange the matrices X and S, with X as the LMI matrix and S as the MDA matrix. We keep the model parameters the same as in the above experiment. As a result, when predicting LMIs with two hidden layers, MVMTMDA yielded average AUCs of 0.8747+/−0.018, 0.9014+/−0.012 and 0.9037+/−0.011 in 2-fold, 5-fold and 10-fold cross-validation, respectively (Figure 4 and Table 2). These results demonstrate the usefulness of MDA for LMI prediction and the effectiveness of the proposed model in integrating different types of biological networks for prediction.

Figure 4

Performance yielded by MVMTMDA in LMI prediction: (a) ROC curves yielded by MVMTMDA with two, three and four layers; (b) hit ratio yielded by MVMTMDA with increasing training epochs; (c) NDCG yielded by MVMTMDA with increasing training epochs; (d) the training loss in Equation 14 for step 1 with increasing training epochs; (e) the training loss in Equation 15 for step 2 with increasing training epochs.

Table 2

Prediction performance on LMI dataset using MVMTMDA in k-fold cross-validation

| CV method | 2-fold CV | 5-fold CV | 10-fold CV |
| --- | --- | --- | --- |
| Average AUC | 0.8747+/−0.018 | 0.9014+/−0.012 | 0.9037+/−0.011 |
| HR | 0.7310 | 0.8506 | 0.9045 |
| NDCG | 0.4119 | 0.5542 | 0.6504 |

Performance comparison

In this subsection, we compare the proposed MVMTMDA with other methods that were previously proposed for predicting MDA and LMI. An increasing number of computational tools have been proposed for predicting potential microRNAs involved in different diseases. Here we select four methods for performance comparison, all of which were published in 2018.

In these works, two kinds of data are used to compute the microRNA similarity. One is the microRNA functional similarity yielded by MISIM [16], which has not been updated for several years. In addition, Wang’s microRNA similarity was calculated from an MDA dataset collected in 2010, which makes it inappropriate for MDA prediction. The other is microRNA sequence similarity. However, the relation between the functional similarity of microRNAs in pathology and their sequence similarity is still unknown. To the best of our knowledge, the proposed MVMTMDA is the first model to consider the network structure of LMI to predict MDA. For the sake of fairness, we also introduced LMI into the compared prediction models.

Apart from the prediction tools for RNA targets that are based on sequence matching, existing network-based prediction models for LMI are limited. For the performance comparison on LMI prediction, we compare MVMTMDA with EPLMI [28] and three other baseline methods (i.e. Katz measure [29], basic LFM [30] and neighbor-based collaborative filtering [31]).

Unlike the MVMTMDA model, which adopts end-to-end learning, these comparison methods need a microRNA similarity matrix as input. To execute the comparison methods on our datasets, we first construct a microRNA similarity matrix MS using the Pearson correlation coefficient (PCC) as follows:
$$MS(i,j) = \frac{\mathrm{cov}(S_i, S_j)}{\sigma(S_i)\,\sigma(S_j)} \qquad (17)$$
where S denotes the matrix of side information and S_i denotes the profile of microRNA i in S. That is, S is the adjacency matrix of LMI when predicting MDAs and the adjacency matrix of MDA when predicting LMIs. As a result, the compared methods yielded AUCs ranging from 0.6233 to 0.8192 in MDA prediction and from 0.7301 to 0.8737 in LMI prediction, all of which are lower than those yielded by MVMTMDA (Table 3).
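As an illustration of the PCC-based similarity described above, the sketch below computes pairwise Pearson correlations between microRNA profiles with NumPy. It assumes that each row of S corresponds to one microRNA. The function name `pcc_similarity` and the choice to set undefined correlations to 0 are assumptions of this sketch, not details confirmed by the paper.

```python
import numpy as np

def pcc_similarity(S):
    """Pairwise Pearson correlation between the rows of S (one row per microRNA)."""
    S = np.asarray(S, dtype=float)
    # np.corrcoef treats each row as a variable; constant rows give NaN (zero variance)
    with np.errstate(invalid="ignore", divide="ignore"):
        MS = np.corrcoef(S)
    # Assumption of this sketch: undefined correlations are replaced by 0
    return np.nan_to_num(MS, nan=0.0)

# Toy side-information matrix: 3 microRNAs x 4 partners (hypothetical values)
S_toy = np.array([[1, 0, 1, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1]])
MS_toy = pcc_similarity(S_toy)
```

In the toy matrix, the first two microRNAs share an identical profile and therefore obtain a similarity of 1, whereas the third has the complementary profile and obtains −1 with both.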
Table 3

Performance comparison on the prediction of MDA and LMI in 5-fold cross-validation

| Prediction task | Method | Average AUC |
| --- | --- | --- |
| Prediction of microRNA-disease associations | IMCMDA [17] | 0.6233+/−0.032 |
| | MDHGI [18] | 0.6932+/−0.027 |
| | Zeng et al.’s work [19] | 0.7883+/−0.012 |
| | MDA-SKF [20] | 0.8192+/−0.010 |
| | The proposed method | 0.8512+/−0.012 |
| Prediction of lncRNA–microRNA interactions | Neighbor-based CF [31] | 0.7301+/−0.026 |
| | LFM CF [30] | 0.7692+/−0.025 |
| | EPLMI [28] | 0.8126+/−0.012 |
| | Katz [29] | 0.8737+/−0.008 |
| | The proposed method | 0.9014+/−0.012 |

The reasons for the superior performance of the proposed model may lie in two aspects. One is that MVMTMDA adopts a deep neural network structure, which can automatically learn the complex relations between microRNAs from the MDA and LMI networks in an end-to-end manner. The other is that the proposed model takes the incompleteness of the side information into account and adopts multitask learning to fill in its missing values.

Impact of side information on MVMTMDA

As mentioned in sections 4.1 and 4.2, MVMTMDA predicts MDAs using the LMI network as side information and can also predict LMIs with the MDA network as side information. In this subsection, we evaluate the usefulness of introducing the side information. Specifically, for performance comparison, the second step of optimization (Equation 15) is discarded, so that the side information is ignored when training the model. As shown in Table 4, without the side information the AUC of the proposed model declines in both 2-fold and 5-fold cross-validation for both tasks (for example, from 0.8747 to 0.8316 for LMI prediction in 2-fold cross-validation). HR and NDCG also decline in all settings except MDA prediction in 2-fold cross-validation, where they are marginally higher without side information. The comparison results demonstrate the ability of MVMTMDA to integrate multiple graph data and support our assumption that the information of LMI and MDA is closely related and mutually beneficial for each other's prediction task.

Table 4

Results of 2-fold and 5-fold cross-validation yielded by the proposed model with and without side information

| Prediction | Cross-validation | MVMTMDA with side information | MVMTMDA without side information |
| --- | --- | --- | --- |
| MDA prediction | 2-fold CV | AUC: 0.8410; HR: 0.7196; NDCG: 0.4429 | AUC: 0.8306; HR: 0.7224; NDCG: 0.4507 |
| | 5-fold CV | AUC: 0.8512; HR: 0.7553; NDCG: 0.4895 | AUC: 0.8423; HR: 0.7442; NDCG: 0.4705 |
| LMI prediction | 2-fold CV | AUC: 0.8747; HR: 0.731; NDCG: 0.4119 | AUC: 0.8316; HR: 0.6217; NDCG: 0.3445 |
| | 5-fold CV | AUC: 0.9014; HR: 0.8506; NDCG: 0.5542 | AUC: 0.8697; HR: 0.8291; NDCG: 0.5470 |

Sensitivity to hyperparameters

Depth of layers in networks

The number of layers in neural networks is critical for the performance of deep learning-based models. In this work, we simply use the same number of layers and the same layer sizes for the three neural networks α, β and γ. We tested two, three and four layers. The layer sizes in our experiments are 64::32::16::8, 32::16::8 and 32::16 for the four-layer, three-layer and two-layer structures, respectively. Table 5 shows the prediction performance yielded by MVMTMDA with different numbers of layers in 5-fold cross-validation. Figures 3 and 4 show the corresponding curves for prediction performance and optimization. The results show that the proposed model performed best with two layers. We therefore use this structure for MVMTMDA in the experiments of this paper.

Table 5

Prediction performance using MVMTMDA with two, three and four layers in 5-fold cross-validation

| Prediction task | Two layers | Three layers | Four layers |
| --- | --- | --- | --- |
| MDA prediction | 0.8512+/−0.012 | 0.781+/−0.011 | 0.8384+/−0.015 |
| LMI prediction | 0.9014+/−0.012 | 0.8602+/−0.015 | 0.7647+/−0.022 |

Negative sampling ratio

In this work, the samples that we collected from the MDA and LMI datasets are all positive, so the prediction task is a semi-supervised learning problem in which the unlabeled samples must be taken into account. To train the model, we need to sample negative instances from the unlabeled data to construct the set X in Equations 14 and 15. In this experiment, we apply different negative sampling ratios (i.e. 1, 3 and 5) to observe how the prediction performance on MDA and LMI varies. As shown in Table 6, MVMTMDA yielded the best prediction performance with the negative sampling ratio set to 1 and 5 in the prediction of MDA and LMI, respectively. Figure 5 shows the corresponding precision-recall curves for the experiment. The area under the precision-recall curve (AUPRC) values for the green, red and blue curves are 0.3261, 0.3228 and 0.3024, respectively. The prediction performance is generally stable across the different negative sampling ratios.
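The negative sampling step can be sketched as follows. This is a simplified illustration rather than the released implementation: it draws a fixed set of unlabeled entries once, whereas the actual training may resample at every epoch. The function name `sample_negatives` and the toy matrix are hypothetical.

```python
import numpy as np

def sample_negatives(X, ratio, seed=0):
    """Sample `ratio` unlabeled (zero) entries of X for every known positive entry."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    pos = np.argwhere(X == 1)                # known associations
    unlabeled = np.argwhere(X == 0)          # candidates treated as negatives
    n_neg = min(ratio * len(pos), len(unlabeled))
    chosen = rng.choice(len(unlabeled), size=n_neg, replace=False)
    return pos, unlabeled[chosen]

# Toy association matrix: 6 microRNAs x 5 diseases with three known associations
X_toy = np.zeros((6, 5), dtype=int)
X_toy[[0, 1, 2], [0, 1, 2]] = 1
pos, neg = sample_negatives(X_toy, ratio=3)  # a "3-neg" setting: 9 sampled negatives
```

Because the sampled entries are merely unlabeled, some of them may be true but undiscovered associations, which is why the sampling ratio is treated as a hyperparameter.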

Table 6

Prediction performance using MVMTMDA with different negative sampling ratios in 5-fold cross-validation

| Prediction task | 1-neg | 3-neg | 5-neg |
| --- | --- | --- | --- |
| MDA prediction | 0.8512+/−0.012 | 0.8506+/−0.011 | 0.8437+/−0.015 |
| LMI prediction | 0.9014+/−0.012 | 0.8922+/−0.015 | 0.9109+/−0.012 |

Figure 5

Precision-recall curves for microRNA-disease association prediction in 5-fold cross-validation. The AUPRC values for the green, red and blue curves are 0.3261, 0.3228 and 0.3024, respectively.

Figure 6

Top-10 prediction networks for obesity, diabetes mellitus and fatty liver. Black links denote known microRNA-disease associations in the training data set, while red links denote predicted associations. Line widths reflect the predicted values.

Case studies

To further evaluate the prediction performance of the proposed method, in this section we analyze the top-ranking prediction results for four human diseases (colon neoplasms, obesity, diabetes mellitus and fatty liver). Specifically, we used all known microRNA-disease associations as training samples and applied MVMTMDA to perform prediction. The unknown microRNA-disease pairs in the top-ranking lists were manually verified against the published literature.
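The case-study procedure, i.e. ranking the candidates for one disease and keeping the top-ranked pairs that are absent from the training data, can be sketched as below. The matrix layout (microRNAs in rows, diseases in columns), the function name `top_unknown_for_disease` and the toy scores are assumptions of this sketch. How the ranked list was built in the actual experiments is not specified in detail.

```python
import numpy as np

def top_unknown_for_disease(scores, known, disease_idx, top_fraction=0.1):
    """List (microRNA index, score) pairs in the top fraction that are not known associations."""
    scores = np.asarray(scores, dtype=float)
    known = np.asarray(known)
    col = scores[:, disease_idx]                      # predicted scores for one disease
    order = np.argsort(-col)                          # best score first
    n_top = max(1, int(round(top_fraction * len(col))))
    return [(int(m), float(col[m])) for m in order[:n_top]
            if known[m, disease_idx] == 0]            # keep only unrecorded pairs

# Toy example: 10 microRNAs, 1 disease; microRNA 0 is already a known association
toy_scores = np.array([[0.9], [0.1], [0.8], [0.2], [0.7],
                       [0.3], [0.6], [0.4], [0.5], [0.05]])
toy_known = np.zeros((10, 1), dtype=int)
toy_known[0, 0] = 1
candidates = top_unknown_for_disease(toy_scores, toy_known, disease_idx=0,
                                     top_fraction=0.3)
```

In the toy data, the top 30% contains microRNAs 0, 2 and 4, and only the two unrecorded ones are returned as candidates for literature verification.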

Colon neoplasms, the most common malignancy in the gastrointestinal tract, are the third leading cause of cancer-related deaths in both men and women [32]. A number of microRNAs have been confirmed to be involved in the mechanism of colon neoplasms. Using MVMTMDA, we searched the top 10% of the ranked list for colon neoplasms. Table 7 shows the scores of 10 microRNA-disease pairs that are not included in the training set. Of these, 60% (6/10) were successfully verified from the published literature. The result illustrates that the top-ranking predictions yielded by the proposed method are reliable.

Table 7

Unknown microRNA-disease pairs in top 10% prediction list for colon neoplasms

| Disease | microRNA | Score | Evidence |
| --- | --- | --- | --- |
| Colon neoplasms | hsa-miR-186-5p | 0.9773 | PMID: 26885189 |
| | hsa-miR-20b-5p | 0.9702 | PMID: 27878272 |
| | hsa-miR-26a-5p | 0.9614 | PMID: 26083618 |
| | hsa-miR-9-5p | 0.9582 | PMID: 27844330 |
| | hsa-miR-27b-3p | 0.9563 | PMID: 28351320 |
| | hsa-miR-98-5p | 0.9658 | Unconfirmed |
| | hsa-miR-144-3p | 0.9586 | Unconfirmed |
| | hsa-miR-495-3p | 0.9566 | Unconfirmed |
| | hsa-miR-199b-5p | 0.9688 | Unconfirmed |
| | hsa-miR-451a | 0.9579 | PMID: 24875473 |

Table 8

Ranked lists of top-10 microRNA-disease pairs for obesity, diabetes mellitus and fatty liver. (‘Recorded’ means the pair exists in HMDD v3.0, i.e. the training data set)

| Disease | MicroRNA | Score | Evidence |
| --- | --- | --- | --- |
| Obesity | hsa-miR-16-5p | 0.9317 | Recorded |
| | hsa-miR-18a-5p | 0.9277 | Recorded |
| | hsa-miR-17-5p | 0.9335 | Recorded |
| | hsa-miR-20a-5p | 0.9265 | Unconfirmed |
| | hsa-miR-221-3p | 0.9250 | Recorded |
| | hsa-miR-26b-5p | 0.9216 | Recorded |
| | hsa-miR-122-5p | 0.9254 | Recorded |
| | hsa-miR-21-5p | 0.9714 | Recorded |
| | hsa-miR-143-3p | 0.9381 | Recorded |
| | hsa-miR-223-3p | 0.9698 | Recorded |
| Diabetes mellitus | hsa-miR-16-5p | 0.9498 | PMID: 31568645 |
| | hsa-miR-17-5p | 0.9607 | Recorded |
| | hsa-miR-19a-3p | 0.9467 | Recorded |
| | hsa-miR-20a-5p | 0.9518 | Recorded |
| | hsa-miR-26b-5p | 0.9323 | Recorded |
| | hsa-miR-150-5p | 0.9472 | PMID: 30552111 |
| | hsa-miR-9-5p | 0.9319 | Recorded |
| | hsa-miR-155-5p | 0.9407 | Recorded |
| | hsa-miR-223-3p | 0.9583 | Recorded |
| | hsa-miR-21-5p | 0.9383 | Recorded |
| Fatty liver | hsa-miR-145-5p | 0.7771 | PMID: 0425654 |
| | hsa-miR-17-5p | 0.7386 | Recorded |
| | hsa-miR-221-3p | 0.7544 | Unconfirmed |
| | hsa-miR-34a-5p | 0.7664 | Recorded |
| | hsa-miR-122-5p | 0.7314 | Recorded |
| | hsa-miR-21-5p | 0.8701 | Recorded |
| | hsa-miR-155-5p | 0.7573 | Recorded |
| | hsa-miR-143-3p | 0.7299 | Unconfirmed |
| | hsa-miR-126-3p | 0.7692 | Unconfirmed |
| | hsa-miR-223-3p | 0.7655 | Unconfirmed |

In addition, we also investigated the top-10 ranked lists across different diseases. Three clinically associated diseases, i.e. obesity, diabetes mellitus and fatty liver, were selected for the analysis [33]. As shown in Table 8, there are in total eight unknown microRNA-disease pairs (i.e. pairs not recorded in the training data set) in the three top-10 ranked lists, and three of them were confirmed by checking the published literature. Figure 6 shows the MDA network constructed from the top-10 MDA pairs across obesity, diabetes mellitus and fatty liver. In this network, more than half (11/18) of the microRNA nodes are shared by at least two disease nodes. In particular, the nodes of hsa-miR-223-3p and hsa-miR-21-5p are central in this network, linked to all disease nodes. This shows that the microRNAs predicted by the MVMTMDA model tend to form clusters in the MDA network across different diseases that are clinically associated. MicroRNAs with similar biological functions tend to be involved in the mechanisms of similar diseases. Considering that the proposed method performs prediction based on the graph embedding features of nodes (i.e. p and q in Equation 11), the result suggests that the learned embedding features are able to describe the distance between microRNA nodes in the MDA and LMI networks, which reveals the similarity and dissimilarity of their biological functions.

Conclusion

The identification of MDAs is of great significance in microRNA therapeutics. Current computational methods for predicting MDAs have not considered the co-regulation between lncRNA and microRNA, which is increasingly recognized as important for their functional mechanisms. In this work, we propose a multiview multitask model composed of three deep neural networks to fill this gap. Considering that the networks of MDA and LMI are two different views that jointly reflect the biological function of microRNAs, we apply a multiview learning method to extract embedding features for microRNAs from the two graphs. In addition, we combine the prediction of MDA and LMI, which are closely related because both are part of the aberrant ceRNA regulation in diseases. A number of experiments were conducted on the real datasets that we collected, and the predicted results were analyzed extensively. The experimental results demonstrate the feasibility and effectiveness of the proposed model for predicting MDA on a large scale.

The main contribution of our work is fourfold. Firstly, the proposed model is the first to consider the interaction between lncRNA and microRNA for large-scale prediction of MDA. LMI data are well suited to uncovering the associations between microRNAs and diseases owing to their biological meaning and data type. Secondly, we consider the incompleteness of the side information and use a multitask learning method to predict MDAs and LMIs simultaneously. Thirdly, the proposed model enables end-to-end prediction of MDA. Any type of graph data associated with microRNAs (e.g. microRNA–gene interaction and microRNA–protein interaction) can be flexibly and directly used as input to improve the prediction, which is important because the amount of microRNA data is increasing rapidly. Fourthly, unlike similarity-based models, the proposed model can automatically extract features from the raw data, providing a new type of data source for measuring microRNA functional similarity.

Key points
  • The proposed MVMTMDA model is the first of its type to consider the structural information of lncRNA–microRNA interaction network for predicting microRNA-disease association.

  • A novel deep learning-based method is proposed to handle the incompleteness of features using multitask learning, which previous works in this domain did not consider.

  • A kind of graph embedding feature for depicting the underlying microRNA function in both networks of microRNA-disease association and lncRNA–microRNA interaction can be learned through multiview learning.

  • This work presents a new solution for an end-to-end prediction of microRNA-disease association with any graph data feature as inputs.

Funding

The National Natural Science Foundation of China (Grant 61572506 to Z.-H.Y.).

Conflict of Interest

The authors declare that they have no conflicts of interest regarding this work.

Yu-An Huang is currently working toward the PhD degree in the Department of Computing at the Hong Kong Polytechnic University. His current research interests mainly focus on data mining algorithms and applications.

Keith C.C. Chan received the BMath (Hons) degree, MSc degree in Computer Science and Statistics and PhD degree in Systems Design Engineering from the University of Waterloo, Canada, in 1984, 1985 and 1989, respectively. He joined the Hong Kong Polytechnic University in 1994, where he is currently a professor in the Department of Computing.

Zhu-Hong You obtained his PhD degree in Control Science and Engineering from the University of Science & Technology of China (USTC). He is currently a Professor with the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Ürümqi, China.

Pengwei Hu received the PhD degree from the Department of Computing, The Hong Kong Polytechnic University, Hong Kong, in 2019. Currently, he is a research scientist at IBM Research China. His research interests include machine learning, data mining, data fusion and biomedical informatics.

Lei Wang received the PhD degree from China University of Mining and Technology. He is currently working as a Postdoc research fellow in the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Ürümqi, China.

Zhi-An Huang is currently working toward the PhD degree in the City University of Hong Kong. His current research interests mainly focus on brain signal analytics and medical data mining.

References

1. Yoon J-H, Abdelmohsen K, Gorospe M. Functional interactions among microRNAs and long noncoding RNAs. Semin Cell Dev Biol 2014;34:9–14.

2. Parikshak NN, Swarup V, Belgard TG, et al. Genome-wide changes in lncRNA, splicing, and regional gene expression patterns in autism. Nature 2016;540:423–7.

3. Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the Rosetta stone of a hidden RNA language? Cell 2011;146:353–8.

4. Huang Z, Shi J, Gao Y, et al. HMDD v3.0: a database for experimentally supported human microRNA–disease associations. Nucleic Acids Res 2018;47:D1013–7.

5. Zhang G, Pian C, Chen Z, et al. Identification of cancer-related miRNA–lncRNA biomarkers using a basic miRNA–lncRNA network. Plos One 2018;13:e0196681.

6. Liu J, Li H, Zheng B, et al. Competitive endogenous RNA (ceRNA) regulation network of lncRNA–miRNA–mRNA in colorectal carcinogenesis. Dig Dis Sci 2019;64:1868–77.

7. Yuan W, Li X, Liu L, et al. Comprehensive analysis of lncRNA-associated ceRNA network in colorectal cancer. Biochem Biophys Res Commun 2019;508:374–9.

8. Song J, Ye A, Jiang E, et al. Reconstruction and analysis of the aberrant lncRNA–miRNA–mRNA network based on competitive endogenous RNA in CESC. J Cell Biochem 2018;119:6665–73.

9. Wang G, Zheng X, Zheng Y, et al. Construction and analysis of the lncRNA–miRNA–mRNA network based on competitive endogenous RNA reveals functional genes in heart failure. Mol Med Rep 2019;19:994–1003.

10. Dweep H, Sticht C, Pandey P, et al. miRWalk–database: prediction of possible miRNA binding sites by ‘walking’ the genes of three genomes. J Biomed Inform 2011;44:839–47.

11.

Smoot
ME
,
Ono
K
,
Ruscheinski
J
, et al.
Cytoscape 2.8: new features for data integration and network visualization
.
Bioinformatics
2011
;
27
:
431
2
.

12.

Lu
M
,
Shi
B
,
Wang
J
, et al.
TAM: a method for enrichment and depletion analysis of a microRNA category in a list of microRNAs
.
BMC Bioinform
2010
;
11
:
419
.

13.

Pinzón
N
,
Li
B
,
Martinez
L
, et al.
microRNA target prediction programs predict many false positives
.
Genome Res
2017
;
27
:
234
45
.

14.

Chen
X
,
Xie
D
,
Zhao
Q
, et al.
MicroRNAs and complex diseases: from experimental results to computational models
.
Brief Bioinform
2017
;
20
:
515
39
.

15.

Zou
Q
,
Li
J
,
Song
L
, et al.
Similarity computation strategies in the microRNA-disease network: a survey
.
Brief Funct Genomics
2015
;
15
:
55
64
.

16.

Wang
D
,
Wang
J
,
Lu
M
, et al.
Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases
.
Bioinformatics
2010
;
26
:
1644
50
.

17.

Chen
X
,
Wang
L
,
Qu
J
, et al.
Predicting miRNA–disease association based on inductive matrix completion
.
Bioinformatics
2018
;
34
:
4256
65
.

18. Chen X, Yin J, Qu J, et al. MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction. PLoS Comput Biol 2018;14:e1006418.

19. Zeng XX, Liu L, Lu LY, et al. Prediction of potential disease-associated microRNAs using structural perturbation method. Bioinformatics 2018;34:2425–32.

20. Jiang L, Ding Y, Tang J, et al. MDA-SKF: similarity kernel fusion for accurately discovering miRNA-disease association. Front Genet 2018;9:618.

21. Zhao H, Kuang L, Wang L, et al. Prediction of microRNA-disease associations based on distance correlation set. BMC Bioinform 2018;19:141.

22. He X, Chen T, Kan M-Y, et al. Trirank: review-aware explainable recommendation by modeling aspects. In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. Melbourne: ACM, 2015, 1661–70.

23. Ning S, Yue M, Wang P, et al. LincSNP 2.0: an updated database for linking disease-associated SNPs to human long non-coding RNAs and their TFBSs. Nucleic Acids Res 2017;45:D74–8.

24. Li J-H, Liu S, Zhou H, et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein–RNA interaction networks from large-scale. Nucleic Acids Res 2014;42:D92–7.

25. Fu G, Wang J, Domeniconi C, et al. Matrix factorization-based data fusion for the prediction of lncRNA–disease associations. Bioinformatics 2018;34:1529–37.

26. Yang Y, Fu X, Qu W, et al. MiRGOFS: a GO-based functional similarity measure for miRNAs, with applications to the prediction of miRNA subcellular localization and miRNA-disease association. Bioinformatics 2018;34:3547–56.

27. Cheng L, Hu Y, Sun J, et al. DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function. Bioinformatics 2018;34:1953–6.

28. Huang Y-A, Chan KC, You Z-H. Constructing prediction models from expression profiles for large scale lncRNA–miRNA interaction profiling. Bioinformatics 2018;34:812–9.

29. Chen X, Huang Y-A, You Z-H, et al. A novel approach based on KATZ measure to predict associations of human microbiota with non-infectious diseases. Bioinformatics 2017;33:733–9.

30. Koren Y. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Las Vegas: ACM, 2008, 426–34.

31. Sarwar BM, Karypis G, Konstan JA, et al. Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web. Raleigh: ACM, 2001, 285–95.

32. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2019. CA Cancer J Clin 2019;69:7–34.

33. Chiang DJ, Pritchard MT, Nagy LE. Obesity, diabetes mellitus, and liver fibrosis. Am J Physiol Gastrointest Liver Physiol 2011;300:G697–702.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)