Bo Yang, Hailin Chen, Predicting circRNA-drug sensitivity associations by learning multimodal networks using graph auto-encoders and attention mechanism, Briefings in Bioinformatics, Volume 24, Issue 1, January 2023, bbac596, https://doi.org/10.1093/bib/bbac596
Abstract
Recent studies have shown that the expression of circRNAs affects the drug sensitivity of cells and thus significantly influences the efficacy of drugs. Traditional biomedical experiments to validate such relationships are time-consuming and costly. Therefore, developing effective computational methods to predict potential associations between circRNAs and drug sensitivity is an important and urgent task. In this study, we propose a novel method, called MNGACDA, to predict possible circRNA–drug sensitivity associations for further biomedical screening. First, MNGACDA uses multiple sources of information from circRNAs and drugs to construct multimodal networks. It then employs node-level attention graph auto-encoders to obtain low-dimensional embeddings for circRNAs and drugs from the multimodal networks. Finally, an inner product decoder predicts the association scores between circRNAs and drug sensitivity from the embedding representations of circRNAs and drugs. Extensive cross-validation experiments show that MNGACDA outperforms six other state-of-the-art methods. Furthermore, excellent performance in case studies demonstrates that MNGACDA is an effective tool for predicting circRNA–drug sensitivity associations in real situations. These results confirm the reliable prediction ability of MNGACDA in revealing circRNA–drug sensitivity associations.
Introduction
As a family of noncoding RNA molecules with covalently closed circular structures, circRNAs have recently been discovered to be transcribed in eukaryotic organisms [1]. With advances in high-throughput technologies, biological functions of circRNAs have continuously been identified [2]. For example, Hansen et al. [3] reported that the human circRNA ciRS-7 functions as a regulator by acting as a miRNA sponge. Other significant roles, such as modulating alternative splicing or transcription and interacting with RNA-binding proteins, have also been detected for circRNAs [4]. Meanwhile, as circRNAs play significant roles in physiological and pathological processes, their dysregulation is closely related to complex human diseases [5]. These validated biological functions suggest that circRNAs are a new category of potential clinical diagnostic markers.
More recently, studies have found that circRNAs significantly affect the drug sensitivity of cells. For example, Huang et al. [6] discovered that two circRNAs (hsa_circ_0004350 and hsa_circ_0092857) transcribed from EIF3a affect cisplatin resistance in lung cancer cells. Xia et al. [7] found that a circRNA named circTNPO3 contributes to paclitaxel (PTX) resistance in ovarian cancer cells by upregulating NEK2 expression through sponging miR-1299. These studies provide valuable resources for investigating drug modes of action and carry therapeutic implications for the biomedical research community. To systematically reveal the effects of circRNAs on drug sensitivity, Ruan et al. [8] applied four identification algorithms to characterize the expression landscape of circRNAs across ~1000 human cancer cell lines, and observed strong associations between circRNA expression and drug responses. It should be noted that our understanding of the associations between circRNAs and drug sensitivity is far from complete.
As traditional biomedical experiments are expensive and time-consuming, developing efficient and accurate computational methods to predict circRNA–drug sensitivity associations could greatly reduce cost and time. Deng et al. [9] proposed a deep learning-based computational framework, GATECDA, to predict associations between circRNAs and drug sensitivity, in which a graph attention auto-encoder (GATE) was applied to extract low-dimensional representations of circRNAs and drugs. Comprehensive experiments showed the effectiveness of GATECDA in inferring circRNA–drug sensitivity associations. As mentioned in their study, current computational efforts in this direction are limited; to the best of our knowledge, GATECDA is the first computational method for inferring associations between circRNAs and drug sensitivity. It should further be noted that known circRNA–drug sensitivity associations are incomplete, and many remain undetected. Therefore, more accurate computational methods are urgently needed for more reliable circRNA–drug sensitivity association predictions.
In this study, we propose a novel computational framework termed MNGACDA to predict circRNA–drug sensitivity associations. First, MNGACDA uses multiple sources of data from circRNAs and drugs to construct integrated circRNA–circRNA and drug–drug similarity networks, as well as a circRNA–drug sensitivity association network. It then embeds node-level attention layers into a deep graph neural network framework to adaptively capture the internal information between nodes in the multimodal networks through an attention mechanism, and uses a convolutional neural network (CNN) combiner module to fuse the embedded representations of the layers. Finally, the embedding representations of circRNAs and drugs are fed to an inner product decoder to predict circRNA–drug sensitivity associations. To evaluate the effectiveness of MNGACDA, we compare it with six state-of-the-art methods on a benchmark data set under 5-fold and 10-fold cross-validation (5-CV and 10-CV). Experimental results demonstrate that MNGACDA is superior to the existing methods. In addition, we perform an ablation study and compare the results of our method under different views. Finally, case studies illustrate the usefulness of MNGACDA for predicting circRNA–drug sensitivity associations in real situations.
Materials and methods
Data sets
We download the data sets from Ref. [9] for our study. In Ref. [9], Deng et al. collected circRNA–drug sensitivity associations from the circRic database [8], in which drug sensitivity data were obtained from the GDSC database [10]. Following Wilcoxon tests with a false discovery rate < 0.05, we extract the significant circRNA–drug sensitivity associations as a benchmark data set, which contains 4134 associations involving 271 circRNAs and 218 drugs. Based on these associations, we construct an association matrix $A \in \mathbb{R}^{271\times 218}$ to denote the relationships between circRNAs and drug sensitivity: $A_{ij}=1$ indicates that circRNA $i$ and the sensitivity of drug $j$ are associated, and $A_{ij}=0$ otherwise. In addition to the circRNA–drug sensitivity associations, we download the host gene sequences of circRNAs from the National Center for Biotechnology Information (NCBI) Gene database [11] and the drug structure data from NCBI's PubChem database [12] for similarity calculation.
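The construction of the binary association matrix $A$ can be sketched as follows; the pair list shown is hypothetical and stands in for the 4134 extracted associations.

```python
import numpy as np

def build_association_matrix(pairs, num_circrnas=271, num_drugs=218):
    """Build the binary circRNA-drug sensitivity association matrix A.

    `pairs` is a list of (circRNA_index, drug_index) tuples for the
    experimentally supported associations; A[i, j] = 1 iff circRNA i
    is associated with the sensitivity of drug j, and 0 otherwise.
    """
    A = np.zeros((num_circrnas, num_drugs), dtype=np.float32)
    for i, j in pairs:
        A[i, j] = 1.0
    return A

# Hypothetical example pairs, for illustration only.
A = build_association_matrix([(0, 3), (5, 10)])
```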
Similarity measures
Sequence similarity of host genes of circRNAs
Similar to Ref. [9], the sequence similarity between the host genes of circRNAs is taken as the similarity between circRNAs. It is calculated from the Levenshtein distance of the sequences through the ratio function of Python's Levenshtein package. We use a matrix $\mathrm{CSS} \in \mathbb{R}^{M\times M}$ to represent the circRNA sequence similarity, where $M$ is the number of circRNAs.
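A minimal pure-Python sketch of this similarity, assuming the Levenshtein package's standard definition of `ratio`: substitutions are weighted as an insertion plus a deletion (cost 2), and the distance $d$ is normalized as $(|a|+|b|-d)/(|a|+|b|)$.

```python
def levenshtein_ratio(a: str, b: str) -> float:
    """Normalized sequence similarity in [0, 1], mirroring the `ratio`
    function of Python's Levenshtein package (substitution cost 2)."""
    if not a and not b:
        return 1.0
    # One-row dynamic program over the edit-distance table.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            sub = prev[j - 1] + (0 if ca == cb else 2)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, sub))
        prev = curr
    d = prev[-1]
    total = len(a) + len(b)
    return (total - d) / total
```

Applying this function to every pair of host-gene sequences fills the $\mathrm{CSS}$ matrix.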
Structural similarity of drugs
Based on the structural data of drugs from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/), we use RDKit [13] and the Tanimoto coefficient to calculate the structural similarity of drugs. We first compute the topological fingerprint of each drug using RDKit, and then calculate the structural similarity between drugs with the Tanimoto method. The resulting structural similarity matrix of the drugs is denoted as $\mathrm{DSS} \in \mathbb{R}^{N\times N}$, where $N$ is the number of drugs.
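The Tanimoto coefficient itself reduces to a set overlap between the "on" bits of two fingerprints; RDKit's `DataStructs.TanimotoSimilarity` performs this computation directly on its fingerprint objects. A dependency-free sketch on bit-position sets:

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient between two molecular fingerprints
    represented as sets of 'on' bit positions: |A & B| / |A | B|."""
    if not fp_a and not fp_b:
        return 0.0
    inter = len(fp_a & fp_b)
    # |A | B| = |A| + |B| - |A & B|
    return inter / (len(fp_a) + len(fp_b) - inter)
```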
Gaussian interaction profile kernel similarity of circRNAs and drugs
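The GIP kernel, in the formulation commonly used for association prediction, computes a Gaussian similarity between interaction profiles (rows of the association matrix $A$), with the bandwidth normalized by the mean squared profile norm. A numpy sketch under that standard formulation (details of MNGACDA's exact parameterization are not reproduced here):

```python
import numpy as np

def gip_kernel(A: np.ndarray) -> np.ndarray:
    """Gaussian interaction profile (GIP) kernel similarity over the
    rows of the association matrix A (pass A.T for the drug side).

    K[i, j] = exp(-gamma * ||A_i - A_j||^2), with gamma set to the
    reciprocal of the mean squared norm of the interaction profiles.
    """
    sq_norms = (A ** 2).sum(axis=1)
    gamma = 1.0 / max(sq_norms.mean(), 1e-12)
    # Pairwise squared distances via ||x||^2 + ||y||^2 - 2 x.y
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * A @ A.T
    return np.exp(-gamma * np.clip(d2, 0.0, None))
```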
Similarity fusion
MNGACDA
Graph attention networks (GAT) [17] use attention mechanisms in their aggregation process to prioritize neighbors with more relevant information. Several studies in bioinformatics using GAT, such as HGATMDA [18] and MKGAT [19], have achieved impressive predictive performance. Inspired by these achievements, we propose a node-level attention graph auto-encoder model named MNGACDA for predicting potential circRNA–drug sensitivity associations. The proposed model MNGACDA, shown in Figure 1, consists of three main steps:
Figure 1. The flow chart of MNGACDA for predicting circRNA–drug sensitivity associations.
Step 1: using multiple sources of information to construct integrated similarity networks of circRNAs and drugs. In the integrated circRNA (or drug) similarity network, we link only the 25 most similar neighbors of each circRNA (or drug). In addition, we construct a circRNA–drug sensitivity association network based on the known relationships between circRNAs and drug sensitivities.
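The top-25 sparsification in Step 1 can be sketched as follows; whether the resulting graph is symmetrized afterwards is an assumption of this sketch.

```python
import numpy as np

def knn_sparsify(S: np.ndarray, k: int = 25) -> np.ndarray:
    """Keep, for each node, only the edges to its k most similar
    neighbors in the similarity matrix S (self excluded), then
    symmetrize so the resulting graph is undirected (an assumption)."""
    n = S.shape[0]
    S = S.copy()
    np.fill_diagonal(S, -np.inf)          # exclude self-loops from the top-k
    keep = np.zeros_like(S, dtype=bool)
    idx = np.argsort(-S, axis=1)[:, :k]   # indices of the k largest per row
    rows = np.repeat(np.arange(n), k)
    keep[rows, idx.ravel()] = True
    out = np.where(keep, S, 0.0)
    np.fill_diagonal(out, 0.0)
    return np.maximum(out, out.T)         # union of the two directions
```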
Step 2: learning and fusing multimodal embedding representations of circRNAs and drugs. Since graph convolutional networks [20, 21] and GAT [22, 23] are widely used for representation learning, we apply a node-level attention auto-encoder that fuses first-order neighborhood information from the integrated similarity networks and the circRNA–drug association network to learn the embedding representations of circRNAs and drugs.
Step 3: predicting novel circRNA–drug sensitivity associations. The final circRNA–drug sensitivity scoring matrix is calculated by using inner product operation on the embedding representations of circRNAs and drugs. The predicted circRNA–drug sensitivity associations are prioritized according to the final scores.
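Step 3 amounts to an inner product between the two embedding matrices, passed through a sigmoid so that scores fall in (0, 1); a minimal sketch:

```python
import numpy as np

def inner_product_decoder(Zc: np.ndarray, Zd: np.ndarray) -> np.ndarray:
    """Score every circRNA-drug pair from the learned embeddings:
    A_hat[i, j] = sigmoid(z_ci . z_dj).

    Zc: (M, F) circRNA embeddings; Zd: (N, F) drug embeddings.
    Returns the (M, N) predicted association score matrix.
    """
    logits = Zc @ Zd.T
    return 1.0 / (1.0 + np.exp(-logits))
```

Candidate associations are then ranked by sorting the entries of the returned score matrix in descending order.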
Node-level attention graph auto-encoder
Let $C_N$, $D_N$ and $A_N$ denote the integrated circRNA similarity network, the integrated drug similarity network and the circRNA–drug association network, and let $\mathrm{CS}$, $\mathrm{DS}$ and $A$ denote the matrices of $C_N$, $D_N$ and $A_N$, respectively. Since learning embedding representations on $C_N$, $D_N$ and $A_N$ is a similar process, we take $A_N$ as an example to introduce how MNGACDA learns circRNA embeddings.
Finally, we obtain the embedding of the $l$th layer, $H^{(l)}=\{h_1^{(l)}, h_2^{(l)}, \dots, h_{M+N}^{(l)}\}$, where $h_i^{(l)} \in \mathbb{R}^{F^{(l)}}$.
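The node-level attention layer follows the standard GAT formulation of Veličković et al. [17]: attention logits $e_{ij} = \mathrm{LeakyReLU}(a^\top [Wh_i \,\|\, Wh_j])$ are computed for each edge, softmax-normalized over each node's neighborhood, and used to aggregate neighbor features. A single-head numpy sketch (the multi-head, residual and CNN-combiner details of MNGACDA are omitted; the adjacency is assumed to include self-loops):

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def gat_layer(H, adj, W, a_src, a_dst):
    """One single-head graph attention layer (Velickovic et al.).

    H            : (n, f_in) input node embeddings
    adj          : (n, n) binary adjacency, self-loops included
    W            : (f_in, f_out) shared linear transform
    a_src, a_dst : (f_out,) halves of the attention vector a = [a_src || a_dst]
    Returns the new (n, f_out) embeddings and the (n, n) attention matrix.
    """
    Wh = H @ W                                                    # (n, f_out)
    # e[i, j] = LeakyReLU(a_src . Wh_i + a_dst . Wh_j), via broadcasting
    e = leaky_relu(Wh @ a_src[:, None] + (Wh @ a_dst[:, None]).T)
    e = np.where(adj > 0, e, -np.inf)            # mask non-neighbors
    e = e - e.max(axis=1, keepdims=True)         # numerical stability
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax over neighbors
    return alpha @ Wh, alpha
```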
Residual module
CNN combiner
Inner product decoder
Results
Evaluation metrics
In order to comprehensively evaluate the performance of MNGACDA, we conduct 5-CV and 10-CV experiments on the benchmark data set. In the 5-CV experiments, we first randomly sample the same number of negative circRNA–drug pairs as there are experimentally confirmed positive samples, and then divide the combined set into five equally sized subsets. Each subset is used in turn as the test set, with the remaining four subsets used for training. We repeat this procedure five times to obtain reliable results. The same evaluation steps are implemented in the 10-CV experiments.
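The sampling-and-splitting procedure above can be sketched as follows; the seed and the helper's interface are illustrative, not from the original implementation.

```python
import numpy as np

def cv_folds(A: np.ndarray, n_folds: int = 5, seed: int = 0):
    """Yield (train, test) splits of a balanced circRNA-drug pair set.

    Samples as many random negative (unassociated) pairs as there are
    known positives, then partitions the balanced set into `n_folds`
    subsets, each used once as the test set.
    """
    rng = np.random.default_rng(seed)
    pos = np.argwhere(A == 1)
    neg_all = np.argwhere(A == 0)
    neg = neg_all[rng.choice(len(neg_all), size=len(pos), replace=False)]
    pairs = np.vstack([pos, neg])
    labels = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
    order = rng.permutation(len(pairs))
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield (pairs[train], labels[train]), (pairs[test], labels[test])
```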
Performance comparison with other methods under 5-CV and 10-CV experiments
As stated in Ref. [9], current computational methods for circRNA–drug sensitivity association predictions are limited. In order to validate the performance of MNGACDA, we compare it with six state-of-the-art models, i.e. GATECDA [9], MINIMDA [25], LAGCN [26], MMGCN [27], GANLDA [28] and CRPGCN [29]. It should be noted that except GATECDA, the other five well-known methods have been used in other association prediction fields, such as miRNA–disease associations and drug–disease associations.
GATECDA [9]: a computational framework for predicting circRNA–drug sensitivity associations based on a graph attentional auto-encoder.
MINIMDA [25]: predicting miRNA–disease associations using GCN in a multimodal network by fusing higher-order neighborhood information of miRNAs and diseases.
LAGCN [26]: a method that integrates known drug–disease associations, drug–drug similarities and disease–disease similarities into a heterogeneous network, and applies graph convolution operations for drug–disease association prediction.
MMGCN [27]: predicting potential miRNA–disease associations using multi-view multichannel attentional graph convolutional networks.
GANLDA [28]: an end-to-end computational model based on GAT to predict associations between lncRNAs and diseases.
CRPGCN [29]: a novel algorithm that is based on GCN constructed with random walk with restart and principal component analysis to predict associations between circRNAs and diseases.
We conduct 5-CV and 10-CV experiments on the data set for prediction performance evaluation. All methods are compared under the same experimental settings with the optimal parameters recommended in their respective studies. For the 5-CV experiments, as shown in Figure 2A, the average AUC value of MNGACDA is 0.9139, which is higher than that of the other methods by 2.66% (GATECDA), 5.77% (MINIMDA), 6.34% (LAGCN), 3.73% (MMGCN), 6.22% (GANLDA) and 6.52% (CRPGCN). The AUPR results are shown in Figure 2B: the average AUPR value of MNGACDA is 0.9209, higher than that of the other methods by 2.94% (GATECDA), 6.75% (MINIMDA), 7.31% (LAGCN), 5.45% (MMGCN), 7.41% (GANLDA) and 5.25% (CRPGCN). In addition, the values of other performance metrics, including Accuracy, Precision, Recall, Specificity and F1-Score, are shown in Table 1, where MNGACDA obtains the best results of 0.8472 (F1-Score), 0.8424 (Accuracy), 0.8723 (Recall), 0.8155 (Specificity) and 0.8247 (Precision).

Table 1. Performance comparison of different methods under 5-CV

| Method | F1-Score | Accuracy | Recall | Specificity | Precision |
|---|---|---|---|---|---|
| MNGACDA | **0.8472** | **0.8424** | **0.8723** | **0.8155** | **0.8247** |
| GATECDA [9] | 0.8224 | 0.8186 | 0.8404 | 0.7966 | 0.8054 |
| MINIMDA [25] | 0.7988 | 0.7901 | 0.8331 | 0.7472 | 0.7684 |
| LAGCN [26] | 0.7900 | 0.7786 | 0.8338 | 0.7233 | 0.7516 |
| MMGCN [27] | 0.8190 | 0.8183 | 0.8231 | 0.8135 | 0.8156 |
| GANLDA [28] | 0.7936 | 0.7822 | 0.8384 | 0.7259 | 0.7542 |
| CRPGCN [29] | 0.7899 | 0.7937 | 0.7738 | 0.8135 | 0.8081 |
The bold result indicates the best one in each column.
For the 10-CV experiments, as shown in Figure 3A, MNGACDA achieves an average AUC value of 0.9182, which is higher than that of the other methods by 2.48% (GATECDA), 4.71% (MINIMDA), 6.15% (LAGCN), 3.70% (MMGCN), 5.59% (GANLDA) and 7.08% (CRPGCN). The AUPR results are shown in Figure 3B: the average AUPR value of MNGACDA is 0.9249, higher than that of the other methods by 2.27% (GATECDA), 5.24% (MINIMDA), 6.99% (LAGCN), 5.37% (MMGCN), 6.06% (GANLDA) and 7.48% (CRPGCN). The values of other performance metrics, including Accuracy, Precision, Recall, Specificity and F1-Score, are shown in Table 2, where MNGACDA obtains the best results of 0.8519 (F1-Score), 0.8498 (Accuracy), 0.8646 (Recall), 0.8350 (Specificity) and 0.8401 (Precision). These comprehensive results show that MNGACDA outperforms the six state-of-the-art methods.

Table 2. Performance comparison of different methods under 10-CV

| Method | F1-Score | Accuracy | Recall | Specificity | Precision |
|---|---|---|---|---|---|
| MNGACDA | **0.8519** | **0.8498** | **0.8646** | **0.8350** | **0.8401** |
| GATECDA [9] | 0.8286 | 0.8273 | 0.8367 | 0.8172 | 0.8227 |
| MINIMDA [25] | 0.8129 | 0.8031 | 0.8537 | 0.7526 | 0.7767 |
| LAGCN [26] | 0.7994 | 0.7861 | 0.8522 | 0.7199 | 0.7542 |
| MMGCN [27] | 0.8241 | 0.8182 | 0.8510 | 0.7855 | 0.7999 |
| GANLDA [28] | 0.8065 | 0.7956 | 0.8505 | 0.7407 | 0.7679 |
| CRPGCN [29] | 0.7993 | 0.7987 | 0.8007 | 0.7966 | 0.7998 |
Parameter sensitivity analysis
The following parameters exist in MNGACDA: (1) the number of heads T of the multiheaded attention mechanism, (2) the hidden layer embedding dimension F and (3) the number of layers n of the GAT. We implement experiments on the benchmark data set and evaluate the prediction performance under 5-CV for parametric analysis.
First, MNGACDA uses a multiheaded attention mechanism to strengthen its representation learning capability. As shown in Figure 4, the model achieves the best performance when the number of attention heads reaches 3. The results show that increasing the number of attention heads can improve the performance of the GAT model within a certain range.

Second, we analyze the effect of varying the hidden layer embedding dimension F. As shown in Figure 5, the best performance in both AUC and AUPR is achieved when F = 128.

Finally, we analyze the effect of the number of graph attention layers on the prediction performance of MNGACDA. As shown in Figure 6, the best AUC and AUPR values are obtained when n is 2. We also observe that the performance of the GAT encoder decreases as the number of graph attention layers increases. Performance remains good when n increases to 4, because the residual module alleviates over-smoothing, but it drops sharply when n increases further.

Based on the above evaluation, we set T to 3, F to 128 and n to 2 to obtain the best performance. In addition, MNGACDA uses Xavier initialization [30] for the model parameters and Adam [31] as the optimizer (learning rate = 0.001, weight decay = 0.005), and is trained for 1000 epochs.
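Xavier (Glorot) initialization draws weights from $U(-a, a)$ with $a = \sqrt{6/(\mathrm{fan\_in} + \mathrm{fan\_out})}$, which keeps activation variance roughly constant across layers. A numpy sketch of the uniform variant:

```python
import numpy as np

def xavier_uniform(fan_in: int, fan_out: int, rng=None) -> np.ndarray:
    """Xavier/Glorot uniform initialization for a (fan_in, fan_out)
    weight matrix: W ~ U(-a, a), a = sqrt(6 / (fan_in + fan_out))."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))
```

In PyTorch, the reported optimizer settings correspond to `torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-3)`.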
Ablation tests
Since MNGACDA utilizes information from both the similarity networks and the association networks to learn embeddings for circRNAs and drugs, we propose two additional embedding extraction models, called MNGACDA-sim and MNGACDA-asso, to evaluate the impact of node embeddings from the different modal networks on prediction performance. Specifically, MNGACDA-sim learns the embeddings of circRNAs and drugs only from their integrated similarity networks, whereas MNGACDA-asso learns them only from the circRNA–drug association network. We compare the prediction performance of MNGACDA, MNGACDA-asso and MNGACDA-sim under different embedding sizes (F = {32, 64, 128, 256}). Moreover, MNGACDA includes three key modules: the node-level attention mechanism, the residual module and the CNN combiner. We remove each component in turn and run 5-CV experiments on the benchmark data set to investigate the impact of each component on prediction ability. The following four models are tested and compared:
MNGACDA-RES model: preserves the backbone graph attention layer and CNN combiner module, and removes the residual structure from the original framework.
MNGACDA-CNN model: retains the GAT backbone and residual modules, and replaces the CNN combiner module with a simple concatenation operation.
MNGACDA-GAT model: preserves the residual and CNN combiner module, and replaces the GAT backbone with a common GCN backbone.
MNGACDA-LA model: retains the GAT backbone and residual modules, and replaces the CNN combiner module with the layer attention module of LAGCN [26], which computes an attention score for each layer embedding and takes the final embedding as their linearly weighted sum. This variant further tests whether the CNN combiner module helps learn complex nonlinear relationships in node features.

Figure 7. Effect of node embedding extracted from different networks on prediction.
Table 3. Ablation results of MNGACDA under 5-CV

| Method | AUC | AUPR | F1-Score | Accuracy | Recall | Specificity | Precision |
|---|---|---|---|---|---|---|---|
| MNGACDA | 0.9139 | 0.9209 | 0.8472 | 0.8424 | 0.8723 | 0.8155 | 0.8247 |
| MNGACDA-LA | 0.9016 | 0.9106 | 0.8374 | 0.8350 | 0.8495 | 0.8205 | 0.8269 |
| MNGACDA-CNN | 0.8864 | 0.8971 | 0.8356 | 0.8303 | 0.8626 | 0.7980 | 0.8107 |
| MNGACDA-GAT | 0.9055 | 0.9160 | 0.8399 | 0.8377 | 0.8508 | 0.8246 | 0.8307 |
| MNGACDA-RES | 0.9120 | 0.9193 | 0.8412 | 0.8384 | 0.8541 | 0.8227 | 0.8302 |
Table 4. The Top 20 predicted circRNAs associated with the drug Vorinostat

| Ranking | circRNA | Evidence | Ranking | circRNA | Evidence |
|---|---|---|---|---|---|
| 1 | CALD* | CTRP | 11 | FKBP10* | CTRP |
| 2 | ANP32B* | CTRP | 12 | ZFP36L1* | CTRP |
| 3 | JUP | Nonsignificant | 13 | ARID1A* | CTRP |
| 4 | NOP53* | CTRP | 14 | HSP90B1 | Nonsignificant |
| 5 | FN1* | CTRP | 15 | LGALS3BP* | CTRP |
| 6 | FLNA* | CTRP | 16 | PYGB* | CTRP |
| 7 | PLEC* | CTRP | 17 | CXCL1* | CTRP |
| 8 | ANKRD36C* | CTRP | 18 | PRRC2A* | CTRP |
| 9 | CTSB* | CTRP | 19 | GRN* | CTRP |
| 10 | ALDH3A2* | CTRP | 20 | ESRP2 | Nonsignificant |
Note: circRNAs marked with ‘*’ are verified.
As shown in Figure 7, the AUC and AUPR values of MNGACDA with different embedding sizes are consistently higher than those of MNGACDA-asso and MNGACDA-sim, which demonstrates the effectiveness of integrating the different modal networks to improve model performance. As shown in Table 3, MNGACDA-RES exceeds MNGACDA only slightly in Specificity and Precision and is worse on all other metrics, which indicates that the residual module helps train a more effective deep neural network. The performance of MNGACDA-CNN is also worse than that of MNGACDA, which indicates that the CNN combiner module can effectively learn the complex nonlinear relationships in the node features. MNGACDA-LA is only slightly better than MNGACDA in Specificity and Precision, because the CNN combiner module accounts for the connections between layer embeddings rather than computing an attention score for each layer independently. MNGACDA-GAT also performs worse than MNGACDA, which implies the effectiveness of the node-level attention mechanism: attention coefficients adjust the priority of messages sent by different neighboring nodes when aggregating and updating node embeddings, enabling GAT to achieve better prediction performance than a standard GCN.
Case studies
In this section, we first apply MNGACDA to all the known information used in this study to predict new associations. Because the known associations come from the GDSC database [10], we search the independent CTRP database [32] to validate the newly predicted associations. We select the Top 20 predictions with the highest scores for two drugs (Vorinostat and PAC-1) for validation.
Vorinostat, a synthetic hydroxamic acid derivative with antitumor activity, is approved for refractory or relapsed cutaneous T-cell lymphoma [33]. As shown in Table 4, 17 of the Top 20 circRNAs predicted to be associated with Vorinostat have been confirmed in CTRP. Meanwhile, PAC-1 is a potent procaspase-3 activator that induces apoptosis in cancer cells and non-cancer cells depending on procaspase-3 concentration [34]. We list the Top 20 predicted PAC-1-related circRNAs in Table 5, and find that 15 of them have been identified in CTRP.
Table 5. The Top 20 predicted circRNAs associated with the drug PAC-1

| Ranking | circRNA | Evidence | Ranking | circRNA | Evidence |
|---|---|---|---|---|---|
| 1 | POLR2A* | CTRP | 11 | FKBP10* | CTRP |
| 2 | VIM* | CTRP | 12 | EHBP1L1 | Nonsignificant |
| 3 | TCOF1 | Nonsignificant | 13 | EFEMP1* | CTRP |
| 4 | THBS1* | CTRP | 14 | COL1A1* | CTRP |
| 5 | ENO2 | Nonsignificant | 15 | AATF | Nonsignificant |
| 6 | MEF2D* | CTRP | 16 | CRIM1* | CTRP |
| 7 | FBLN1* | CTRP | 17 | CRIM1* | CTRP |
| 8 | NCL | Nonsignificant | 18 | ANKRD36* | CTRP |
| 9 | COL1A2* | CTRP | 19 | CTSB* | CTRP |
| 10 | ANP32B* | CTRP | 20 | COL6A2* | CTRP |
Note: circRNAs marked with ‘*’ are verified.
To further evaluate the performance of MNGACDA in predicting circRNAs associated with new drugs, we choose two drugs (Crizotinib and Bortezomib), each with only one known circRNA–drug association in the data set, for de novo testing. For each drug, we remove its only known association and treat it as a new drug, while all other known associations are used as training samples. We then prioritize the candidate circRNAs according to their final association scores.
Crizotinib is an orally available aminopyridine-based inhibitor of the receptor tyrosine kinase anaplastic lymphoma kinase and the c-Met/hepatocyte growth factor receptor with antineoplastic activity [35]. Bortezomib is a proteasome inhibitor and antineoplastic agent used in the treatment of refractory multiple myeloma and certain lymphomas [36]. As shown in Table 6, 8 of the Top 10 predicted circRNAs associated with Crizotinib, and 5 of the Top 10 associated with Bortezomib, have been identified in CTRP.
The Top 10 predicted circRNAs associated with the two new drugs Crizotinib and Bortezomib
Ranking | circRNA (Crizotinib) | Evidence | Ranking | circRNA (Bortezomib) | Evidence
---|---|---|---|---|---
1 | POLR2A* | CTRP | 1 | POLR2A | Nonsignificant
2 | VIM* | CTRP | 2 | THBS1* | CTRP
3 | THBS1* | CTRP | 3 | ANP32B | Nonsignificant
4 | ANP32B* | CTRP | 4 | ENO2 | Nonsignificant
5 | ENO2 | Nonsignificant | 5 | ANKRD36 | Nonsignificant
6 | ANKRD36* | CTRP | 6 | SPINT2* | CTRP
7 | SPINT2* | CTRP | 7 | FBLN1* | CTRP
8 | TCOF1 | Nonsignificant | 8 | TCOF1 | Nonsignificant
9 | FBLN1* | CTRP | 9 | EFEMP1* | CTRP
10 | EFEMP1* | CTRP | 10 | MEF2D* | CTRP
Note: circRNAs marked with ‘*’ are verified.
Conclusion
Recent studies have shown that circRNAs play critical roles in modulating drug sensitivity. Predicting circRNA–drug sensitivity associations can therefore facilitate drug discovery and contribute to the treatment of diseases. In this study, we propose a deep learning-based approach, MNGACDA, to discover potential circRNA–drug sensitivity associations. To validate the effectiveness of our model, we compare MNGACDA with six state-of-the-art methods on a benchmark data set under 5-fold and 10-fold cross-validation (5-CV and 10-CV), where MNGACDA achieves the best prediction results. Furthermore, most of the top-ranked predictions in the case studies are validated by an independent database, suggesting that MNGACDA is an effective tool for predicting new circRNA–drug sensitivity associations.
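The k-fold cross-validation protocol (5-CV/10-CV) mentioned above can be illustrated with a small self-contained sketch. The helper names are hypothetical, and a simple rank-based AUC stands in for the paper's exact evaluation pipeline:

```python
import numpy as np

def auc(labels, scores):
    """Rank-based AUC: probability a positive sample outranks a negative one."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def k_fold_cv(pairs, labels, score_fn, k=5, seed=0):
    """Evaluate any scoring model over k folds of circRNA-drug pairs.

    score_fn(train_pairs, train_labels, test_pairs) -> test scores.
    seed=None keeps the original sample order (no shuffling).
    """
    n = len(labels)
    idx = np.arange(n) if seed is None else np.random.default_rng(seed).permutation(n)
    aucs = []
    for fold in np.array_split(idx, k):
        test = np.zeros(n, dtype=bool)
        test[fold] = True
        scores = score_fn(pairs[~test], labels[~test], pairs[test])
        aucs.append(auc(labels[test], scores))
    return float(np.mean(aucs))
```

In the paper's setting, `score_fn` would wrap training MNGACDA on the fold's training associations and scoring the held-out pairs with the inner-product decoder.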
It should be noted that the number of circRNA–drug sensitivity associations validated by biological experiments is small, which might produce biased prediction results. Integrating more experimentally supported circRNA–drug sensitivity associations would make the predictions more reliable. Meanwhile, the similarity measures are a key factor determining model accuracy in this study; in future work, we plan to incorporate more sources of biomedical data to construct more comprehensive similarities. Finally, computational methods for investigating the relationships between circRNAs and drug sensitivity remain limited, and more efforts are needed in this research field.
We propose a computational method MNGACDA for circRNA–drug sensitivity association inference, in which embedded representations of circRNAs and drugs are learnt from multimodal networks.
MNGACDA adaptively captures the internal information between nodes in the multimodal networks through a node-level attention graph auto-encoder.
Experimental results on the benchmark data set show that MNGACDA outperforms other state-of-the-art methods. In addition, case studies suggest that it is an effective tool for predicting potential circRNA–drug sensitivity associations.
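As an illustration of the two components named in the key points, below is a minimal NumPy sketch of a single node-level attention layer (GAT-style, simplified) followed by an inner-product decoder. The weight matrix `W` and attention vector `a` are placeholder parameters, not the trained model, and the exact scoring function differs from the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_encode(X, A, W, a):
    """One node-level attention layer over a network.

    X: node features; A: binary adjacency with self-loops;
    W, a: learnable projection matrix and attention vector (fixed here).
    """
    H = X @ W
    n = H.shape[0]
    scores = np.full((n, n), -np.inf)           # -inf masks non-neighbours
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                scores[i, j] = np.tanh(np.concatenate([H[i], H[j]]) @ a)
    alpha = softmax(scores, axis=1)             # attention weights per node
    return alpha @ H                            # weighted neighbour aggregation

def inner_product_decode(Z_circ, Z_drug):
    """Sigmoid of inner products between circRNA and drug embeddings."""
    return 1.0 / (1.0 + np.exp(-(Z_circ @ Z_drug.T)))
```

In MNGACDA, embeddings learned this way from the multimodal networks are fed to the inner-product decoder to score every circRNA–drug pair.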
Authors’ contribution
H.C. conceived and designed this study. B.Y. implemented the experiments. B.Y. and H.C. analyzed the results. B.Y. and H.C. wrote the manuscript. Both authors read and approved the final manuscript.
Data availability
The data sets and source codes used in this study are freely available at https://github.com/youngbo9i/MNGACDA.
Acknowledgements
We would like to thank Zixuan Liu at School of Software, Xinjiang University for useful discussion.
Funding
National Natural Science Foundation of China (61862026).
Bo Yang is a graduate student at School of Software, East China Jiaotong University. His research interest is deep learning and bioinformatics.
Hailin Chen, PhD, is an associate professor at School of Software, East China Jiaotong University. His research interest includes data mining and bioinformatics.