Abstract

Motivation

Identifying proteins that interact with drugs plays an important role in the early stage of drug development and helps to reduce development cost and time. Recent methods for predicting drug–protein interactions mainly focus on exploiting various data about drugs and proteins. However, these methods fail to fully learn and integrate the attribute information of a pair of drug and protein nodes and their attribute distribution.

Results

We present a new prediction method, GVDTI, to encode multiple pairwise representations, including attention-enhanced topological representation, attribute representation and attribute distribution. First, a framework based on a graph convolutional autoencoder is constructed to learn an attention-enhanced topological embedding that integrates the topology structure of a drug–protein network for each drug and protein node. The topological embeddings of each drug and each protein are then combined and fused by multi-layer convolutional neural networks to obtain the pairwise topological representation, which reveals the hidden topological relationships between drug and protein nodes. The proposed attribute-wise attention mechanism learns and adjusts the importance of individual attributes in each topological embedding of drug and protein nodes. Second, a tri-layer heterogeneous network composed of drug, protein and disease nodes is created to associate the similarities, interactions and associations across the heterogeneous nodes. The attribute distribution of the drug–protein node pair is encoded by a variational autoencoder. The pairwise attribute representation is learned via a multi-layer convolutional neural network to deeply integrate the attributes of drug and protein nodes. Finally, the three pairwise representations are fused by convolutional and fully connected neural networks for drug–protein interaction prediction. The experimental results show that GVDTI outperformed seven other state-of-the-art methods. The improved recall rates indicate that GVDTI retrieved more actual drug–protein interactions among the top-ranked candidates than conventional methods. Case studies on five drugs further confirm GVDTI’s ability to discover potential candidate drug-related proteins.

Contact

[email protected]

Supplementary information

Supplementary data are available at Briefings in Bioinformatics online.

1 Introduction

Drugs usually perform their functions by interacting with a variety of molecular targets, among which proteins are a major class [1, 2]. In the early stage of drug development, the identification of drug–target interactions (DTIs) is particularly important [3–6]. However, the identification of DTIs is a time-consuming and costly process [7]. Therefore, various computational methods have been developed to infer possible DTIs, providing biologists with information regarding drug-related protein candidates and reducing the workload of wet-lab experiments [8–11].

Early computational methods for determining drug–protein interactions fall mainly into two categories. The first category comprises molecular-docking-based methods [12–14], which use the three-dimensional structure of the protein to predict DTIs. However, the three-dimensional structure of many proteins, such as the membrane protein GPCR [15, 16], is not available; this limits the performance of such approaches. The other category comprises ligand-based methods [17], which compare proteins with unknown ligands and those with known ligands. However, when the number of known binding ligands is small, these methods do not work well.

Over the years, computational methods based on machine learning have been proposed to predict drug–protein interactions. Ding et al. established a drug–protein interaction prediction model based on support vector machines, which mainly used the substructure fingerprint of drugs, the physical and chemical properties of the target organism, and the relationships between drugs and target proteins [18]. A support vector machine (SVM) framework based on bipartite local models (BLMs) was proposed by Bleakley and Yamanishi to predict drug–protein interactions [19]. DDR is established based on the random forest algorithm, which mainly utilizes the similarity information and interaction information of drugs and proteins for DTI prediction [20]. Xuan et al. proposed an approach, DTIGBDT, establishing a drug–protein interaction prediction model based on a gradient boosting decision tree (GBDT) [21].

However, most of these methods only use the similarity information and interaction information of drugs and proteins, and do not use other data sources. An information flow-based method is proposed to predict drug-related proteins [22]. DTINet, which mainly utilizes multifarious relationships among drugs, proteins and diseases to study the low-dimensional vector representation of nodes for predicting DTIs, has also been constructed [23]. There are complex correlations between various data regarding drugs and proteins. However, because most of the aforementioned approaches are shallow predictive models, it is difficult to learn such correlations using these methods.

Several recent prediction methods focus on building models based on deep learning to enhance the accuracy of the prediction of drug-related proteins. Sun et al. established a drug–protein interaction prediction model based on generative adversarial networks [24]. An ensemble learning method based on non-negative matrix factorization and GBDT was proposed to infer candidate proteins interacting with drugs [25]. A drug–protein interaction prediction model based on the Weisfeiler–Lehman neural network was also established; this model deeply fuses the similarity and interaction information of drugs and proteins [26]. The associations among drugs, proteins and diseases also represent essential ancillary information for predicting drug–protein interactions. However, these methods fail to take advantage of information regarding drug-related and protein-related diseases. Zhang et al. established a model for drug–protein interaction prediction based on a bidirectional gated recurrent unit, which deeply integrated multiple data related to drugs, proteins and diseases [27]. However, this method failed to take into account the attribute distribution of node pairs.

To tackle the limitations in existing conventional methods for drug–protein interaction prediction, we propose a new model, GVDTI, to learn and integrate three pairwise representations from multi-source data, including the attention-enhanced topological representations, attribute distributions and attribute representations. The contributions of our model include:

  • To extract attention-enhanced topological structure and node attributes, we propose a graph convolutional autoencoder (GCA) based framework and an attribute-level attention mechanism. GCA extracts and embeds the hidden topological structure from drug similarity, drug–disease and drug–protein sub-networks. Since the individual attributes in a node’s attribute vector contribute differently to the topological embedding, we propose a new attention mechanism at the node attribute level to adaptively learn and reflect the discriminative contributions of each sub-network’s node attributes.

  • To facilitate the extraction of attribute distribution and attribute representation of drug–protein node pairs, we first construct a drug–protein–disease heterogeneous network and an embedding strategy to associate the similarities, interactions and associations of pairs of nodes. The embedding strategy reflects the biological premise that a pair of drug and protein nodes is more likely to interact with each other if they share more common drugs, proteins or diseases.

  • We propose a novel convolutional variational autoencoder (CVAE) based approach to learn pairwise attribute distributions. The attribute distribution reveals the underlying drug–protein relationship in the established drug–protein–disease heterogeneous network by a convolutional variational encoding and decoding process to foster the prediction of drug-related proteins.

  • To extract drug–protein pairwise attribute representation, we design a new encoding strategy based on the multi-layer convolutional neural network (MCNN). The pairwise attribute representation integrates the similarities, interactions and correlations of a pair of drug and protein nodes. The ability of the proposed model and of the learnt attention-enhanced topological representations, attribute representations and attribute distributions to predict drug–protein interactions is demonstrated by comprehensive comparison with recently published models and by case studies of five drugs.

2 Materials and Methods

Figure 1 demonstrates the framework of our model for predicting drug–protein interactions. First, we constructed a module based on graph convolutional networks with attention to capture the topology structures of multiple subnets, and the module learned the topology representation of each drug node and that of each protein node. Second, a tri-layer heterogeneous network composed of drug nodes, protein nodes and disease nodes was constructed. The attribute distribution representation and the attribute representation of a drug–protein node pair were encoded separately. Finally, these three representations were deeply fused to get the interaction score of the node pair.

2.1 Dataset

In this paper, the protein–disease associations, the drug–disease associations, the drug–protein interactions, the drug similarities and the protein similarities were obtained from a previously published paper [23]; the information obtained included 708 drugs, 1512 proteins, 5603 diseases, 199 214 known drug–disease associations, 1 596 745 known protein–disease associations and 1923 known drug–protein interactions. The associations among proteins, drugs and diseases were originally obtained from the Comparative Toxicogenomics Database (CTD), which mainly provides information about the relationships among chemicals, genes and diseases [28].

2.2 Calculation and representation of multi-source data

Five types of matrices are defined to represent data regarding drugs, proteins and diseases, including similarity matrices of drugs and proteins, drug–disease association matrix, protein–disease association matrix and drug–protein interaction matrix.

2.2.1 Association and interaction matrices

As shown in Figure 2, we use matrix |$A^{drug} \in{\mathbb{R}^{n_{r} \times{n_{d}}}}$| to represent the associations between |$n_{r}$| drugs and |$n_{d}$| diseases, where if drug |${r_{i}}$| is observed to be associated with disease |${d_{j}}$|⁠, then |$A_{ij}^{drug}$| is 1 (otherwise, it is 0). The protein–disease association matrix is denoted by |$A^{protein} \in{\mathbb{R}^{n_{p} \times{n_{d}}}}$|⁠, and the matrix element |$A_{ij}^{protein}$| is 0 or 1. 1 indicates that the protein is related to disease, while 0 indicates the opposite. The matrix |$Y \in{\mathbb{R}^{n_{r} \times{n_{p}}}}$| represents the interactions between drugs and proteins. When |$Y_{ij}$| is 1, it means that the interaction between drug |${r_{i}}$| and protein |${p_{j}}$| is observed, otherwise, it is 0.

Figure 1

The framework of the proposed GVDTI model, taking |${r_{1}}$| and |${p_{2}}$| as examples. The topological embedding vector of each drug or protein node is learned by a graph convolutional autoencoder (a) and (b), and the multi-layer convolutional neural network is used to fuse the topology embedding of |${r_{1}}$|-|${p_{2}}$| (c). (d) A tri-layer heterogeneous network is constructed, and the proposed embedding strategy is used to form an attribute-embedding matrix of |${r_{1}}$|-|${p_{2}}$|. (e) Pairwise attribute distribution representation is extracted using a convolutional variational autoencoder. (f) Pairwise attribute representation is obtained by multi-layer convolutional coding. (g) Fusion of the three pairwise representations: attention-enhanced topological representation, attribute distribution and attribute representation.

2.2.2 Similarity matrices

Based on the chemical substructure information of drugs, the authors of a previous study calculated the intra drug similarity using Tanimoto coefficient [29]. In Figure 2, matrix |$S^{drug} \in{\mathbb{R}^{n_{r} \times{n_{r}}}}$| is used to represent the similarity matrix of drugs, and |$S_{ij}^{drug} \in [0,1]$| is the similarity value between drug |${r_{i}}$| and drug |${r_{j}}$|⁠. The larger the |$S_{ij}^{drug}$|⁠, the higher the similarity between drug |${r_{i}}$| and drug |${r_{j}}$|⁠. As shown in Figure 2, the protein similarity matrix |$S^{protein} \in{\mathbb{R}^{n_{p} \times{n_{p}}}}$|⁠, which was described in [30], was constructed on the basis of the Smith–Waterman score based on the primary sequences of the targets. |$S_{ij}^{protein}$| indicates the similarity value between protein |${p_{i}}$| and protein |${p_{j}}$|⁠.
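As a minimal illustration of the Tanimoto coefficient mentioned above, the sketch below computes it for two substructure fingerprints stored as sets of "on" bit indices. The fingerprints and the function name `tanimoto` are hypothetical; this is not the code used to build |$S^{drug}$|.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two substructure-bit sets."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


# Hypothetical fingerprints: indices of the substructure bits that are set.
drug_i = {1, 4, 7, 9}
drug_j = {1, 4, 8}
similarity = tanimoto(drug_i, drug_j)  # 2 shared bits out of 5 distinct bits
```

The value always lies in [0, 1], which matches the range stated for |$S_{ij}^{drug}$|.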

Figure 2

Similarity matrices, interaction matrices and association matrices derived from corresponding networks of drugs, proteins and diseases. The detailed process of the proposed embedding strategy. Considering drug |${r_{1}}$| and protein |${p_{2}}$| as examples, the attribute-embedding matrix of |${r_{1}}$| - |${p_{2}}$| is constructed.

2.3 Pairwise attention-enhanced topological representation learning

2.3.1 Attention mechanism at attribute level

The drug–disease association matrix |$A^{drug}$| and the drug–protein interaction matrix |$Y$| are concatenated side by side to form a matrix |$X^{drug}$|. Its |$i$|-th row records the associations of drug |${r_{i}}$| with all diseases and its interactions with all proteins. Such a row can be used as the attribute vector of |${r_{i}}$|, and each of its attributes contributes differently to the low-dimensional topological embedding vector of the drug. Therefore, we established an attribute attention mechanism to learn which attributes of the drug and protein nodes are the most informative for their low-dimensional topological embedding vectors, as shown in Figure 3. Each attribute |$X_{ij}^{drug}$| of the drug node |${r_{i}}$| is assigned a different weight |${\alpha _{ij}^{r}}$|,
(1)
(2)
where |$W^{r}$| is a weight matrix, and |$b^{r}$| is a bias vector. |$X_{i}^{drug}\in{\mathbb{R}^{{1} \times{(n_{d}+n_{p})}}}$| is the attribute vector of the drug node |$r_{i}$|⁠, |$H^{r}\in{\mathbb{R}^{{(n_{d}+n_{p})} \times{n_{f}}}}$| is used to capture contextual relationships among different drugs, and |$n_{f}$| is the number of low-dimensional features for a drug node. |$s_{i}^{r} = \big \{s_{i1}^{r},s_{i2}^{r},...,s_{ik}^{r},...,s_{i(n_{d}+n_{p})}^{r}\big \}$| is the vector that records the attention score and |$s_{ij}^{r}$| is the score of the |$j$|-th attribute |$X_{ij}^{drug}$| of the drug |$r_{i}$|⁠. |$\alpha _{i}^{r} = \big \{\alpha _{i1}^{r},\alpha _{i2}^{r},...,\alpha _{ik}^{r},...,\alpha _{i(n_{d}+n_{p})}^{r}\big \}$| is the result of normalizing the attention scores of all attributes of |${r_{i}}$| and |$\alpha _{ij}^{r}$| is the attention weight of the attribute node |$X_{ij}^{drug}$|⁠. Therefore, the enhancement vector of the drug node |${r_{i}}$| can be expressed as |$y_{i}^{r}$|⁠,
(3)
where |$\circ $| is an element-wise product operator. |$y_{i}^{r}$| is the enhanced attribute vector of the drug node |$r_{i}$|, and its transposition is the |$i$|-th row of the enhanced attribute matrix |$\widetilde{X}^{drug}$|. Similarly, |$X^{protein}$| was obtained by concatenating the protein–disease association matrix |$A^{protein}$| and the drug–protein interaction matrix |$Y$| side by side, and |$\widetilde{X}^{protein}$| was obtained by the attention enhancement of all attributes of each protein |${p_{i}}$|. Finally, we obtained the enhanced drug attribute matrix |$\widetilde{X}^{drug}$| and the enhanced protein attribute matrix |$\widetilde{X}^{protein}$|.
Figure 3

Extraction and fusion of the topological embedding of |${r_{1}}$| - |${p_{2}}$|⁠.
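The attribute-level attention of Section 2.3.1 can be sketched in NumPy as follows. The scoring form `s = H @ tanh(W @ x + b)` and the softmax normalization are assumptions chosen to be consistent with the stated shapes of |$W^{r}$|, |$b^{r}$| and |$H^{r}$|; the text does not confirm the exact form of Equations (1) and (2). All sizes and values are toy placeholders.

```python
import numpy as np


def attribute_attention(x_i, W, b, H):
    """Return (alpha, y): attention weights and the enhanced attribute vector.

    Assumed scoring form (not confirmed by the text): s = H @ tanh(W @ x + b).
    """
    s = H @ np.tanh(W @ x_i + b)   # one attention score per attribute
    e = np.exp(s - s.max())        # numerically stable softmax
    alpha = e / e.sum()            # normalized attention weights
    return alpha, alpha * x_i      # element-wise product with the attributes


rng = np.random.default_rng(0)
n_attr, n_f = 6, 3                           # toy sizes: n_d + n_p = 6
x_i = np.array([1., 0., 1., 1., 0., 0.])     # toy row of X^drug
W = rng.normal(size=(n_f, n_attr))           # stands in for W^r
b = np.zeros(n_f)                            # stands in for b^r
H = rng.normal(size=(n_attr, n_f))           # stands in for H^r
alpha, y = attribute_attention(x_i, W, b, H)
```

Because the weights multiply a binary attribute vector, attributes that are 0 in |$X_{i}^{drug}$| stay 0 in the enhanced vector, while observed associations and interactions are rescaled by their attention weights.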

2.3.2 Pairwise attention-enhanced topology extraction by graph convolutional autoencoder

|$\widetilde{X}^{drug}$| is the enhanced drug attribute matrix, and |$S^{drug}$| is the drug similarity matrix showing the similarity between the drugs. |$\hat{S}^{drug}=D^{\frac{-1}{2}}{S}^{drug} D^{\frac{-1}{2}}$|⁠, where |$D$| is a diagonal matrix and |$D_{ii}=\sum \nolimits _{j} {S}_{ij}^{drug}$|⁠. |$\hat{S}^{drug}$| is multiplied by |$\widetilde{X}^{drug}$| to fuse the properties of the drug node with the topological structures within the drugs. The result of matrix multiplication is the input of the module based on the graph convolution autoencoder [31–33]. Then, following multiplication with the weight matrix |$W_{e}^{1}$|⁠, the drug node is mapped to a potential low-dimensional space, and the low-dimensional topology representation matrix of the drug, |$Z_{e}^{r}$| is obtained. Similarly, we performed graph convolution on |$Z_{e}^{r}$| again to obtain the low-dimensional topology representation matrix |$Z^{r}$| of the drug,
(4)
(5)
We decoded the matrix |$Z^{r}$| back to the original feature space and obtained |$\hat{X}^{drug}$|⁠,
(6)
where |$W_{d}^{1}$| and |$W_{d}^{2}$| are weight matrices. The gap between the original matrix |$X^{drug}$| and the reconstructed matrix |$\hat{X}^{drug}$| should be minimized, so the mean squared error is used as the loss function for this module [34].
(7)
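A rough NumPy sketch of the graph convolutional autoencoder described in this subsection is given below. The symmetric normalization follows the definition of |$\hat{S}^{drug}$| in the text; the ReLU activation, the graph-convolutional form of the decoder and the linear output layer are assumptions, since the exact forms of Equations (4) to (7) are not recoverable here. All matrices are random toy data.

```python
import numpy as np


def normalize_similarity(S):
    """Symmetric normalization D^{-1/2} S D^{-1/2} with D_ii = sum_j S_ij."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def relu(x):
    return np.maximum(x, 0.0)


def gcn_autoencoder(S, X_tilde, We1, We2, Wd1, Wd2):
    """Two graph-convolution encoding layers followed by two decoding layers."""
    S_hat = normalize_similarity(S)
    Z_e = relu(S_hat @ X_tilde @ We1)   # first encoding layer
    Z = relu(S_hat @ Z_e @ We2)         # low-dimensional topology embedding
    X_d = relu(S_hat @ Z @ Wd1)         # first decoding layer (assumed form)
    X_hat = S_hat @ X_d @ Wd2           # reconstruction in attribute space
    return Z, X_hat


rng = np.random.default_rng(0)
n, n_attr = 4, 5                                   # toy: 4 drugs, 5 attributes
S = rng.uniform(0.1, 1.0, size=(n, n))
S = (S + S.T) / 2                                  # symmetric toy similarities
np.fill_diagonal(S, 1.0)
X = rng.integers(0, 2, size=(n, n_attr)).astype(float)  # toy attribute matrix
We1, We2 = rng.normal(size=(n_attr, 3)), rng.normal(size=(3, 2))
Wd1, Wd2 = rng.normal(size=(2, 3)), rng.normal(size=(3, n_attr))
Z, X_hat = gcn_autoencoder(S, X, We1, We2, Wd1, Wd2)
mse = float(np.mean((X - X_hat) ** 2))             # mean squared error loss
```

In training, the weight matrices would be updated to reduce `mse`; here they are left at their random initial values, so the sketch only shows the forward pass and the loss computation.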

Similarly, we fuse the protein attribute matrix |$\widetilde{X}^{protein}$| and the similarity matrix |$S^{protein}$| and extract the low-dimensional topological representation of the proteins to obtain matrix |$Z^{p}$|.

Let |$z_{1}^{r}$| be the first row of |$Z^{r}$| and |$z_{2}^{p}$| be the second row of |$Z^{p}$|. Taking drug |$r_{1}$| and protein |$p_{2}$| as an example, we stacked the topological embedding vector |$z_{1}^{r}$| of the drug node on top of the topological embedding vector |$z_{2}^{p}$| of the protein node to obtain |$x=\begin{bmatrix} z_{1}^{r} \\ z_{2}^{p} \end{bmatrix}$|. |$x$| was passed through two convolution-pooling layers to fuse the topological embedding vectors of |${r_{1}}$|-|${p_{2}}$| and learn the pairwise topology representation |$u_{topology}$|.

2.4 Construction of attribute-embedding matrix

The biological premise of our embedding strategy is that if a pair of drug and protein nodes have interactions, associations or similarities with more of the same drugs, proteins or diseases, the said pairs of nodes are more likely to interact. Based on this biological premise, we established pairwise attribute-embedding matrices, for example, the attribute-embedding matrix of |${r_{1}}$|-|${p_{2}}$|⁠.

Figure 2 shows the interaction matrix |$Y$|⁠, the association matrices |$A^{protein}$| and |$A^{drug}$|⁠, and the similarity matrices |$S^{protein}$| and |$S^{drug}$|⁠. Firstly, when |${r_{1}}$| and |${p_{2}}$| have interactions or similarities with more identical proteins, |${r_{1}}$| is more likely to interact with |${p_{2}}$|⁠. The first row of |$Y$|⁠, |$Y_{1,*}$|⁠, and the second row of matrix |$S^{protein}$|⁠, |$S_{2,*}^{protein}$|⁠, record the interactions of |${r_{1}}$| and |${p_{2}}$|⁠, respectively, with all proteins; thus, we spliced them up and down to obtain matrix |${f_{1}}$|⁠,
|$f_{1}=\begin{bmatrix} Y_{1,*} \\ S_{2,*}^{protein} \end{bmatrix}$| (8)
Secondly, when |${r_{1}}$| and |${p_{2}}$| have similarities and interactions with more of the same drugs, |${r_{1}}$| and |${p_{2}}$| are more likely to interact; thus, the first row of matrix |$S^{drug}$|⁠, |$S_{1,*}^{drug}$|⁠, and the transpose of the second column of matrix |$Y$|⁠, |$Y_{*,2}^{T}$|⁠, were spliced to form a matrix |${f_{2}}$|⁠,
|$f_{2}=\begin{bmatrix} S_{1,*}^{drug} \\ Y_{*,2}^{T} \end{bmatrix}$| (9)
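where |$Y_{*,2}^{T}$| and |$S_{1,*}^{drug}$| record the connections of |${p_{2}}$| and |${r_{1}}$|, respectively, to all drugs. Similarly, when |${r_{1}}$| and |${p_{2}}$| are associated with more common diseases, |${r_{1}}$| is more likely to interact with |${p_{2}}$|; thus, the first row of |$A^{drug}$|, |$A_{1,*}^{drug}$|, and the second row of |$A^{protein}$|, |$A_{2,*}^{protein}$|, were spliced to form a matrix |${f_{3}}$| that records their associations with all diseases,
|$f_{3}=\begin{bmatrix} A_{1,*}^{drug} \\ A_{2,*}^{protein} \end{bmatrix}$| (10)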
Finally, we obtained the attribute-embedding matrix |$F \in{\mathbb R^{ 2 \times (n_{p}+n_{r}+n_{d})}}$| of |${r_{1}}$| and |${p_{2}}$| by concatenating |${f_{1}}$|⁠, |${f_{2}}$| and |${f_{3}}$| from end to end,
|$F=\begin{bmatrix} f_{1} & f_{2} & f_{3} \end{bmatrix}$| (11)
|$n_p$|⁠, |$n_r$| and |$n_d$| represent the numbers of proteins, drugs and diseases, respectively.
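The embedding strategy of this section can be sketched in NumPy as follows. The function name `attribute_embedding` and the toy matrices are hypothetical; indices are zero-based, so |${r_{1}}$| and |${p_{2}}$| correspond to `i = 0` and `j = 1`.

```python
import numpy as np


def attribute_embedding(i, j, Y, S_drug, S_protein, A_drug, A_protein):
    """Build the 2 x (n_p + n_r + n_d) attribute-embedding matrix of drug i, protein j."""
    f1 = np.vstack([Y[i, :], S_protein[j, :]])        # links to all proteins
    f2 = np.vstack([S_drug[i, :], Y[:, j]])           # links to all drugs
    f3 = np.vstack([A_drug[i, :], A_protein[j, :]])   # links to all diseases
    return np.hstack([f1, f2, f3])


rng = np.random.default_rng(0)
n_r, n_p, n_d = 3, 4, 2                               # toy sizes
Y = rng.integers(0, 2, size=(n_r, n_p)).astype(float)
S_drug = rng.uniform(size=(n_r, n_r))
S_protein = rng.uniform(size=(n_p, n_p))
A_drug = rng.integers(0, 2, size=(n_r, n_d)).astype(float)
A_protein = rng.integers(0, 2, size=(n_p, n_d)).astype(float)

F = attribute_embedding(0, 1, Y, S_drug, S_protein, A_drug, A_protein)
```

The first row of `F` collects everything known about the drug and the second row everything known about the protein, so columns in which both rows are non-zero correspond to shared proteins, drugs or diseases.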

2.5 Pairwise attribute distribution learning by CVAE

A CVAE is a deep generative model [35]. Unlike traditional autoencoders [36], which describe the low-dimensional feature representation of |${r_{1}}$|-|${p_{2}}$| numerically, a CVAE describes this in a probabilistic way, and finally yields the pairwise attribute distribution representation |$m$|⁠.

Variational encoder. The coding part of the CVAE takes the embedding matrix |$F$| of |${r_{1}}$| and |${p_{2}}$| as the input, learning the pairwise attribute distribution representation |$m$|⁠. The inference network consists of two hidden layers, each of which contains a convolution layer and a pooling layer. The specific parameter settings are shown in Figure 4. In Figure 4, k refers to the size of the filter, s is the strides of convolution and pooling operations, p is the zero-padding operation. The output of each hidden layer in the encoding process is,
(12)
where |${L_{en}}$| is the number of layers of the inference network and |$X_{en}^{0} = F$|⁠. |$W_{en}^{l}$| and |$b_{en}^{l}$| are the weight matrix and bias vector of the l-th layer, respectively. * represents the convolution operation, |$\eta $| is the nonlinear activation function |$LeakyRelu$|⁠, and |$max$| represents the maximum pooling. We flattened the output |$X_{en}^{L_{en}}$| of the last layer of convolution into a vector |$x_{en}$|⁠, and then performed two fully connected mappings on |$x_{en}$|⁠, as described previously [37], to obtain two parameters: |$\mu $| and |$\sigma $|⁠,
(13)
where |$W_{\mu }$| and |$W_{\sigma }$| are the weights matrices of the linear layer derived from |$\mu $| and |$\sigma $|⁠, respectively, |$b_{\mu }$| and |$b_{\sigma }$| are the corresponding bias vectors and |$\eta $| is the nonlinear activation function |$LeakyRelu$|⁠. Following the reparameterization scheme in [38, 39], we considered |$\mu $| and |$\sigma $| the mean and variance, respectively, and constructed the attribute distribution representation of |${r_{1}}$|-|${p_{2}}$|⁠, |$m$|⁠, as follows,
(14)
where |$\circ $| is the element-wise product operator and |$\xi $| is the noise sampled from the Gaussian distribution N(0,I), with zero mean and identity covariance matrix I. If |$q \big ( m|F\big )$| is considered the posterior distribution, the attribute distribution representation |$m$| is obtained by sampling from |$q \big ( m|F\big )$|.
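The reparameterization step can be illustrated with a short NumPy sketch. It assumes the common form in which the noise is scaled element-wise and shifted by the mean, with `sigma` used as a per-dimension scale; whether the text's |$\sigma $| denotes a variance or a standard deviation is left open here, and the numbers are toy values.

```python
import numpy as np


def reparameterize(mu, sigma, rng):
    """Draw m = mu + sigma * xi with xi ~ N(0, I) (element-wise product)."""
    xi = rng.standard_normal(mu.shape)   # noise independent of the parameters
    return mu + sigma * xi


rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])          # toy mean vector
sigma = np.array([0.1, 0.2, 0.3])        # toy scale vector (assumed std. dev.)
m = reparameterize(mu, sigma, rng)       # one sample of the representation m
```

Writing the sample as a deterministic function of `mu`, `sigma` and external noise is what allows gradients to flow back to the encoder parameters during training.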
Variational decoder. The goal of the decoder is to restore the latent variable variational probability distribution |$q \big ( m|F\big )$| to the approximate probability distribution |$p \big ( F|m\big )$| of the original data. |$p \big ( F|m\big )$| represents the probability that a given implicit variable |$m$| that obeys the standard normal distribution can generate |$F$|⁠. The formal definition of |$p \big ( F|m\big )$| is
(15)
|$f_{de}\big ( m\big )$| indicates that the attribute distribution representation |$m$| is first linearly mapped, and then reshaped into a feature map to perform a deconvolution operation. The specific details regarding this process are shown in Figure 4. The result of the linear mapping is as follows,
(16)
where |$W_{lin}$| and |$b_{lin}$| are the weight matrix and bias vector to be learnt by the linear layer, respectively. We reshaped |$x_{de}$| into a feature map form, and then used it as the input of deconvolution to obtain |$X_{de}^{l}$|⁠,
(17)
where |$X_{de}^{0} = x_{de}$| and |$L_{de}$| is the number of layers of the generative network. |$W_{de}^{l}$| is the weight matrix of layer l, |$b_{de}^{l}$| is the corresponding bias vector, |$\star $| represents the deconvolution operation and |$X_{de}^{l}$| is the feature map produced by layer l.
Loss calculation based on the CVAE. We optimized the representation of the attribute distribution based on the following loss,
(18)
where |$p \big (m\big )$| is the prior distribution, which makes the target distribution of |$m$| a Gaussian distribution. |$KL$| is the KL divergence, which is used to measure the distance between the posterior distribution |$q \big ( m|F\big )$| and the prior distribution |$p \big (m\big )$|⁠. We used the Adam algorithm to optimize |$loss_{v}$| [40]. After the training is completed, the pairwise attribute distribution representation |$m$| can be obtained, which is defined as |$u_{distribution}$|⁠.
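For a diagonal Gaussian posterior and a standard normal prior, the KL term has a well-known closed form, sketched below. The squared-error reconstruction term is a stand-in assumption for the negative log-likelihood of |$F$| under the decoder, since the exact form of Equation (18) is not shown; `sigma` is again treated as a standard deviation.

```python
import numpy as np


def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form."""
    var = sigma ** 2
    return 0.5 * float(np.sum(mu ** 2 + var - 1.0 - np.log(var)))


def cvae_loss(F, F_hat, mu, sigma):
    """Reconstruction error plus KL regularizer (assumed form of the loss)."""
    recon = float(np.sum((F - F_hat) ** 2))   # stand-in for -log p(F|m)
    return recon + kl_to_standard_normal(mu, sigma)


F_toy = np.array([[1., 0., 1.], [0., 1., 1.]])   # toy embedding matrix
F_rec = np.array([[0.9, 0.1, 0.8], [0.2, 0.7, 1.0]])  # toy reconstruction
loss_v = cvae_loss(F_toy, F_rec, mu=np.array([0.3, -0.2]), sigma=np.array([0.9, 1.1]))
```

The KL term is zero only when the posterior equals the prior, so it pulls the learned distribution of |$m$| toward a standard Gaussian while the reconstruction term keeps |$m$| informative about |$F$|.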
Figure 4

The process for the extraction of the pairwise attribute distribution and attribute representation of |${r_{1}}$|-|${p_{2}}$|.

2.6 Pairwise attribute representation learning by multi-layer convolutional neural network

The embedded matrix |$F$| of |${r_{1}}$| and |${p_{2}}$| is inputted into MCNN to learn the pairwise attribute representations of |${r_{1}}$| and |${p_{2}}$| to assist the entire model in predicting drug–protein interactions. The convolution module contains two convolution layers and two max-pooling layers, as shown in Figure 4. In order to learn the edge information of |$F$| during the convolution process, we have performed zero-padding operations on the input of each convolution layer. The output feature map of each hidden layer is
(19)
where |$c^{0} = F$|⁠, |$b_{cn}^{l}$| is the bias vector of the l-th layer and |$W_{cn}^{l}$| is the corresponding weight matrix. |$c^{l}$| is feature map output by the lth hidden layer. We flattened the feature map |$c^{2}$| output by the last hidden layer into a vector |$c^{^{\prime}}$|⁠, which is the pairwise attribute representation, and defined it as |$u_{attribute}$|⁠.
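A single-channel NumPy sketch of one convolution and max-pooling layer with zero padding is shown below, applied twice as in the two-layer module. The 3 x 3 filter, the pooling along the column axis only and the LeakyReLU slope are assumptions; the actual filter sizes, strides and channel counts are those given in Figure 4.

```python
import numpy as np


def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)


def conv2d_same(x, kernel, bias=0.0):
    """Single-channel 2-D cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.empty_like(x, dtype=float)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel) + bias
    return out


def max_pool_width(x, size=2):
    """Non-overlapping max pooling along the column axis."""
    w = (x.shape[1] // size) * size
    return x[:, :w].reshape(x.shape[0], -1, size).max(axis=2)


def conv_pool_layer(x, kernel, bias=0.0):
    return max_pool_width(leaky_relu(conv2d_same(x, kernel, bias)))


F = np.ones((2, 8))                 # toy stand-in for the embedding matrix
kernel = np.ones((3, 3))            # hypothetical 3 x 3 filter
c1 = conv_pool_layer(F, kernel)     # first hidden layer, shape (2, 4)
c2 = conv_pool_layer(c1, kernel)    # second hidden layer, shape (2, 2)
u_attribute = c2.ravel()            # flattened pairwise attribute representation
```

Zero padding keeps the output of each convolution the same size as its input, so the border columns of |$F$| contribute to the feature map as the text intends.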

2.7 Integration of the multiple pairwise representations

The attention-enhanced topological representation, attribute distribution and attribute representation of drug |$r_{1}$| and protein |$p_{2}$| are |$u_{topology}$|, |$u_{distribution}$| and |$u_{attribute}$|, respectively. We concatenated these three representations end to end to obtain the vector |$p$|,
(20)
To obtain the interaction probability of |$r_{1}$|-|$p_{2}$|, |$p$| passes through two convolution-pooling layers to obtain a feature map, which is flattened into a vector |$p^{^{\prime}}$|. |$p^{^{\prime}}$| then passes through a fully connected layer and a |$softmax$| layer to obtain the probability distribution |$o$| over the two classes [41],
(21)
|$o = [o_{1}, o_{2}]$| and |$o_{1}$| and |$o_{2}$| represent the probability that |$r_{1}$| and |$p_{2}$| are determined to have an interaction relationship and the probability that there is no interaction relationship, respectively. We used the cross-entropy loss function to optimize the above process,
(22)
where |$y$| is the real label. The loss function (22) is optimized using the Adam algorithm [40]. The module is trained by the backpropagation (BP) algorithm [42].
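The final classification step can be illustrated with a small NumPy sketch of the softmax output and the cross-entropy loss for one drug–protein pair. The logits are hypothetical outputs of the fully connected layer, and the two-class cross-entropy form is the standard one, assumed here because Equation (22) is not shown.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))       # subtract the maximum for stability
    return e / e.sum()


def cross_entropy(o, y):
    """o = [o_1, o_2]; y = 1 for an interacting pair, 0 otherwise."""
    target = np.array([y, 1 - y], dtype=float)   # o_1 is the interaction class
    return -float(np.sum(target * np.log(o)))


logits = np.array([2.0, 0.0])       # hypothetical fully connected layer output
o = softmax(logits)                 # o[0]: interaction, o[1]: no interaction
loss_pos = cross_entropy(o, 1)      # loss if the pair truly interacts
loss_neg = cross_entropy(o, 0)      # loss if the pair does not interact
```

With these logits the model favors an interaction, so the loss is small when the true label is 1 and larger when it is 0.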
Figure 5

ROC curves and PR curves of all the methods in comparison of all the 708 drugs.

3 Experimental evaluations and discussions

3.1 Evaluation metrics

In this article, we treated the known drug–protein interactions as positive samples and the unobserved drug–protein pairs as negative samples. In our dataset, there are 1923 positive samples and |$708 \times 1512 - 1923 = 1\,068\,573$| negative samples, so there is a serious class imbalance between the positive and negative samples. Therefore, we randomly selected as many negative samples as there are positive samples and combined them with the positive samples to form set A. Set B contains the remaining |$1\,068\,573-1923=1\,066\,650$| negative samples.

We utilized 5-fold cross-validation to evaluate the performance of GVDTI and several other state-of-the-art prediction methods. The same training data and test data were used for all methods. In the cross-validation, we randomly divided the samples in set A into five equal subsets; in each round, four subsets were used for training and the remaining subset was combined with set B to form the test set.
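The sample counts and the fold construction can be checked with a short sketch. The helper `five_fold_splits` and the integer sample identifiers are hypothetical; the sketch only mirrors the protocol described above and is not the authors' evaluation code.

```python
import random

n_drugs, n_proteins, n_pos = 708, 1512, 1923
n_neg = n_drugs * n_proteins - n_pos   # all unobserved drug-protein pairs
n_set_b = n_neg - n_pos                # negatives kept for testing only


def five_fold_splits(set_a, seed=0):
    """Yield (train, held_out) lists for 5-fold cross-validation over set A."""
    items = list(set_a)
    random.Random(seed).shuffle(items)
    folds = [items[k::5] for k in range(5)]
    for k in range(5):
        train = [s for f, fold in enumerate(folds) if f != k for s in fold]
        yield train, folds[k]


set_a = range(2 * n_pos)               # stand-in ids: positives plus sampled negatives
splits = list(five_fold_splits(set_a))
```

In each round the held-out fold would be merged with set B to form the test set. Since 3846 is not divisible by 5, the folds in this sketch differ in size by at most one sample.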

Given a threshold |$\omega $|, a drug–protein node pair whose interaction score is greater than |$\omega $| is predicted to be a positive sample; otherwise, it is predicted to be a negative sample. A pair with a known interaction and a score greater than |$\omega $| is therefore a successfully identified positive sample. We calculated the true positive rates (TPRs) and false positive rates (FPRs) by changing |$\omega $|, and plotted the receiver operating characteristic (ROC) curve [43]. The TPR and FPR are defined as follows,
|$TPR=\frac{TP}{TP+FN}, \quad FPR=\frac{FP}{FP+TN}$| (23)
where TP and TN are the numbers of positive and negative samples that were correctly identified, respectively, FN is the number of positive samples incorrectly identified as negative and FP is the number of negative samples incorrectly identified as positive.
AUC is the area under the ROC curve [44]; it is utilized to evaluate the predictive performance of the model. Nevertheless, the number of negative samples is much larger than that of the positive samples. In this case, the area under the precision-recall (PR) curve (AUPR) is more informative with regard to evaluating the overall performance of the prediction method [45]. Precision and Recall are defined as
|$Precision=\frac{TP}{TP+FP}, \quad Recall=\frac{TP}{TP+FN}$| (24)
Precision is the percentage of correctly identified positive samples relative to those that are judged to be positive, and Recall is the same as TPR. Since biologists often choose the top candidate proteins and then further verify their interaction with the drugs, the recall rate of the top k is calculated.
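The metrics above can be computed for a toy ranking as in the sketch below. The function names, scores and labels are illustrative only; in practice the threshold would be swept over all score values to trace the ROC and PR curves.

```python
def rates(scores, labels, omega):
    """Return (TPR, FPR, precision) at threshold omega."""
    tp = sum(s > omega and y == 1 for s, y in zip(scores, labels))
    fp = sum(s > omega and y == 0 for s, y in zip(scores, labels))
    fn = sum(s <= omega and y == 1 for s, y in zip(scores, labels))
    tn = sum(s <= omega and y == 0 for s, y in zip(scores, labels))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return tpr, fpr, precision


def recall_at_k(scores, labels, k):
    """Fraction of all positives that appear among the k highest-scoring candidates."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    total = sum(labels)
    return sum(y for _, y in ranked[:k]) / total if total else 0.0


scores = [0.9, 0.8, 0.4, 0.3, 0.1]   # toy interaction scores for one drug
labels = [1, 0, 1, 0, 0]             # 1 = known interaction
tpr, fpr, precision = rates(scores, labels, omega=0.5)
top2 = recall_at_k(scores, labels, k=2)
top3 = recall_at_k(scores, labels, k=3)
```

At the threshold 0.5 one of the two positives is recovered, and the second positive only enters the candidate list when k is raised from 2 to 3, which is the behavior the top-k recall is meant to capture.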
Table 1

The statistical results of the paired Wilcoxon test on the AUCs and AUPRs over all the 708 drugs, comparing GVDTI with each of the seven other methods.

| | DTIP | GANDTI | NGDTP | DTINet | GRMF | DDR | Lee’s method |
|---|---|---|---|---|---|---|---|
| p-value of AUC | 1.2215e-153 | 0.2123e-112 | 2.2055e-133 | 5.0918e-62 | 3.5449e-75 | 2.5239e-92 | 5.1732e-89 |
| p-value of AUPR | 5.1432e-294 | 7.6154e-134 | 6.6362e-261 | 8.5746e-224 | 2.9768e-249 | 1.5273e-104 | 1.0503e-114 |
Table 2

The top 10 candidate proteins of five drugs

| Drug name | Rank | Targets | Evidence | Rank | Targets | Evidence |
|---|---|---|---|---|---|---|
| Quetiapine | 1 | HTR6 | DrugBank/STITCH | 6 | DRD5 | DrugBank |
| | 2 | CHRM2 | DrugBank | 7 | ADRA2C | DrugBank |
| | 3 | CHRM4 | DrugBank | 8 | ADRA1B | DrugBank |
| | 4 | HTR2C | DrugBank/STITCH | 9 | CHRM5 | DrugBank |
| | 5 | ADRA1D | DrugBank | 10 | DRD2 | DrugBank/STITCH |
| Clozapine | 1 | HTR2C | DrugBank/STITCH | 6 | NR1I2 | Unconfirmed |
| | 2 | HTR7 | DrugBank/STITCH | 7 | CHRM1 | DrugBank/STITCH |
| | 3 | HTR1D | DrugBank | 8 | ADRA1A | DrugBank |
| | 4 | HTR6 | DrugBank/STITCH | 9 | CHRM5 | DrugBank |
| | 5 | HTR1B | DrugBank | 10 | HRH4 | DrugBank |
| Verapamil | 1 | KCNJ11 | DrugBank | 6 | CACNA1A | DrugBank |
| | 2 | ADRA2B | Unconfirmed | 7 | CACNA1B | DrugBank |
| | 3 | KCNH2 | DrugBank/STITCH | 8 | CACNB4 | Literature [49] |
| | 4 | CACNB2 | Literature [49] | 9 | CACNA1C | DrugBank/STITCH |
| | 5 | CACNA1S | Literature [49] | 10 | CACNA1F | Literature [49] |
| Amitriptyline | 1 | CHRM2 | DrugBank | 6 | SLC6A2 | DrugBank |
| | 2 | NTRK1 | DrugBank | 7 | HTR2A | DrugBank/STITCH |
| | 3 | KCNQ2 | DrugBank | 8 | KCND2 | Literature [50] |
| | 4 | KCNA1 | DrugBank | 9 | OPRD1 | DrugBank |
| | 5 | ADRA1A | DrugBank/STITCH | 10 | KCND3 | Literature [50] |
| Ziprasidone | 1 | HTR2C | DrugBank/STITCH | 6 | HTR3A | DrugBank |
| | 2 | DRD2 | DrugBank | 7 | CHRM3 | DrugBank |
| | 3 | DRD5 | DrugBank | 8 | ADRA2C | Literature [51] |
| | 4 | ADRA2A | DrugBank | 9 | HRH2 | Unconfirmed |
| | 5 | HTR1D | DrugBank/STITCH | 10 | HTR6 | DrugBank |

3.2 Comparison with other methods

The performance of the proposed GVDTI method for drug–protein interaction prediction was compared with that of seven advanced methods: DTIP [27], GANDTI [24], NGDTP [25], DTINet [23], GRMF [46], DDR [20] and Lee’s method [47]. As shown in Figure 5(A), GVDTI achieved the highest average AUC over all 708 tested drugs (AUC = 0.983), which is 0.2% higher than that of the second-best model, DTIP, 2.8% higher than that of NGDTP, 4.8% higher than that of GANDTI, 6.2% higher than that of DTINet, 8.9% higher than that of GRMF, 10.4% higher than that of DDR and 17.1% higher than that of the worst-performing method, Lee’s method. GVDTI also showed the best performance in terms of AUPR, achieving an AUPR of 0.435, which exceeds that of DTIP, NGDTP, GRMF, DTINet, GANDTI, Lee’s method and DDR by 3.6%, 7.4%, 11.6%, 26%, 36.1%, 37.6% and 38.5%, respectively.

DTIP showed the second-best performance. Based on a bidirectional GRU, this method learns the multi-scale neighbor topology of drugs and proteins and deeply explores the potential relationships between them. NGDTP also performed well. Based on non-negative matrix factorization, this method fuses multiple types of connection data between drugs and proteins to learn their topological representations. These findings indicate that it is necessary to integrate various information about drugs and proteins to obtain the topological representation of nodes.

Figure 6

The average recalls over all the drugs at different top |$k$| values.

DTINet performed well with regard to the AUC (AUC = 0.921), but it did not achieve a good AUPR (AUPR = 0.175). In contrast, GRMF achieved a good AUPR (AUPR = 0.319), but its AUC was slightly lower (AUC = 0.894). NGDTP, DTINet and GRMF are all shallow prediction models based on matrix factorization, so they cannot deeply learn the complex associations among the various kinds of information about drugs and proteins. GANDTI builds its drug–protein interaction prediction model on a generative adversarial network, but it does not utilize drug–disease associations or protein–disease associations. DDR and Lee’s method performed even worse, because the former does not use network topology information and the latter ignores the attribute information of the nodes. Our method not only performs an in-depth fusion of the various kinds of information related to drugs and proteins, but also makes full use of the attribute information of the nodes.

In addition, each prediction method yields 708 AUCs and 708 AUPRs, one for each of the 708 drugs. We performed a paired Wilcoxon test on the 708 paired AUCs (or AUPRs) of GVDTI and each of the other seven methods to assess whether the values of GVDTI were significantly greater. Table 1 shows that GVDTI was significantly better than the other methods with regard to both the AUCs and the AUPRs (⁠|$p$|-value < 0.05).
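As a rough illustration of such a test, the following sketch implements a one-sided paired Wilcoxon signed-rank test using the large-sample normal approximation. The text does not state which variant or software was used, so the function name `wilcoxon_greater`, the use of the normal approximation without continuity correction and the toy inputs are all assumptions for this example. In practice one would more likely call `scipy.stats.wilcoxon` with `alternative="greater"`.

```python
import math
from typing import Sequence, Tuple


def wilcoxon_greater(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """One-sided paired Wilcoxon signed-rank test (H1: x tends to exceed y).

    Returns (W+, p-value), where the p-value comes from the normal
    approximation with a tie correction and no continuity correction.
    """
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group_size = j - i + 1
        tie_term += group_size ** 3 - group_size
        i = j + 1

    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return w_plus, p_value


# Toy paired values (hypothetical, not the per-drug AUCs of the paper).
method_a = [10, 12, 14, 16, 18, 20, 22, 24]
method_b = [9, 10, 11, 12, 13, 14, 15, 16]
w_stat, p_val = wilcoxon_greater(method_a, method_b)
```

With 708 paired values per comparison, the normal approximation is a reasonable choice. For small samples an exact test would be preferable.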

The higher the recall rate among the top k protein candidates, the more real drug-related proteins are correctly identified. Under different k values, GVDTI always performed better than the other methods (Figure 6), retrieving 89.7% of the positive samples in the top 30, 91.8% in the top 60 and 94.9% in the top 120. The recall rates of DTIP ranked second, with 85.3%, 89.4% and 93.5% of the positive samples in the top 30, 60 and 120, respectively. NGDTP identified 85.2%, 87.1% and 89.8% of the positive samples in the top 30, 60 and 120, respectively, and GANDTI identified 38.2%, 62.9% and 86.1%, respectively. DTINet identified 74.8%, 81.5% and 85.4% of the positive samples in the top 30, 60 and 120, respectively, exceeding GRMF in the top 60 and 120 but not in the top 30 (GRMF: 77.5%, 79.5% and 82.6%, respectively). In contrast, Lee’s method was inferior to the other methods, identifying 23%, 33.4% and 51.9%, respectively.
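The top k recall used here can be sketched as follows. This is a minimal illustration under assumptions: the function names, the choice to skip drugs without known interactions, and the toy drug and protein identifiers are hypothetical and not taken from the paper's code.

```python
from typing import Dict, List, Set


def recall_at_k(ranked_proteins: List[str], known_targets: Set[str], k: int) -> float:
    """Fraction of a drug's known targets that appear among its top-k candidates."""
    if not known_targets:
        raise ValueError("recall is undefined for a drug without known targets")
    hits = sum(1 for protein in ranked_proteins[:k] if protein in known_targets)
    return hits / len(known_targets)


def average_recall_at_k(rankings: Dict[str, List[str]],
                        targets: Dict[str, Set[str]], k: int) -> float:
    """Average the top-k recall over all drugs that have at least one known target."""
    values = [
        recall_at_k(ranked, targets[drug], k)
        for drug, ranked in rankings.items()
        if targets.get(drug)
    ]
    return sum(values) / len(values)


# Toy example (hypothetical identifiers): candidates sorted by descending score.
toy_rankings = {
    "drugA": ["P1", "P2", "P3", "P4"],
    "drugB": ["P4", "P3", "P2", "P1"],
}
toy_targets = {"drugA": {"P1", "P3"}, "drugB": {"P1"}}
recall_top2 = average_recall_at_k(toy_rankings, toy_targets, k=2)
recall_top4 = average_recall_at_k(toy_rankings, toy_targets, k=4)
```

In this toy case, drugA recovers one of its two targets in the top 2 and drugB recovers none, so the average top 2 recall is 0.25. Both drugs recover all targets in the top 4.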

3.3 Case studies on five drugs

To further demonstrate the ability of GVDTI to discover potential drug–protein interactions, we conducted case studies on five drugs (quetiapine, verapamil, amitriptyline, clozapine and ziprasidone), each of which has at least 14 known drug–protein interactions. We collected and analyzed the top 10 candidate proteins for each drug (Table 2). In addition, we conducted case studies on five drugs (imipramine, triazolam, desipramine, clonazepam and diazepam), each of which has fewer than 14 known drug–protein interactions. We collected their top five candidates and listed them in supplementary Table ST1.

The DrugBank database contains not only detailed drug data, such as chemical and pharmacological data, but also comprehensive drug–protein data, such as information regarding their sequence, structure and pathway of action [48]. STITCH (Search Tool for Interacting Chemicals), a database built on Compartments (cellular localizations), eggNOG (gene orthology) and STRING (protein–protein networks), contains detailed protein-related data, drug-related data and information regarding drug–protein interactions. As shown in Table 2, 40 candidate proteins were recorded by DrugBank and 13 candidate proteins were identified by STITCH, which indicates that these candidate proteins do interact with the corresponding drugs.

Four candidate proteins of verapamil, two candidate proteins of amitriptyline and one candidate protein of ziprasidone are labeled as ‘literature’. They have been confirmed by several published articles; this validates their mutual interaction. Of the 50 candidate proteins, three were marked as ‘unconfirmed’.
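The counts quoted above can be checked directly against the evidence column of Table 2. The short sketch below does this tally. The one-letter codes are an encoding chosen for this example (S = DrugBank/STITCH, D = DrugBank only, L = literature, U = unconfirmed), with one character per rank from 1 to 10 for each drug.

```python
from collections import Counter

# Evidence labels from Table 2, ranks 1 to 10 per drug, using the ad hoc codes
# S = DrugBank/STITCH, D = DrugBank only, L = Literature, U = Unconfirmed.
EVIDENCE_CODES = {
    "Quetiapine": "SDDSDDDDDS",
    "Clozapine": "SSDSDUSDDD",
    "Verapamil": "DUSLLDDLSL",
    "Amitriptyline": "DDDDSDSLDL",
    "Ziprasidone": "SDDDSDDLUD",
}

counts = Counter("".join(EVIDENCE_CODES.values()))

drugbank_total = counts["D"] + counts["S"]  # every STITCH entry is also in DrugBank
stitch_total = counts["S"]
literature_total = counts["L"]
unconfirmed_total = counts["U"]
literature_per_drug = {drug: codes.count("L") for drug, codes in EVIDENCE_CODES.items()}
```

The tally gives 40 DrugBank entries, 13 STITCH entries, 7 literature-supported candidates (4 for verapamil, 2 for amitriptyline and 1 for ziprasidone) and 3 unconfirmed candidates, which accounts for all 50 candidates.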

The top five candidates for the five drugs that each have fewer than 14 known drug–protein interactions are listed in supplementary Table ST1. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database that contains detailed drug data, protein data and drug–protein interactions. The Comparative Toxicogenomics Database (CTD) is also a public database that includes drug–protein interactions. Five candidate proteins were recorded by KEGG, and four candidate proteins were included in CTD. DrugBank and STITCH covered 16 and 10 candidates, respectively. This indicates that these candidates indeed interact with their corresponding drugs. Of these 25 candidates, only two were marked as ‘unconfirmed’, which means that there is no evidence to confirm their interactions. The above analysis indicates that GVDTI is effective at discovering potential drug–protein interactions.

3.4 Prediction of novel proteins related to drugs

After training the prediction model using all known drug–protein interactions, we used it to predict the top 10 ranked protein candidates for each drug and provide them in supplementary Table ST2 (https://github.com/pingxuan-hlju/GVDTI). This may help biologists identify actual drug-related proteins through wet laboratory experiments.

4 Conclusions

We propose a novel prediction method, GVDTI, which extracts and integrates the topological structure of multiple sub-networks of drugs and proteins, as well as the attribute distribution and attribute representation of drug–protein node pairs, to predict drug-related candidate proteins. GVDTI captures the various intra-relationships between drugs and proteins, i.e. drug similarities and protein similarities. Simultaneously, it captures the inter-relationships among drugs, proteins and diseases, i.e. drug–protein interactions, drug–disease associations and protein–disease associations. The developed graph convolutional autoencoder-based framework learns the pairwise topological representation, attribute distribution and attribute representation. The node attribute attention mechanism distinguishes the contributions of different attributes of a drug or protein node from its topological embedding vector. The tri-layer heterogeneous network is conducive to the formulation of the pairwise attribute embedding and further promotes the learning of the pairwise attribute distribution and attribute representation. The experimental results demonstrated that GVDTI improved both the prediction of drug–protein interaction candidates and the identification of top-ranked candidate proteins. Our model can be used as a tool to screen potential candidate proteins and then discover the real drug–protein interaction relationships through wet laboratory experiments.

Key Points
  • A newly proposed attention-enhanced pairwise topological representation to embed the topology structure of drug and protein nodes and reveal the underlying topological relationship of drug–protein sub-networks. The attribute-level attention mechanism distinguishes the different contributions of various attributes of each drug or protein node from its topological embedding vector.

  • A heterogeneous network to facilitate the association of similarities, interactions and associations across drug, protein and disease, which assists the modeling of further pairwise attribute distribution and attribute representation.

  • The novel drug–protein pairwise attribute distribution modeled by convolutional variational autoencoder reveals the deep underlying relationship among drug, protein and disease data sources.

  • The biological premise driven pairwise attribute representation infers the drug–protein interactions through their common drugs, proteins and diseases. The improved performance for drug–protein interaction prediction was demonstrated by comparing with seven state-of-the-art prediction methods. The improved recall rate and five drug case studies further prove the ability of the proposed model.

Funding

The work was supported by the Natural Science Foundation of China (61972135, 62172143); Natural Science Foundation of Heilongjiang Province (LH2019F049 and LH2019A029); China Postdoctoral Science Foundation (2019M650069, 2020M670939); Heilongjiang Postdoctoral Scientific Research Starting Foundation (BHLQ18104); Fundamental Research Foundation of Universities in Heilongjiang Province for Technology Innovation (KJCX201805); Innovation Talents Project of Harbin Science and Technology Bureau (2017RAQXJ094); Fundamental Research Foundation of Universities in Heilongjiang Province for Youth Innovation Team (RCYJTD201805).

Ping Xuan, PhD (Harbin Institute of Technology), is a professor at the School of Computer Science and Technology, Heilongjiang University, Harbin, China. Her current research interests include computational biology, complex network analysis and medical image analysis.

Mengsi Fan is studying for her master’s degree in the School of Computer Science and Technology at Heilongjiang University, Harbin, China. Her research interests include complex network analysis and deep learning.

Hui Cui, PhD (The University of Sydney), is a lecturer at Department of Computer Science and Information Technology, La Trobe University, Melbourne, Australia. Her research interests lie in data-driven and computerized models for biomedical and health informatics.

Tiangang Zhang, PhD (The University of Tokyo), is an associate professor of the School of Mathematical Science, Heilongjiang University, Harbin, China. His current research interests include complex network analysis and computational fluid dynamics.

Toshiya Nakaguchi, PhD (Sophia University), is a professor at the Center for Frontier Medical Engineering, Chiba University, Chiba, Japan. His current research interests include complex network analysis, medical image processing and biometrics measurement.

References

1. Huang K, Xiao C, Glass LM, et al. MolTrans: Molecular Interaction Transformer for drug-target interaction prediction. Bioinformatics 2021;37(6):830–6.

2. Sun C, Cao Y, Wei J-M, et al. Autoencoder-based drug-target interaction prediction by preserving the consistency of chemical properties and functions of drugs. Bioinformatics 2021;btab384.

3. Chu Y, Kaushik AC, Wang X, et al. DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features. Brief Bioinform 2021;22(1):451–62.

4. Verma N, Qu X, Trozzi F, et al. SSnet: A Deep Learning Approach for Protein-Ligand Interaction Prediction. Int J Mol Sci 2021;22(3):1392.

5. Chen Z-H, You Z-H, Guo Z-H, et al. Prediction of Drug-Target Interactions From Multi-Molecular Network Based on Deep Walk Embedding Model. Front Bioeng Biotechnol 2020;8:338.

6. Ding Y, Tang J, Guo F. Identification of drug-target interactions via multiple information integration. Inform Sci 2017;418-419:546–60.

7. Bagherian M, Sabeti E, Wang K, et al. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2021;22(1):247–69.

8. Lee I, Keum J, Nam H. DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput Biol 2019;15(6):e1007129.

9. Chen X, Yan CC, Zhang X, et al. Drug-target interaction prediction: databases, web servers and computational models. Brief Bioinform 2016;17(4):696–712.

10. Ding Y, Tang J, Fei G. Identification of Protein-Ligand Binding Sites by Sequence Information and Ensemble Classifier. J Chem Inf Model 2017;57(12):3149–61.

11. Whitebread S, Hamon J, Bojanic D, et al. Keynote review: In vitro safety pharmacology profiling: an essential tool for successful drug development. Drug Discov Today 2005;10(21):1421–33.

12. Morris G, Huey R. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 2009;48:443–53.

13. Shoichet BK, McGovern SL, Wei B, et al. Lead discovery using molecular docking. Curr Opin Chem Biol 2002;6(4):439–46.

14. Donald BR. Algorithms in Structural Molecular Biology. The MIT Press 2011;1:1–429.

15. Ballesteros JA, Palczewski K. G protein-coupled receptor drug discovery: Implications from the crystal structure of rhodopsin. Current Opinion in Drug Discovery and Development 2001;4(5):561–74.

16. Zheng X, Wu LY, Zhou X, et al. Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces. BMC Syst Biol 2010;4(Suppl 2):S6.

17. Keiser MJ, Roth BL, Armbruster BN, et al. Relating Protein Pharmacology by Ligand Chemistry. Nat Biotechnol 2007;25(2):197–206.

18. Ding Y, Tang J, Guo F. Identification of drug-target interactions via multiple information integration. Information Sciences 2017;546–60.

19. Bleakley K, Yamanishi Y. Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics 2009;25(18):2397–403.

20. Olayan RS, Ashoor H, Bajic VB. DDR: Efficient computational method to predict drug-target interactions using graph mining and machine learning approaches. Bioinformatics 2018;34(7):1164–73.

21. Xuan P, Sun C, Zhang T, et al. Gradient Boosting Decision Tree-Based Method for Predicting Interactions Between Target Genes and Drugs. Front Genet 2019;10:459.

22. Wang W, Yang S, Zhang X, et al. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics 2014;30(20):2923–30.

23. Luo Y, Zhao X, Zhou J, et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 2017;8(1):573.

24. Sun C, Xuan P, Zhang T, et al. Graph convolutional autoencoder and generative adversarial network-based method for predicting drug-target interactions. IEEE/ACM Trans Comput Biol Bioinform 2020;1:1–11.

25. Xuan P, Chen B, Zhang T, et al. Prediction of drug-target interactions based on network representation learning and ensemble learning. IEEE/ACM Trans Comput Biol Bioinform 2020;4:1–12.

26. Manoochehri HE, Kadiyala SS, Nourani M. Predicting Drug-Target Interactions Using Weisfeiler-Lehman Neural Network. IEEE EMBS International Conference on Biomedical and Health Informatics (BHI) IEEE 2019;88(11):1–4.

27. Xuan P, Zhang Y, Cui H, Zhang T, Guo M, Nakaguchi T. Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction. Brief Bioinform 2021;bbab119:1–10.

28. Davis AP, et al. The Comparative Toxicogenomics Database: update 2013. Nucleic Acids Res 2013;41(D1):D1104–14.

29. Iorio F, Bosotti R, Scacheri E, et al. Discovery of drug mode of action and drug repositioning from transcriptional responses. Proc Natl Acad Sci U S A 2010;107(8):14621–6.

30. Wang W, Yang S, Zhang X, et al. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics 2014;20:2923–30.

31. Kipf TN, Welling M. Variational graph auto-encoders. Conference and Workshop on Neural Information Processing Systems NIPS 2016;1050:1–3.

32. Schlichtkrull M, Kipf TN, Bloem P, et al. Modeling Relational Data with Graph Convolutional Networks. European semantic web conference 2018;1:593–607.

33. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. International Conference on Learning Representations 2016;1609:1–14.

34. Ma T, Cao X, Zhou J, et al. Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders. International Joint Conference on Artificial Intelligence IJCAI 2018;1804:1–7.

35. Chen Y, Rijke MD. A Collective Variational Autoencoder for Top-N Recommendation with Side Information. Association for Computing Machinery 2018;1807:3–9.

36. Vincent P, Larochelle H, Lajoie I, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. Journal of Machine Learning Research 2010;11(12):3371–408.

37. Gligorijevic V, et al. deepNF: deep network fusion for protein function prediction. Bioinformatics 2018;34:3873–81.

38. Zeng X, Zhu S, Liu X, et al. deepDR: a network-based deep learning approach to in silico drug repositioning. Bioinformatics 2019;35(24):5191–8.

39. Kingma DP, Welling M. Auto-encoding variational Bayes. arXiv 2013;1312:6114.

40. Kingma D, Ba J. Adam: A Method for Stochastic Optimization. International Conference for Learning Representations 2015;1412:1–15.

41. Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. International Conference on Learning Representations ICLR 2015;1409:1–15.

42. Leonard J, Kramer MA. Improvement of the backpropagation algorithm for training neural networks. Computers and Chemical Engineering 1990;14(3):337–41.

43. Karimollah HT. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian J Intern Med 2013;4:627–35.

44. Ling CX, Huang J, Zhang H. AUC: a better measure than accuracy in comparing learning algorithms. Conference of the Canadian Society for Computational Studies of Intelligence 2003;2671:329–41.

45. Takaya S, Marc R, Guy B. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLoS ONE 2015;10(3):e0118432.

46. Ezzat A, Zhao P, Wu M, et al. Drug-target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans Comput Biol Bioinform 2016;14(3):646–56.

47. Li Z-C, Huang M-H, Zhong W-Q, et al. Identification of drug-target interaction from interactome network with ‘guilt-by-association’ principle and topology features. Bioinformatics 2015;32(7):1057–64.

48. Wishart DS, Feunang YD, An CG, et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res 2017;46(D1):D1074–82.

49. Tfelt-Hansen P, Tfelt-Hansen J. Verapamil for Cluster Headache. Clinical Pharmacology and Possible Mode of Action. The Journal of Head and Face Pain 2009;49(1):117–25.

50. Casis O, Sánchez-Chapula JA. Disopyramide, imipramine, and amitriptyline bind to a common site on the transient outward K+ channel. J Cardiovasc Pharmacol 1998;32(4):521–6.

51. Nasrallah HA. Atypical antipsychotic-induced metabolic side effects: insights from receptor-binding profiles. Mol Psychiatry 2008;13(1):27.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
