Abstract

The discovery of drug–target interactions (DTIs) is a highly promising area of research. The accurate identification of reliable interactions between drugs and proteins via computational methods, which typically leverage heterogeneous information retrieved from diverse data sources, can boost the development of effective pharmaceuticals. Although random walk and matrix factorization techniques are widely used in DTI prediction, they have several limitations. Random walk-based embedding generation is usually conducted in an unsupervised manner, while the linear similarity combination in matrix factorization distorts the individual insights offered by different views. To tackle these issues, we take a multi-layered network approach to handle diverse drug and target similarities, and propose a novel optimization framework, called Multiple similarity DeepWalk-based Matrix Factorization (MDMF), for DTI prediction. The framework unifies embedding generation and interaction prediction, learning vector representations of drugs and targets that not only retain high-order proximity across all hyper-layers and layer-specific local invariance, but also approximate the interactions with their inner product. Furthermore, we develop an ensemble method (MDMF2A) that integrates two instantiations of the MDMF model, optimizing the area under the precision-recall curve (AUPR) and the area under the receiver operating characteristic curve (AUC), respectively. The empirical study on real-world DTI datasets shows that our method achieves statistically significant improvement over current state-of-the-art approaches in four different settings. Moreover, the validation of highly ranked non-interacting pairs demonstrates the potential of MDMF2A to discover novel DTIs.

Introduction

The main objective of the drug discovery process is to identify drug–target interactions (DTIs) among numerous candidates. Although in vitro experimental testing can verify DTIs, it suffers from extremely high time and monetary costs. Computational (in silico) methods employ machine learning techniques [1], such as matrix factorization (MF) [2], kernel-based models [3], graph/network embedding [4] and deep learning [5], to efficiently infer a small number of candidate drugs. This vastly shrinks the search scope and reduces the workload of experiment-based verification, thereby significantly accelerating the drug discovery process.

In the past, the chemical structure of drugs and the protein sequence of targets were the main source of information for inferring candidate DTIs [6–8]. Recently, with the advancements in clinical medical technology, abundant drug- and target-related biological data from multifaceted sources are exploited to boost the accuracy of DTI prediction. Some MF- and kernel-based methods utilize multiple types of drug and target similarities derived from heterogeneous information by integrating them into a single drug and target similarity [3, 9–11], but in doing so discard the distinctive information possessed by each similarity view.

In contrast, network-based approaches consider the diverse drug and target data as a (multiplex) heterogeneous DTI network that describes multiple aspects of drug and target relations, and learn topology-preserving representations of drugs and targets to facilitate DTI prediction. With deep neural networks showing consistently superior performance in recent years across a plethora of learning tasks, their adoption in the DTI prediction field, especially for inferring new DTIs by mining DTI networks, is understandably rising [5, 12–14]. Although deep learning models achieve improved performance, they require larger amounts of data and are computationally intensive [1]. In addition, most deep learning models are sensitive to noise [15]. This is particularly important in DTI prediction, since many interactions in the bipartite network of drugs and targets remain undiscovered, making the observed labels inherently noisy [2, 6, 16].

Apart from deep learning, another type of network-based model widely used in DTI prediction computes graph embeddings based on random walks [4, 17]. Although these methods can model high-order node proximity efficiently, they typically perform embedding generation and interaction prediction as two independent tasks. Hence, their embeddings are learned in an unsupervised manner, without being guided by the interaction information they are meant to predict.

Random walk embedding methods essentially factorize a matrix that captures node co-occurrences within random walk sequences generated from the graph [18], which makes it possible to unify embedding generation and interaction prediction under a common MF framework. Nevertheless, the MF method proposed in [18] that approximates DeepWalk [19] can only handle single-layer networks. Thus, it is unable to fully exploit the topology information of the multiple drug and target layers present in multiplex heterogeneous DTI networks. Furthermore, the area under the precision-recall curve (AUPR) and the area under the receiver operating characteristic curve (AUC) are two important evaluation metrics in DTI prediction, but no network-based approach directly optimizes them.

To address the issues mentioned above, we propose the formulation of a DeepWalk-based MF model, called Multiple similarity DeepWalk-based Matrix Factorization (MDMF), which incorporates multiplex heterogeneous DTI network embedding generation and DTI prediction within a unified optimization framework. It learns vector representations of drugs and targets that not only capture the multilayer network topology via factorizing the hyper-layer DeepWalk matrix with information from diverse data sources, but also preserve the layer-specific local invariance with the graph Laplacian for each drug and target similarity view. In addition, the DeepWalk matrix contains richer interaction information, exploiting high-order node proximity and implicitly recovering the possible missing interactions. Based on this formulation, we instantiate two models that leverage surrogate losses to optimize two essential evaluation measures in DTI prediction, namely AUPR and AUC. In addition, we integrate the two models to consider the maximization of both metrics. Experimental results on DTI datasets under various prediction settings show that the proposed method outperforms state-of-the-art approaches and can discover new reliable DTIs.

The rest of this article is organized as follows. Section 2 introduces some preliminaries of our work. Section 3 presents the proposed approach. Performance evaluation results and relevant discussions are offered in Section 4. Finally, Section 5 concludes this work.

Preliminaries

Problem formulation

Given a drug set |$D=\{d_i\}_{i=1}^{n_d}$| and a target set |$T=\{t_i\}_{i=1}^{n_t}$|⁠, the relation between drugs (targets) can be assessed in various aspects, which are represented by a set of similarity matrices |$\{\boldsymbol{S}^{d,h}\}_{h=1}^{m_d}$| (⁠|$\{\boldsymbol{S}^{t,h}\}_{h=1}^{m_t}$|⁠), where |$\boldsymbol{S}^{d,h} \in \mathbb{R}^{n_d \times n_d}$| (⁠|$\boldsymbol{S}^{t,h} \in \mathbb{R}^{n_t \times n_t}$|⁠) and |$m_d$| (⁠|$m_t$|⁠) is the number of relation types for drugs (targets). In addition, let the binary matrix |$\boldsymbol{Y} \in \{0,1\}^{n_d \times n_t}$| indicate the interactions between drugs in |$D$| and targets in |$T$|⁠, where |$Y_{ij}=1$| denotes that |$d_i$| and |$t_j$| interact with each other, and |$Y_{ij} = 0$| otherwise. A DTI dataset for |$D$| and |$T$| consists of |$\{\boldsymbol{S}^{d,h}\}_{h=1}^{m_d}$|⁠, |$\{\boldsymbol{S}^{t,h}\}_{h=1}^{m_t}$| and |$\boldsymbol{Y}$|⁠.

Let (⁠|$d_x$|⁠,|$t_z$|⁠) be a test drug–target pair, |$\{\boldsymbol{\bar{s}}^{d,h}_x\}^{m_d}_{h=1}$| be a set of |$n_d$|-dimensional vectors storing the similarities between |$d_x$| and |$D$|⁠, and |$\{\boldsymbol{\bar{s}}^{t,h}_z\}^{m_t}_{h=1}$| be a set of |$n_t$|-dimensional vectors storing the similarities between |$t_z$| and |$T$|⁠. A DTI prediction model outputs a real-valued score |$\hat{Y}_{xz}$| indicating the confidence of the affinity between |$d_x$| and |$t_z$|⁠. A drug |$d_x \notin D$| (target |$t_z \notin T$|) that does not belong to the training set is considered a new drug (target). There are four prediction settings according to whether the drug and target involved in the test pair are training entities [20]:

  • S1: predict the interaction between |$d_{x} \in D$| and |$t_z \in T$|⁠;

  • S2: predict the interaction between |$d_{x} \notin D$| and |$t_z \in T$|⁠;

  • S3: predict the interaction between |$d_x \in D$| and |$t_z \notin T$|⁠;

  • S4: predict the interaction between |$d_x \notin D$| and |$t_z \notin T$|⁠.

Matrix factorization for DTI prediction

In DTI prediction, MF methods typically learn two vectorized representations of drugs and targets that approximate the interaction matrix |$\boldsymbol{Y}$| by minimizing the following objective:
(1)
$$\min_{\boldsymbol{U},\boldsymbol{V}} \; \mathcal{L}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) + \mathcal{R}(\boldsymbol{U},\boldsymbol{V})$$
where |$\hat{\boldsymbol{Y}} = f(\boldsymbol{U}\boldsymbol{V}^\top ) \in \mathbb{R}^{n_d \times n_t}$| is the predicted interaction matrix, |$f$| is either the identity function |$\omega $| for standard MF [2] or the element-wise logistic function |$\sigma $| for Logistic MF [7], and |$\boldsymbol{U} \in \mathbb{R}^{n_d \times r}$|⁠, |$\boldsymbol{V} \in \mathbb{R}^{n_t \times r}$| are |$r$|-dimensional drug and target latent features (embeddings), respectively. The objective in Eq. (1) includes two parts: |$\mathcal{L}(\hat{\boldsymbol{Y}},\boldsymbol{Y})$| is the loss function to evaluate the inconsistency between the predicted and ground truth interaction matrix, and |$\mathcal{R}(\boldsymbol{U},\boldsymbol{V})$| concerns the regularization of the learned embeddings.
Given a test drug–target pair (⁠|$d_x,t_z$|⁠), its prediction with a specific instantiation of |$f$| is computed based on the embeddings of |$d_x$| (⁠|$\boldsymbol{U}_x \in \mathbb{R}^{r}$|⁠) and |$t_z$| (⁠|$\boldsymbol{V}_z \in \mathbb{R}^{r}$|⁠):
(2)
$$\hat{Y}_{xz} = f\left(\boldsymbol{U}_x\boldsymbol{V}_z^\top\right)$$
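To make the notation concrete, the following is a minimal NumPy sketch of the prediction step of Eqs. (1)–(2); the toy sizes and random embeddings are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_d, n_t, r = 5, 4, 3                    # toy sizes: drugs, targets, latent dim
U = rng.normal(size=(n_d, r))            # drug embeddings
V = rng.normal(size=(n_t, r))            # target embeddings

sigma = lambda x: 1.0 / (1.0 + np.exp(-x))   # element-wise logistic function

Y_hat_std = U @ V.T                      # standard MF: f is the identity
Y_hat_log = sigma(U @ V.T)               # logistic MF: f is sigma

x, z = 2, 1                              # Eq. (2): score of the pair (d_x, t_z)
score = sigma(U[x] @ V[z])
```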

DeepWalk embeddings as matrix factorization

DeepWalk [19] is a network embedding approach that generates a number of random walks over a graph to capture the proximity among nodes. Qiu et al. [18] proved that the DeepWalk model can be interpreted as a matrix factorization task when the length of the random walks approaches infinity. In particular, they introduced NetMF, a model that approximates DeepWalk to learn embeddings of a network |$G$| containing |$n$| nodes by factorizing the DeepWalk matrix defined as:
(3)
$$\boldsymbol{M} = \max\left(\log\left(\frac{\psi(\boldsymbol{A})}{n_w n_s}\left(\sum_{\tau=1}^{n_w}\left(\boldsymbol{\Lambda}^{-1}\boldsymbol{A}\right)^{\tau}\right)\boldsymbol{\Lambda}^{-1}\right),\; 0\right)$$
where |$\boldsymbol{A} \in \mathbb{R}^{n \times n}$| is the adjacency matrix of |$G$|⁠, |$\psi (\boldsymbol{A}) = \sum _i\sum _j A_{ij}$|⁠, |$\boldsymbol{\Lambda }= \textrm{diag}(\boldsymbol{Ae})$| is a diagonal matrix holding the row sums of |$\boldsymbol{A}$|⁠, |$n_w$| is the window size of the random walk controlling the number of context nodes, |$n_s$| plays the same role as the number of negative samples in DeepWalk, and the |$\max $| function guarantees that all elements of |$\boldsymbol{M}$| are non-negative. Considering the symmetry of |$\boldsymbol{M}$| for undirected networks, the factorization of the DeepWalk matrix can be expressed as |$\boldsymbol{M}=\boldsymbol{Q}\boldsymbol{Q}^\top $|⁠, where |$\boldsymbol{Q} \in \mathbb{R}^{n \times r}$| represents the |$r$|-dimensional network embeddings.
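As an illustration, here is a small NumPy sketch of Eq. (3); it assumes an undirected network with no isolated nodes, and uses the log(max(·, 1)) form, which coincides element-wise with max(log(·), 0) for the positive entries.

```python
import numpy as np

def deepwalk_matrix(A, n_w=5, n_s=1):
    """Sketch of the DeepWalk matrix of Eq. (3) for an adjacency matrix A.
    Assumes every node has at least one edge (positive row sums)."""
    vol = A.sum()                                    # psi(A)
    lam_inv = np.diag(1.0 / A.sum(axis=1))           # Lambda^{-1}
    P = lam_inv @ A                                  # random-walk transition matrix
    S = sum(np.linalg.matrix_power(P, tau) for tau in range(1, n_w + 1))
    X = vol / (n_w * n_s) * S @ lam_inv
    # element-wise max(log(X), 0); clipping X at 1 also avoids log(0)
    return np.log(np.maximum(X, 1.0))
```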

Materials and methods

Datasets

Two types of DTI datasets, constructed based on online biological and pharmaceutical databases, are used in this study. Their characteristics are shown in Table 1.

Table 1

Characteristics of datasets.

Dataset | |$n_d$| | |$n_t$| | |$|P_1|$| | Sparsity | |$m_d$| | |$m_t$|
NR | 54 | 26 | 166 | 0.118 | 4 | 4
GPCR | 223 | 95 | 1096 | 0.052 | 4 | 4
IC | 210 | 204 | 2331 | 0.054 | 4 | 4
E | 445 | 664 | 4256 | 0.014 | 4 | 4
Luo | 708 | 1512 | 1923 | 0.002 | 4 | 3

The first is a collection of four golden standard datasets constructed by Yamanishi et al. [21], each corresponding to a target protein family: Nuclear Receptors (NR), Ion Channel (IC), G-protein coupled receptors (GPCR) and Enzyme (E). Because the interactions in these datasets were discovered 14 years ago, we updated them by adding newly discovered interactions between their drugs and targets recorded in the latest versions of the KEGG [22], DrugBank [23] and ChEMBL [24] databases. Details on the collection of new DTIs can be found in Supplementary Section A1. Four types of drug similarities are utilized: SIMCOMP [25], built upon chemical structures, and AERS-freq, AERS-bit [26] and SIDER [27], derived from drug side effects. Likewise, four types of target similarities, obtained from [28], are used: gene ontology (GO) term-based semantic similarity, normalized Smith-Waterman (SW) sequence similarity, and spectrum kernel similarities with 3-mer (SPEC-k3) and 4-mer (SPEC-k4) lengths computed on amino acid sequences. These similarities were selected because they possess higher local interaction consistency [10].

The second dataset, provided by Luo et al. [17] (denoted as Luo), was built in 2017. It includes DTIs and drug–drug interactions (DDIs) obtained from DrugBank 3.0 [23], as well as drug side effect (SE) associations, protein–protein interactions (PPIs) and disease-related associations extracted from SIDER [27], HPRD [29] and the Comparative Toxicogenomics Database [30], respectively. Based on these interaction/association profiles, three drug similarities (derived from DDIs, SEs and drug–disease associations) and two target similarities (derived from PPIs and target–disease associations) are computed, employing the Jaccard similarity coefficient on the sets of entities associated or interacting with each pair of drugs (targets). In addition, a drug similarity based on chemical structure and a target similarity based on genome sequence are also computed. In total, four drug similarities and three target similarities are used for this dataset.

Multiple similarity DeepWalk-based matrix factorization

A DTI dataset, associated with multiple drug and target similarities, can be viewed as a multiplex heterogeneous network |$G^{DTI}$|⁠. This is done by treating drugs and targets as two types of vertices, and by considering non-zero similarities and interactions as edges connecting two homogeneous and heterogeneous entities, respectively, where the weight of each edge equals the corresponding similarity or interaction value. In a DTI dataset, the interaction matrix is typically much sparser than the similarity matrices, so similarity-derived edges linking two drugs (targets) markedly outnumber the more crucial bipartite interaction edges. To balance the distribution of different types of edges and stress the relations of more similar entities in the DTI network, we replace each original dense similarity matrix with the sparse adjacency matrix of its corresponding |$k$|-nearest neighbors (⁠|$k$|-NN) graph. Specifically, given a similarity matrix |$\boldsymbol{S}^{d,h}$|⁠, its sparsified matrix |$\hat{\boldsymbol{S}}^{d,h} \in \mathbb{R}^{n_d \times n_d}$| is defined as:
(4)
$$\hat{S}^{d,h}_{ij} = \begin{cases} S^{d,h}_{ij}, & \text{if } d_j \in \mathcal{N}^{k,h}_{d_i} \text{ or } d_i \in \mathcal{N}^{k,h}_{d_j} \\ 0, & \text{otherwise} \end{cases}$$
where |$\mathcal{N}^{k,h}_{d_i}$| is the set of the |$k$|-NNs of |$d_i$| based on similarity |$\boldsymbol{S}^{d,h}$|⁠.
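A possible NumPy sketch of the sparsification in Eq. (4) is shown below; keeping an entry when either entity is among the other's |$k$|-NNs (which preserves symmetry) is our reading of the definition.

```python
import numpy as np

def knn_sparsify(S, k=5):
    """Sparsify a similarity matrix, keeping S_ij only if d_j is among the
    k-NNs of d_i or d_i is among the k-NNs of d_j (Eq. (4))."""
    n = S.shape[0]
    S = S - np.diag(np.diag(S))                 # ignore self-similarity
    nn = np.argsort(-S, axis=1)[:, :k]          # k most similar entities per row
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask |= mask.T                              # symmetric neighborhood condition
    return np.where(mask, S, 0.0)
```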

Formally, |$G^{DTI}$| consists of three parts: (i) |$G^{d}=\{\hat{\boldsymbol{S}}^{d,h}\}_{h=1}^{m_d}$|⁠, which is a multiplex drug subnetwork containing |$m_d$| layers, with |$\hat{\boldsymbol{S}}^{d,h}$| being the adjacency matrix of the |$h$|-th drug layer; (ii) |$G^{t}=\{\hat{\boldsymbol{S}}^{t,h}\}_{h=1}^{m_t}$|⁠, which is a multiplex target subnetwork including |$m_t$| layers with |$\hat{\boldsymbol{S}}^{t,h}$| denoting the adjacency matrix of the |$h$|-th target layer; (iii) |$G^Y=\boldsymbol{Y}$|⁠, which is a bipartite interaction subnetwork connecting drug and target nodes in each layer. Figure 1A depicts an example DTI network.

Figure 1

Representing a DTI dataset with three drug and two target similarities as a network. (A) A multiplex heterogeneous network including three drug layers, two target layers and six identical bipartite interaction subnetworks connecting drug and target nodes in each layer. (B) The six hyper-layers, each composed of a drug layer and a target layer along with the interaction subnetwork.

The DeepWalk matrix cannot be directly calculated for the complex DTI network that includes two multiplex subnetworks and a bipartite subnetwork. To facilitate its computation, we consider each combination of a drug and a target layer, along with the interaction subnetwork, as a hyper-layer, and reformulate the DTI network as a multiplex network containing |$m_d\cdot m_t$| hyper-layers. The hyper-layer incorporating the |$i$|-th drug layer and the |$j$|-th target layer is defined by the adjacency matrix

$$\boldsymbol{A}^{i,j} = \begin{bmatrix} \hat{\boldsymbol{S}}^{d,i} & \boldsymbol{Y} \\ \boldsymbol{Y}^\top & \hat{\boldsymbol{S}}^{t,j} \\ \end{bmatrix} $$
upon which |$G^{DTI}$| can be expressed as a set of hyper-layers |$\{\boldsymbol{A}^{i,j}\}^{m_d,m_t}_{i=1,j=1}$|⁠. Figure 1B illustrates the multiple hyper-layer network corresponding to the original DTI network in Figure 1A.
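Constructing a hyper-layer is a simple block assembly, sketched below with NumPy (the function name is ours):

```python
import numpy as np

def hyper_layer(S_d, S_t, Y):
    """Adjacency matrix A^{i,j} combining the i-th (sparsified) drug layer,
    the j-th target layer and the bipartite interaction matrix Y."""
    return np.block([[S_d, Y],
                     [Y.T, S_t]])
```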

Based on the above reformulation, we compute a DeepWalk matrix |$\boldsymbol{M}^{i,j} \in \mathbb{R}^{(n_d+n_t) \times (n_d+n_t)}$| for each |$\boldsymbol{A}^{i,j}$| using Eq. (3), which reflects node co-occurrences in truncated random walks and captures richer proximity among nodes than the original hyper-layer, especially the proximity between unlinked nodes. In particular, if an unlinked drug–target pair in |$\boldsymbol{A}^{i,j}$| has a certain level of proximity (a non-zero value) in |$\boldsymbol{M}^{i,j}$|⁠, the corresponding relation represented by the DeepWalk matrix can be interpreted as the recovery of their missing interaction, which supplements the incomplete interaction information and reduces the noise in the original dataset. See an example in Supplementary Section A2.1.

In order to mine multiple DeepWalk matrices effectively, we define a unified DeepWalk matrix for the whole DTI network by aggregating every |$\boldsymbol{M}^{i,j}$|⁠:
(5)
$$\bar{\boldsymbol{M}} = \sum_{i=1}^{m_d}\sum_{j=1}^{m_t} w^d_i\, w^t_j\, \boldsymbol{M}^{i,j}$$
where |$w^d_i$| and |$w^t_j$| are weights of |$i$|-th drug and |$j$|-th target layers, respectively, with |$\sum _{i=1}^{m_d}w^d_i=1$| and |$\sum _{j=1}^{m_t}w^t_j=1$|⁠. In Eq. (5), the importance of each hyper-layer is determined by multiplying the weights of its involved drug and target layers. This work employs the local interaction consistency (LIC)-based similarity weight, which assesses the proportion of proximate drugs (targets) having the same interactions, and has been found more effective than other similarity weights for DTI prediction [10]. More details on LIC weights can be found in Supplementary Section A2.2.
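A sketch of Eq. (5) follows, reusing the hypothetical deepwalk_matrix and hyper_layer helpers from the sketches above; the weights w_d and w_t are assumed given (LIC-based in the paper) and to sum to one each.

```python
import numpy as np

def unified_deepwalk_matrix(S_d_list, S_t_list, Y, w_d, w_t, n_w=5, n_s=1):
    """Eq. (5): weighted aggregation of all per-hyper-layer DeepWalk matrices."""
    n = Y.shape[0] + Y.shape[1]
    M_bar = np.zeros((n, n))
    for w_i, S_d in zip(w_d, S_d_list):
        for w_j, S_t in zip(w_t, S_t_list):
            A_ij = hyper_layer(S_d, S_t, Y)          # defined in an earlier sketch
            M_bar += w_i * w_j * deepwalk_matrix(A_ij, n_w, n_s)
    return M_bar
```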
Let
$$\boldsymbol{Q} = \begin{bmatrix} \boldsymbol{U} \\ \boldsymbol{V} \end{bmatrix}$$
be the vertical concatenation of drug and target embeddings. We encourage |$\boldsymbol{Q}\boldsymbol{Q}^\top $| to approximate |$\bar{\boldsymbol{M}}$|⁠, which enables the learned embeddings to capture the topology information characterized by the holistic DeepWalk matrix. Hence, we derive the DeepWalk regularization term that diminishes the discrepancy between |$\bar{\boldsymbol{M}}$| and |$\boldsymbol{Q}\boldsymbol{Q}^\top $|⁠:
(6)
$$\mathcal{R}_{dw}(\boldsymbol{U},\boldsymbol{V}) = \left\|\bar{\boldsymbol{M}} - \boldsymbol{Q}\boldsymbol{Q}^\top\right\|_F^2$$
Considering that the adjacency matrix |$\boldsymbol{A}^{i,j}$| includes four blocks, |$\bar{\boldsymbol{M}}$| and |$\boldsymbol{Q}\boldsymbol{Q}^\top $| could be divided into four blocks accordingly:
(7)
$$\bar{\boldsymbol{M}} = \begin{bmatrix} \bar{\boldsymbol{M}}^{d} & \bar{\boldsymbol{M}}^{y} \\ \left(\bar{\boldsymbol{M}}^{y}\right)^\top & \bar{\boldsymbol{M}}^{t} \end{bmatrix}, \qquad \boldsymbol{Q}\boldsymbol{Q}^\top = \begin{bmatrix} \boldsymbol{U}\boldsymbol{U}^\top & \boldsymbol{U}\boldsymbol{V}^\top \\ \boldsymbol{V}\boldsymbol{U}^\top & \boldsymbol{V}\boldsymbol{V}^\top \end{bmatrix}$$
Thus, |$\mathcal{R}_{dw}(\boldsymbol{U},\boldsymbol{V})$| can be expressed as the sum of norms of these blocks:
(8)
$$\mathcal{R}_{dw}(\boldsymbol{U},\boldsymbol{V}) = \left\|\bar{\boldsymbol{M}}^{d} - \boldsymbol{U}\boldsymbol{U}^\top\right\|_F^2 + 2\left\|\bar{\boldsymbol{M}}^{y} - \boldsymbol{U}\boldsymbol{V}^\top\right\|_F^2 + \left\|\bar{\boldsymbol{M}}^{t} - \boldsymbol{V}\boldsymbol{V}^\top\right\|_F^2$$
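The block decomposition makes the regularizer cheap to express; a NumPy sketch of Eq. (8) (with the off-diagonal block counted twice by symmetry) follows:

```python
import numpy as np

def dw_regularizer(M_bar, U, V):
    """Eq. (8): block-wise form of ||M_bar - Q Q^T||_F^2 with Q = [U; V]."""
    n_d = U.shape[0]
    M_d = M_bar[:n_d, :n_d]                 # drug-drug block
    M_y = M_bar[:n_d, n_d:]                 # drug-target block
    M_t = M_bar[n_d:, n_d:]                 # target-target block
    fro2 = lambda X: np.sum(X * X)          # squared Frobenius norm
    return (fro2(M_d - U @ U.T)
            + 2.0 * fro2(M_y - U @ V.T)     # off-diagonal blocks appear twice
            + fro2(M_t - V @ V.T))
```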

However, aggregating all per-layer DeepWalk matrices into the holistic one inevitably leads to substantial loss of layer-specific topology information. To address this limitation, we employ graph regularization for each sparsified drug (target) layer to preserve per-layer drug (target) proximity in the embedding space, i.e. similar drugs (targets) in each layer are likely to have similar latent features. To distinguish the utility of each layer, each graph regularization term is multiplied by the LIC-based weight of its corresponding layer, which emphasizes the proximity of more reliable similarities. Furthermore, Tikhonov regularization is added to prevent the latent features from overfitting the training set.

By replacing |$\mathcal{R}(\boldsymbol{U},\boldsymbol{V})$| in Eq. (1) with the above regularization terms, we arrive at the objective of MDMF:
(9)
$$\min_{\boldsymbol{U},\boldsymbol{V}} \; \mathcal{L}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) + \lambda_M \mathcal{R}_{dw}(\boldsymbol{U},\boldsymbol{V}) + \lambda_d \sum_{i=1}^{m_d} w^d_i\, \mathrm{tr}\!\left(\boldsymbol{U}^\top \boldsymbol{L}^d_i \boldsymbol{U}\right) + \lambda_t \sum_{j=1}^{m_t} w^t_j\, \mathrm{tr}\!\left(\boldsymbol{V}^\top \boldsymbol{L}^t_j \boldsymbol{V}\right) + \lambda_r \left(\|\boldsymbol{U}\|_F^2 + \|\boldsymbol{V}\|_F^2\right)$$
where |$\boldsymbol{L}^d_i=\textrm{diag}(\hat{\boldsymbol{S}}^{d,i}\boldsymbol{e})-\hat{\boldsymbol{S}}^{d,i}$| and |$\boldsymbol{L}^t_j=\textrm{diag}(\hat{\boldsymbol{S}}^{t,j}\boldsymbol{e})-\hat{\boldsymbol{S}}^{t,j}$| are the graph Laplacian matrices of |$\hat{\boldsymbol{S}}^{d,i}$| and |$\hat{\boldsymbol{S}}^{t,j}$|⁠, respectively, and |$\lambda _M$|⁠, |$\lambda _d$|⁠, |$\lambda _t$| and |$\lambda _r$| are regularization coefficients. Eq. (9) can be solved by alternately updating |$\boldsymbol{U}$| and |$\boldsymbol{V}$| [2, 7], using an optimization algorithm, e.g. gradient descent (GD) or AdaGrad [31]. The details of the optimization procedure of MDMF are provided in Supplementary Section A2.3.

Optimizing the area under the curve with MDMF

Area under the curve loss functions

AUPR and AUC are two widely used area under the curve metrics in DTI prediction. Modeling differentiable surrogate losses that optimize these two metrics can lead to improvements in prediction performance [10]. Therefore, we instantiate the loss function in Eq. (9) with AUPR and AUC losses, and derive two DeepWalk-based MF models, namely MDMFAUPR and MDMFAUC, that optimize the AUPR and AUC metrics, respectively.

Given |$\boldsymbol{Y}$| and its predictions |$\boldsymbol{\hat{Y}}=\sigma (\boldsymbol{U}\boldsymbol{V}^\top )$|⁠, sorted in descending order of predicted score, AUPR computed without interpolation of the curve is:
(10)
$$\mathrm{AUPR}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) = \sum_{h} \mathrm{Prec}@h \cdot \mathrm{InRe}@h$$
where Prec|$@h$| is the precision of the first |$h$| predictions, and InRe|$@h$| is the incremental recall from rank |$h-1$| to |$h$|⁠. In addition, histogram binning [32], which assigns predictions to |$n_b$| ordered bins, is employed to simulate the non-differentiable and non-smooth sorting operation used to rank predictions, deriving differentiable precision and incremental recall as:
(11)
$$\mathrm{Prec}@h = \frac{\sum_{h^{\prime}=1}^{h}\left\langle \delta(\hat{\boldsymbol{Y}},h^{\prime}),\, \boldsymbol{Y} \right\rangle}{\sum_{h^{\prime}=1}^{h}\left\langle \delta(\hat{\boldsymbol{Y}},h^{\prime}),\, \boldsymbol{1} \right\rangle}$$
(12)
$$\mathrm{InRe}@h = \frac{\left\langle \delta(\hat{\boldsymbol{Y}},h),\, \boldsymbol{Y} \right\rangle}{|P_1|}$$
where |$\delta (\hat{\boldsymbol{Y}},h)$| is the soft assignment function returning the membership degree of each prediction to the |$h$|-th bin, i.e. |$[\delta (\hat{\boldsymbol{Y}},h)]_{ij} = \max \left ( 1-|\hat{Y}_{ij}-b_h|/\Delta ,0 \right )$|⁠, where |$b_h$| is the center of the |$h$|-th bin, |$\Delta = 1/(n_b-1)$| is the bin width, and |$\langle \cdot ,\cdot \rangle$| denotes the sum of the element-wise product of two matrices. Considering that maximizing AUPR is equivalent to minimizing |$-\textrm{AUPR}(\boldsymbol{\hat{Y}}, \boldsymbol{Y})$|⁠, we obtain the differentiable AUPR loss according to Eqs. (10)–(12):
(13)
$$\mathcal{L}_{AP}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) = -\sum_{h=1}^{n_b} \mathrm{Prec}@h \cdot \mathrm{InRe}@h$$
The objective of MDMFAUPR is given by using |$\mathcal{L}_{AP}$| to replace |$\mathcal{L}(\hat{\boldsymbol{Y}},\boldsymbol{Y})$| in Eq. (9).
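For illustration, here is a NumPy sketch of the binned AUPR approximation (forward computation only; in MDMFAUPR the gradients with respect to |$\boldsymbol{U}$| and |$\boldsymbol{V}$| are derived analytically, see Supplementary A2.4). Ordering the bins from high to low score, so that cumulative sums emulate walking down the ranked list, is our assumption.

```python
import numpy as np

def soft_aupr(Y_hat, Y, n_b=21):
    """Histogram-binning approximation of AUPR (Eqs. (10)-(12));
    the loss L_AP of Eq. (13) is simply -soft_aupr(Y_hat, Y)."""
    y_hat, y = Y_hat.ravel(), Y.ravel()
    b = np.linspace(1.0, 0.0, n_b)           # bin centers, highest scores first
    delta = 1.0 / (n_b - 1)                  # bin width
    # soft assignment delta(Y_hat, h): triangular membership to each bin
    d = np.maximum(1.0 - np.abs(y_hat[None, :] - b[:, None]) / delta, 0.0)
    tp = d @ y                               # soft positives per bin
    n = d.sum(axis=1)                        # soft prediction count per bin
    prec = np.cumsum(tp) / np.maximum(np.cumsum(n), 1e-12)   # Prec@h
    inre = tp / np.maximum(y.sum(), 1e-12)                   # InRe@h
    return float(np.sum(prec * inre))
```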
AUC, which assesses the proportion of correctly ordered tuples, i.e. those where the interacting drug–target pair receives a higher prediction than the non-interacting one, is defined as:
(14)
$$\mathrm{AUC}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) = \frac{1}{|P|}\sum_{(ij,hl) \in P} \left[\!\left[\, \hat{Y}_{ij} > \hat{Y}_{hl} \,\right]\!\right]$$
where the predictions are |$\hat{\boldsymbol{Y}}=\boldsymbol{U}\boldsymbol{V}^\top $| with |$f=\omega $|⁠, and |$P=\{(ij,hl)|(i,j) \in P_1, (h,l) \in P_0\}$| is the Cartesian product of |$P_1=\{(i,j)|Y_{ij}=1\}$| and |$P_0=\{(i,j)|Y_{ij}=0\}$|⁠. To maximize AUC, we need to minimize the proportion of wrongly ordered tuples, where the prediction of the interacting pair is lower than that of the non-interacting one. To make the AUC loss easy to optimize, the discontinuous indicator function is substituted by a convex approximation, i.e. |$\phi (x)=\log (1+\exp (-x))$|⁠, and the derived convex AUC loss is:
(15)
$$\mathcal{L}_{AUC}(\hat{\boldsymbol{Y}},\boldsymbol{Y}) = \frac{1}{|P|}\sum_{(ij,hl) \in P} \phi\left(\zeta_{ijhl}\right)$$
where |$\zeta _{ijhl}=\hat{Y}_{ij}-\hat{Y}_{hl}$|⁠. With the substitution of |$\mathcal{L}(\hat{\boldsymbol{Y}},\boldsymbol{Y})$| with |$\mathcal{L}_{AUC}$| in Eq. (9), we obtain the objective of MDMFAUC.
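The AUC surrogate of Eq. (15) is straightforward to sketch in NumPy; note the O(|P1|·|P0|) memory of materializing all tuples, which a practical implementation would avoid (e.g. by sampling).

```python
import numpy as np

def auc_loss(Y_hat, Y):
    """Convex AUC surrogate of Eq. (15): mean logistic loss over P1 x P0."""
    pos = Y_hat[Y == 1]                          # scores of interacting pairs
    neg = Y_hat[Y == 0]                          # scores of non-interacting pairs
    zeta = pos[:, None] - neg[None, :]           # zeta_ijhl for every tuple in P
    # log(1 + exp(-zeta)) computed stably via logaddexp
    return float(np.logaddexp(0.0, -zeta).mean())
```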

More details for optimizing the AUPR and AUC losses can be found in Supplementary Section A2.4.

Inferring embeddings of new entities

When a new drug (target) arrives, MDMFAUPR and MDMFAUC infer its embeddings using its neighbors in the training set. Given a new drug, we first compute its similarities with all training drugs in |$D$| based on its chemical structure, side effects, etc. Then, we linearly integrate the different types of similarity with LIC-based weights to obtain the fused similarity vector:
(16)
$$\bar{\boldsymbol{s}}^{\alpha}_x = \sum_{h=1}^{m_\alpha} w^{\alpha}_h\, \bar{\boldsymbol{s}}^{\alpha,h}_x$$
where |$\alpha _x$| is an unseen entity with |$\alpha =d$| denoting a new drug and |$\alpha =t$| indicating a novel target, and |$\bar{\boldsymbol{s}}^{\alpha }_x \in \mathbb{R}^{n_\alpha }$| is the fused similarity vector of |$\alpha _x$|⁠. Based on |$\bar{\boldsymbol{s}}^{\alpha }_x$|⁠, we retrieve the |$k$|-NNs of |${\alpha }_x$| from the training set, denoted as |$\mathcal{N}^k_{\alpha _x}$|⁠, and estimate the embeddings of |$\alpha _x$| as follows:
(17)
$$\boldsymbol{U}_x = \frac{\sum_{d_i \in \mathcal{N}^k_{d_x}} \eta^{\,i^{\prime}-1}\, \bar{s}^d_{xi}\, \boldsymbol{U}_i}{\sum_{d_i \in \mathcal{N}^k_{d_x}} \eta^{\,i^{\prime}-1}\, \bar{s}^d_{xi}}, \qquad \boldsymbol{V}_x = \frac{\sum_{t_j \in \mathcal{N}^k_{t_x}} \eta^{\,j^{\prime}-1}\, \bar{s}^t_{xj}\, \boldsymbol{V}_j}{\sum_{t_j \in \mathcal{N}^k_{t_x}} \eta^{\,j^{\prime}-1}\, \bar{s}^t_{xj}}$$
where |$\bar{\boldsymbol s}^d_{xi}$| is the fused similarity between |$d_x$| and |$d_i$|⁠, |$i^{\prime}$| (⁠|$j^{\prime}$|⁠) is the rank of |$d_i$| (⁠|$t_j$|⁠) among |$\mathcal{N}^k_{d_x}$| (⁠|$\mathcal{N}^k_{t_x}$|⁠), e.g. |$i^{\prime}=2$| if |$d_i$| is the second nearest training drug of |$d_x$|⁠, and |$\eta \in (0,1]$| is the decay coefficient shrinking the weight of further neighbors, an important parameter controlling the embedding aggregation. In addition, we employ a pseudo-embedding based |$\eta $| selection strategy [10] to choose the optimal |$\eta $| value from a set of candidates |$\mathcal{C}$| for the prediction settings involving new drugs or/and new targets (S2, S3 and S4). In MDMFAUPR, the |$\eta $| value leading to the best AUPR result is used, while the optimal |$\eta $| in MDMFAUC is the one achieving the best AUC result.
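A compact sketch of the inference of Eqs. (16)–(17) for one new entity follows; normalizing by the sum of the decayed similarity weights is our assumption about the exact form.

```python
import numpy as np

def infer_new_embedding(sim_views, weights, E, k=5, eta=0.8):
    """Estimate the embedding of a new drug (target) from its k-NNs.
    sim_views: similarity vectors to all training entities (one per view);
    weights:   LIC-based view weights (Eq. (16));
    E:         training embedding matrix, U for drugs or V for targets."""
    s_bar = sum(w * s for w, s in zip(weights, sim_views))   # fused similarities
    nn = np.argsort(-s_bar)[:k]                              # k nearest neighbors
    w = np.array([eta ** rank * s_bar[i]                     # decay eta^{i'-1}
                  for rank, i in enumerate(nn)])
    return (w[:, None] * E[nn]).sum(axis=0) / np.maximum(w.sum(), 1e-12)
```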

MDMF2A: combining AUPR and AUC

It is known that both AUPR and AUC play a vital role in DTI prediction. However, MDMFAUPR and MDMFAUC each optimize one measure and ignore the other. To overcome this limitation, we propose an ensemble approach, called MDMF2A, which integrates the two MF models by aggregating their predicted scores. Given a test pair |$(d_x, t_z)$|⁠, along with its predicted scores |$\hat{Y}^{AP}_{xz}$| and |$\hat{Y}^{AC}_{xz}$| obtained from MDMFAUPR and MDMFAUC, respectively, the final prediction output by MDMF2A is defined as
(18)
$$\hat{Y}_{xz} = \beta\, \hat{Y}^{AP}_{xz} + (1-\beta)\, \sigma\!\left(\hat{Y}^{AC}_{xz}\right)$$
where |$\beta \in [0,1]$| is the trade-off coefficient for the two MF base models, and |$\sigma $| converts |$\hat{Y}^{AC}_{xz}$| to the same scale as |$\hat{Y}^{AP}_{xz}$|⁠, i.e. |$(0,1)$|⁠. The flowchart of MDMF2A is shown in Figure 2.
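The ensemble step itself is a one-line combination; a sketch of Eq. (18):

```python
import numpy as np

def mdmf2a_score(y_ap, y_ac, beta=0.5):
    """Eq. (18): beta trades off the two base models; the logistic function
    rescales the raw MDMFAUC score into (0, 1), matching MDMFAUPR's output."""
    return beta * y_ap + (1.0 - beta) / (1.0 + np.exp(-y_ac))
```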
Figure 2

The flowchart of MDMF2A. The DTI dataset, consisting of multiple drug and target similarities, is first reformulated as a multiplex DTI network containing multiple hyper-layers. The two base models of MDMF2A, namely MDMFAUPR and MDMFAUC, are trained on the derived network by instantiating the objective with the two area under the curve metric-based losses, respectively. Given a test pair, its associated similarity vectors are fused according to the LIC-based weights, and each base model leverages the fused similarities to generate its estimation of the test pair. Lastly, MDMF2A aggregates the outputs of the two models using Eq. (18) to obtain the final prediction.

The computational complexity analysis of the proposed methods can be found in Supplementary Section A2.5.

Experimental evaluation and discussion

Experimental setup

Following [20], four types of cross validation (CV) are conducted to examine the methods in the four prediction settings, respectively. In S1, 10-fold CV on pairs is used, where one fold of pairs is removed for testing. In S2 (S3), 10-fold CV on drugs (targets) is applied, where one drug (target) fold, along with its corresponding rows (columns) in |$\boldsymbol{Y}$|⁠, is held out for testing. In S4, 3|$\times $|3-fold block-wise CV is applied, which holds out a drug fold and a target fold along with the interactions between them for testing, using the interactions between the remaining drugs and targets for training. AUPR and AUC, defined in Eqs. (10) and (14), are used as evaluation measures.
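To clarify how the S4 test blocks are formed, the following sketch generates 3|$\times $|3 block-wise CV splits (the fold-assignment details are our illustration):

```python
import numpy as np

def blockwise_cv_splits(n_d, n_t, n_folds=3, seed=0):
    """3x3-fold block-wise CV for S4: each test block pairs a held-out drug fold
    with a held-out target fold; training uses interactions among the rest."""
    rng = np.random.default_rng(seed)
    d_folds = np.array_split(rng.permutation(n_d), n_folds)
    t_folds = np.array_split(rng.permutation(n_t), n_folds)
    for d_test in d_folds:
        for t_test in t_folds:
            d_train = np.setdiff1d(np.arange(n_d), d_test)
            t_train = np.setdiff1d(np.arange(n_t), t_test)
            yield (d_train, t_train), (d_test, t_test)
```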

To evaluate the performance of MDMF2A in all prediction settings, we compared it to eight DTI prediction models (WkNNIR [6], NRLMF [7], MSCMF [9], GRGMF [33], MF2A [10], DRLSM [3], DTINet [17], NEDTP [4]) and two network embedding approaches applicable to any domain (NetMF [18] and Multi2Vec [34]). WkNNIR cannot perform predictions in S1, as it is specifically designed to predict interactions involving new drugs or/and targets (S2, S3, S4) [6]. Furthermore, the proposed MDMF2A is also compared with four deep learning-based methods, namely NeoDTI [12], DTIP [5], DCFME [14] and SupDTI [13]. These deep learning competitors can only be applied to the Luo dataset in S1, because they formulate the DTI dataset as a heterogeneous network consisting of four types of nodes (drugs, targets, drug side effects and diseases) and learn embeddings for all types of nodes. Descriptions of all baseline methods can be found in Supplementary Section A3.

The parameters of all baseline methods are set based on the suggestions in the respective articles. For MDMF2A, the trade-off ensemble weight |$\beta $| is chosen from |$\{0,0.01,\ldots ,1.0\}$|⁠. For the two base models of MDMF2A, i.e. MDMFAUPR and MDMFAUC, the number of neighbors is set to |$k=5$|⁠, the window size of the random walk to |$n_w=5$|⁠, the number of negative samples to |$n_s=1$|⁠, the learning rate to |$\theta =0.1$|⁠, the candidate decay coefficient set to |$\mathcal{C}=\{0.1, 0.2, \ldots, 1.0\}$|⁠, and |$\lambda _M=0.005$|⁠; the embedding dimension |$r$| is chosen from {50, 100}, and |$\lambda _d$|⁠, |$\lambda _t$| and |$\lambda _r$| are chosen from {|$2^{-6}$|⁠, |$2^{-4}$|⁠, |$2^{-2}$|⁠, |$2^{0}$|⁠, |$2^{2}$|}. The number of bins |$n_b$| in MDMFAUPR is chosen from {11, 16, 21, 26, 31} for the small and medium datasets (NR, GPCR and IC), while it is set to 21 for the larger datasets (E and Luo). Similar to [3, 7, 9], we obtain the best hyperparameters of our model via grid search, and the detailed settings are listed in Supplementary Table A2.

Results and discussion

Tables 2 and 3 list the results of MDMF2A and its ten competitors on the five datasets under the four prediction settings. The numbers in the last row denote the average rank across all prediction settings, and ‘*’ indicates that the advantage of MDMF2A over the competitor is statistically significant according to a Wilcoxon signed-rank test with Bergmann-Hommel’s correction [35] at the 5% level on the results of all prediction settings.

Table 2

AUPR results in all prediction settings

Setting | Dataset | WkNNIR | MSCMF | NRLMF | GRGMF | MF2A | DRLSM | DTINet | NEDTP | NetMF | Multi2Vec | MDMF2A
S1 | NR | - | 0.628(6) | 0.64(5) | 0.658(3) | 0.673(2) | 0.642(4) | 0.508(8) | 0.546(7) | 0.455(9) | 0.43(10) | 0.675(1)
S1 | GPCR | - | 0.844(4.5) | 0.86(3) | 0.844(4.5) | 0.87(2) | 0.835(6) | 0.597(10) | 0.798(8) | 0.831(7) | 0.747(9) | 0.874(1)
S1 | IC | - | 0.936(3) | 0.934(4) | 0.913(7) | 0.943(2) | 0.914(6) | 0.693(10) | 0.906(8) | 0.929(5) | 0.867(9) | 0.946(1)
S1 | E | - | 0.818(6) | 0.843(3) | 0.832(4) | 0.858(2) | 0.811(7) | 0.305(10) | 0.78(8) | 0.831(5) | 0.736(9) | 0.859(1)
S1 | Luo | - | 0.599(6) | 0.603(5) | 0.636(3) | 0.653(2) | 0.598(7) | 0.216(9) | 0.08(10) | 0.615(4) | 0.451(8) | 0.679(1)
S1 | AveRank | - | 5.1 | 4 | 4.3 | 2 | 6 | 9.4 | 8.2 | 6 | 9 | 1
S2 | NR | 0.562(5) | 0.531(7) | 0.547(6) | 0.564(4) | 0.578(2) | 0.57(3) | 0.339(10) | 0.486(8) | 0.34(9) | 0.338(11) | 0.602(1)
S2 | GPCR | 0.54(4) | 0.472(7) | 0.508(6) | 0.542(3) | 0.551(2) | 0.532(5) | 0.449(9) | 0.451(8) | 0.356(10) | 0.254(11) | 0.561(1)
S2 | IC | 0.491(4) | 0.379(8.5) | 0.479(5) | 0.493(3) | 0.495(2) | 0.466(6) | 0.365(10) | 0.407(7) | 0.379(8.5) | 0.16(11) | 0.502(1)
S2 | E | 0.405(4) | 0.288(8) | 0.389(5) | 0.415(3) | 0.422(2) | 0.376(6) | 0.173(10) | 0.33(7) | 0.269(9) | 0.141(11) | 0.428(1)
S2 | Luo | 0.485(2) | 0.371(8) | 0.458(6) | 0.462(5) | 0.472(4) | 0.502(1) | 0.187(9) | 0.077(11) | 0.376(7) | 0.079(10) | 0.48(3)
S2 | AveRank | 3.8 | 7.7 | 5.6 | 3.6 | 2.4 | 4.2 | 9.6 | 8.2 | 8.7 | 10.8 | 1.4
S3 | NR | 0.56(3) | 0.505(7) | 0.519(6) | 0.545(5) | 0.588(1) | 0.546(4) | 0.431(8) | 0.375(9) | 0.359(10) | 0.335(11) | 0.582(2)
S3 | GPCR | 0.774(3) | 0.69(7) | 0.729(6) | 0.755(5) | 0.787(2) | 0.757(4) | 0.546(10) | 0.684(8) | 0.638(9) | 0.327(11) | 0.788(1)
S3 | IC | 0.861(3) | 0.827(7) | 0.838(6) | 0.851(5) | 0.863(2) | 0.855(4) | 0.599(10) | 0.8(8) | 0.791(9) | 0.4(11) | 0.865(1)
S3 | E | 0.728(3) | 0.623(8) | 0.711(6) | 0.715(4) | 0.731(2) | 0.712(5) | 0.313(10) | 0.615(9) | 0.656(7) | 0.257(11) | 0.738(1)
S3 | Luo | 0.243(5) | 0.08(8) | 0.204(7) | 0.234(6) | 0.292(3) | 0.248(4) | 0.061(9) | 0.046(10) | 0.294(2) | 0.027(11) | 0.299(1)
S3 | AveRank | 3.4 | 7.4 | 6.2 | 5 | 2 | 4.2 | 9.4 | 8.8 | 7.4 | 11 | 1.2
S4 | NR | 0.309(1) | 0.273(6) | 0.278(5) | 0.308(2) | 0.289(3) | 0.236(8) | 0.249(7) | 0.16(9) | 0.146(11) | 0.149(10) | 0.286(4)
S4 | GPCR | 0.393(3) | 0.323(6) | 0.331(5) | 0.368(4) | 0.407(1.5) | 0.077(10) | 0.306(7) | 0.29(8) | 0.159(9) | 0.074(11) | 0.407(1.5)
S4 | IC | 0.339(4) | 0.194(9) | 0.327(5) | 0.347(3) | 0.352(2) | 0.086(10) | 0.264(6) | 0.251(7) | 0.196(8) | 0.067(11) | 0.356(1)
S4 | E | 0.228(3) | 0.074(9) | 0.221(5) | 0.224(4) | 0.235(2) | 0.063(10) | 0.1(7) | 0.112(6) | 0.09(8) | 0.017(11) | 0.239(1)
S4 | Luo | 0.132(3) | 0.018(10) | 0.096(6) | 0.106(5) | 0.175(2) | 0.061(7) | 0.035(8) | 0.03(9) | 0.127(4) | 0.002(11) | 0.182(1)
S4 | AveRank | 2.8 | 8 | 5.2 | 3.6 | 2.1 | 9 | 7 | 7.8 | 8 | 10.8 | 1.7
Summary | | 3.33* | 7.05* | 5.25* | 4.13* | 2.13* | 5.85* | 8.85* | 8.25* | 7.53* | 10.4* | 1.33
Table 3

AUC results in all prediction settings

Setting | Dataset | WkNNIR | MSCMF | NRLMF | GRGMF | MF2A | DRLSM | DTINet | NEDTP | NetMF | Multi2Vec | MDMF2A
S1 | NR | - | 0.882(4.5) | 0.882(4.5) | 0.891(2) | 0.884(3) | 0.879(6) | 0.797(9) | 0.846(7) | 0.818(8) | 0.788(10) | 0.892(1)
S1 | GPCR | - | 0.962(6) | 0.972(4) | 0.978(2) | 0.978(2) | 0.971(5) | 0.916(10) | 0.953(8) | 0.96(7) | 0.93(9) | 0.978(2)
S1 | IC | - | 0.982(6) | 0.989(2) | 0.988(4) | 0.989(2) | 0.981(7.5) | 0.938(10) | 0.981(7.5) | 0.985(5) | 0.97(9) | 0.989(2)
S1 | E | - | 0.961(8) | 0.981(4) | 0.982(3) | 0.983(2) | 0.964(7) | 0.839(10) | 0.97(5) | 0.966(6) | 0.944(9) | 0.984(1)
S1 | Luo | - | 0.922(7) | 0.951(3) | 0.947(4) | 0.966(2) | 0.941(5) | 0.894(9) | 0.929(6) | 0.917(8) | 0.861(10) | 0.97(1)
S1 | AveRank | - | 6.3 | 3.5 | 3 | 2.2 | 6.1 | 9.6 | 6.7 | 6.8 | 9.4 | 1.4
S2 | NR | 0.825(5.5) | 0.802(7) | 0.826(4) | 0.825(5.5) | 0.833(2) | 0.831(3) | 0.666(11) | 0.786(8) | 0.739(9) | 0.727(10) | 0.837(1)
S2 | GPCR | 0.914(4) | 0.882(7) | 0.913(5) | 0.924(2.5) | 0.924(2.5) | 0.867(8) | 0.858(9) | 0.885(6) | 0.852(10) | 0.811(11) | 0.925(1)
S2 | IC | 0.826(4) | 0.783(9) | 0.825(5) | 0.833(1) | 0.828(2) | 0.796(7) | 0.766(10) | 0.794(8) | 0.803(6) | 0.715(11) | 0.827(3)
S2 | E | 0.877(4) | 0.835(7) | 0.858(5) | 0.891(3) | 0.892(2) | 0.799(9) | 0.78(10) | 0.837(6) | 0.811(8) | 0.732(11) | 0.895(1)
S2 | Luo | 0.904(5) | 0.897(7) | 0.917(3) | 0.899(6) | 0.927(2) | 0.864(9) | 0.873(8) | 0.907(4) | 0.861(10) | 0.776(11) | 0.937(1)
S2 | AveRank | 4.5 | 7.4 | 4.4 | 3.6 | 2.1 | 7.2 | 9.6 | 6.4 | 8.6 | 10.8 | 1.4
S3 | NR | 0.82(3) | 0.786(7) | 0.825(2) | 0.845(1) | 0.819(4) | 0.798(6) | 0.756(8) | 0.726(9) | 0.712(10) | 0.703(11) | 0.805(5)
S3 | GPCR | 0.952(4) | 0.902(9) | 0.946(5) | 0.965(1) | 0.96(3) | 0.917(7) | 0.879(10) | 0.918(6) | 0.909(8) | 0.798(11) | 0.961(2)
S3 | IC | 0.958(5) | 0.941(7) | 0.96(4) | 0.965(3) | 0.967(1) | 0.942(6) | 0.907(10) | 0.939(8) | 0.938(9) | 0.866(11) | 0.966(2)
S3 | E | 0.935(5) | 0.881(9) | 0.936(4) | 0.943(3) | 0.944(2) | 0.883(8) | 0.841(10) | 0.918(6) | 0.911(7) | 0.815(11) | 0.948(1)
S3 | Luo | 0.835(5) | 0.826(9) | 0.828(8) | 0.84(4) | 0.901(2) | 0.801(10) | 0.829(7) | 0.86(3) | 0.831(6) | 0.633(11) | 0.902(1)
S3 | AveRank | 4.4 | 8.2 | 4.6 | 2.4 | 2.4 | 7.4 | 9 | 6.4 | 8 | 11 | 2.2
S4 | NR | 0.637(5) | 0.597(6) | 0.656(3) | 0.677(1) | 0.649(4) | 0.592(7) | 0.562(8) | 0.548(9) | 0.531(10) | 0.524(11) | 0.661(2)
S4 | GPCR | 0.871(4) | 0.798(8) | 0.866(5) | 0.89(1) | 0.886(3) | 0.405(11) | 0.803(7) | 0.816(6) | 0.721(9) | 0.581(10) | 0.887(2)
S4 | IC | 0.774(5) | 0.658(9) | 0.775(4) | 0.782(2) | 0.776(3) | 0.498(11) | 0.706(6) | 0.702(7) | 0.683(8) | 0.555(10) | 0.783(1)
S4 | E | 0.819(3) | 0.695(8) | 0.799(5) | 0.815(4) | 0.821(2) | 0.453(11) | 0.757(6) | 0.744(7) | 0.69(9) | 0.541(10) | 0.827(1)
S4 | Luo | 0.819(4) | 0.752(7) | 0.804(5) | 0.745(8) | 0.848(2) | 0.438(11) | 0.787(6) | 0.822(3) | 0.732(9) | 0.513(10) | 0.85(1)
S4 | AveRank | 4.2 | 7.6 | 4.4 | 3.2 | 2.8 | 10.2 | 6.6 | 6.4 | 9 | 10.2 | 1.4
Summary | | 4.37* | 7.38* | 4.23* | 3.05 | 2.38* | 7.73* | 8.7* | 6.48* | 8.1* | 10.35* | 1.6

The proposed MDMF2A is the best-performing model in most cases for both metrics, achieving the highest average rank in all prediction settings and statistically significantly outperforming all competitors, except GRGMF in terms of AUC. This demonstrates the effectiveness of our model in sufficiently exploiting the topology information embedded in the multiplex heterogeneous DTI network and optimizing the two area under the curve metrics. MF2A is the runner-up. Its inferiority to MDMF2A is mainly attributed to its ignorance of the high-order proximity captured by random walks and to the view-specific information loss caused by aggregating multi-type similarities. GRGMF, WkNNIR, NRLMF, DRLSM and MSCMF come next. They are outperformed by the proposed MDMF2A because they fail to capture the unique information provided by each view. The two graph embedding based DTI prediction models are usually inferior to the other DTI prediction approaches, because they generate embeddings in an unsupervised manner without exploiting the interaction information. Specifically, NEDTP, which uses the class-imbalance-resilient GBDT as its predicting classifier, outperforms DTINet, which employs a simple linear projection to estimate DTIs. Regarding the two general multiplex network embedding methods, NetMF is better than Multi2Vec, because the latter requires dichotomizing the edge weights, which wipes out the different influence of the connected nodes in the similarity subnetwork. In addition, averaging all per-layer embeddings does not distinguish the importance of each hyper-layer, unlike the holistic DeepWalk matrix used in NetMF, which contributes to the inferiority of Multi2Vec to NetMF as well.

There are some cases where MDMF2A does not achieve the best performance. Some baseline models, such as WkNNIR, NRLMF, GRGMF and MF2A, are better than MDMF2A on the NR dataset, implying that random walk-based embedding generation may not yield enough benefit on small datasets. Besides, on the Luo dataset under S2, MDMF2A is outperformed in terms of AUPR by WkNNIR and DRLSM, which incorporate a neighborhood-based interaction recovery procedure. In the Luo dataset, neighbor drugs are more likely to share the same interactions, which makes neighborhood-based interaction estimation effective for new drugs. Nevertheless, the AUC results of these two baselines are 3.7% and 8.4% lower than those of MDMF2A, respectively. Also, MF2A is slightly better than MDMF2A on the IC dataset in terms of AUC under S2 and S3, but the gap between them is tiny, e.g. 0.001. Finally, GRGMF achieves better AUC results than MDMF2A on the GPCR dataset under S3 and S4, as well as on the IC dataset under S2, mainly because GRGMF learns neighborhood information adaptively. However, it is worse than MDMF2A in terms of AUPR, which is more informative when evaluating a model under an extremely imbalanced class distribution (sparse interactions).

The advantage of MDMF2A is also observed in the comparison with deep learning-based DTI prediction models on the Luo dataset. As shown in Table 4, MDMF2A outperforms all competitors in terms of AUPR, achieving a 14% improvement over the best-performing competitor (DCFME). In terms of AUC, MDMF2A is only 1.1% lower than DTIP, and 5.3%, 3.6% and 3.9% higher than NeoDTI, DCFME and SupDTI, respectively. Although DTIP emphasizes AUC performance and slightly outperforms MDMF2A, it suffers a significant decline in AUPR. Compared with the deep learning competitors, MDMF2A takes full advantage of the information shared by high-order neighbors and explicitly optimizes AUPR and AUC, which are more effective than conventional cross-entropy and focal losses at identifying the less frequent interacting pairs, resulting in better performance.

Table 4

Results of MDMF2A and Deep Learning models on Luo dataset in S1

Metric | NeoDTI | DTIP | DCFME | SupDTI | MDMF2A
AUPR | 0.573(4) | 0.399(5) | 0.596(2) | 0.585(3) | 0.679(1)
AUC | 0.921(5) | 0.981(1) | 0.936(3) | 0.933(4) | 0.97(2)

To comprehensively investigate the proposed MDMF2A, we conduct an ablation study to demonstrate the effectiveness of its ensemble framework and all regularization terms, and analyze the sensitivity of three important parameters, i.e. |$r$|⁠, |$n_w$| and |$n_s$|⁠. More details can be found in Supplementary Sections A5–A6.

Discovery of novel DTIs

We examine the capability of MDMF2A to find novel DTIs not recorded in the Luo dataset. We do not consider the updated golden standard datasets, since they already include all recently validated DTIs collected from up-to-date databases. We split all non-interacting pairs into 10 folds, and obtain the predictions for each fold by training an MDMF2A model on all interacting pairs and the other nine folds of non-interacting ones. All non-interacting pairs are then ranked based on their predicted scores, and the top 10 pairs are selected as newly discovered DTIs, shown in Table 5. To further verify the reliability of these new interaction candidates, we search for supportive evidence in DrugBank (DB) [23] and DrugCentral (DC) [36]. As the Database column shows, 8 of the 10 new interactions are confirmed, demonstrating the success of MDMF2A in trustworthy new DTI discovery.

Table 5

Top 10 new DTIs discovered by MDMF2A from Luo’s datasets

Drug ID | Drug name | Target ID | Target name | Rank | Database
DB00829 | Diazepam | P48169 | GABRA4 | 1 | DB
DB01215 | Estazolam | P48169 | GABRA4 | 2 | DB
DB00580 | Valdecoxib | P23219 | PTGS1 | 3 | -
DB01367 | Rasagiline | P21397 | MAOA | 4 | DC
DB00333 | Methadone | P41145 | OPRK1 | 5 | DC
DB00363 | Clozapine | P21918 | DRD5 | 6 | DC
DB06216 | Asenapine | P21918 | DRD5 | 7 | DB
DB06800 | Methylnaltrexone | P41143 | OPRD1 | 8 | DC
DB00802 | Alfentanil | P41145 | OPRK1 | 9 | -
DB00482 | Celecoxib | P23219 | PTGS1 | 10 | DC

Conclusion

This paper proposed MDMF2A, a random walk and matrix factorization based model that predicts DTIs by effectively mining the topology information of the multiplex heterogeneous network built from diverse drug and target similarities. It integrates two base predictors that leverage our designed objective function, which encourages the learned embeddings to preserve both the holistic network topology and layer-specific structures. The two base models utilize the convex AUPR and AUC losses in their objectives, enabling MDMF2A to simultaneously optimize two crucial metrics of the DTI prediction task. We conducted extensive experiments on five DTI datasets under various prediction settings. The results affirmed the superiority of the proposed MDMF2A over competing DTI prediction methods. Furthermore, the practical ability of MDMF2A to discover novel DTIs was supported by evidence from online biological databases.

In the future, we plan to extend our model to handle attributed DTI networks, which include both topological information and node features for drugs and targets.

Key Points
  • MDMF incorporates DeepWalk matrix decomposition over multiple hyper-layers and layer-specific graph Laplacian regularization to learn robust node representations that preserve both global and view-specific topology (a generic form of this regularizer is sketched below this list).

  • MDMF integrates multiplex heterogeneous network representation learning and DTI prediction into a unified optimization framework, learning latent features in a supervised manner and implicitly recovering possibly missing interactions.

  • MDMF2A, an ensemble instantiation of MDMF, optimizes both the AUPR and AUC metrics.

  • Our method achieves statistically significant improvements over state-of-the-art methods under various prediction settings and can discover reliable new DTIs.
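
For readers unfamiliar with graph Laplacian regularization, the generic form below (our illustrative notation, not necessarily the paper's exact weighting) shows why minimizing such a term enforces layer-specific local invariance:

```latex
% Generic layer-specific graph Laplacian regularizer (illustrative).
% For each similarity layer l with similarity matrix S^{(l)},
% L^{(l)} = D^{(l)} - S^{(l)} is its graph Laplacian and U stacks the
% node embeddings u_i; minimizing R(U) pulls together the embeddings
% of nodes that are highly similar in every layer.
\[
  \mathcal{R}(U) \;=\; \sum_{l} \operatorname{tr}\!\left( U^{\top} L^{(l)} U \right)
  \;=\; \frac{1}{2} \sum_{l} \sum_{i,j} S^{(l)}_{ij}\, \lVert u_i - u_j \rVert_2^2 .
\]
```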

Data and code availability

The source code and data are available at https://github.com/intelligence-csd-auth-gr/DTI_MDMF2A

Funding

This work was supported by the China Scholarship Council (CSC) [201708500095]; the French National Research Agency (ANR) under the JCJC project GraphIA [ANR-20-CE23-0009-01].

Author Biographies

Bin Liu is a lecturer at the Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications. He received his PhD in computer science from the Aristotle University of Thessaloniki. His research interests include multi-label learning and bioinformatics.

Dimitrios Papadopoulos is a PhD student at the School of Informatics, Aristotle University of Thessaloniki. His research interests include supervised machine learning, graph mining, and drug discovery.

Fragkiskos D. Malliaros is an Assistant Professor at Paris-Saclay University, CentraleSupélec and associate researcher at Inria Saclay. His research interests include graph mining, machine learning and graph-based information extraction.

Grigorios Tsoumakas is an Associate Professor at the Aristotle University of Thessaloniki. His research interests include machine learning (ensembles, multi-target prediction) and natural language processing (semantic indexing, keyphrase extraction, summarization).

Apostolos N. Papadopoulos is an Associate Professor at the School of Informatics, Aristotle University of Thessaloniki. His research interests include data management, data mining and big data analytics.

References

1. Bagherian M, Sabeti E, Wang K, et al. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2021;22(1):247–69.
2. Ezzat A, Zhao P, Min W, et al. Drug-target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans Comput Biol Bioinform 2017;14(3):646–56.
3. Ding Y, Tang J, Guo F. Identification of drug-target interactions via dual laplacian regularized least squares with multiple kernel fusion. Knowl Based Syst 2020;204:106254.
4. An Q, Liang Y. A heterogeneous network embedding framework for predicting similarity-based drug-target interactions. Brief Bioinform 2021;22(6):1–10.
5. Xuan P, Yu Z, Cui H, et al. Integrating multi-scale neighbouring topologies and cross-modal similarities for drug-protein interaction prediction. Brief Bioinform 2021;22(5):bbab119.
6. Liu B, Pliakos K, Vens C, et al. Drug-target interaction prediction via an ensemble of weighted nearest neighbors with interaction recovery. Appl Intell 2022;52(4):3705–27.
7. Liu Y, Wu M, Miao C, et al. Neighborhood regularized logistic matrix factorization for drug-target interaction prediction. PLoS Comput Biol 2016;12(2):e1004760.
8. Pliakos K, Vens C, Tsoumakas G. Predicting drug-target interactions with multi-label classification and label partitioning. IEEE/ACM Trans Comput Biol Bioinform 2021;18(4):1596–607.
9. Zheng X, Ding H, Mamitsuka H, et al. Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In: Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. Chicago, IL, USA: ACM, 2013, 1025–33.
10. Liu B, Tsoumakas G. Optimizing area under the curve measures via matrix factorization for predicting drug-target interaction with multiple similarities. arXiv 2021;abs/2105.01545:1–14.
11. Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches. Bioinformatics 2018;34(7):1164–73.
12. Wan F, Hong L, Xiao A, et al. NeoDTI: neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions. Bioinformatics 2019;35(1):104–11.
13. Chen J, Liang Z, Cheng K, et al. Predicting drug-target interaction via self-supervised learning. IEEE/ACM Trans Comput Biol Bioinform 2022;PP:1.
14. Chen R, Xia F, Hu B, et al. Drug-target interactions prediction via deep collaborative filtering with multiembeddings. Brief Bioinform 2022;23(2):bbab520.
15. Dai H, Li H, Tian T, et al. Adversarial attack on graph structured data. In: Proc. Int. Conf. Mach. Learn. Stockholm, Sweden: PMLR, 2018, 1115–24.
16. Pliakos K, Vens C. Drug-target interaction prediction with tree-ensemble learning and output space reconstruction. BMC Bioinform 2020;21(1):1–11.
17. Luo Y, Zhao X, Zhou J, et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 2017;8(1):1–13.
18. Qiu J, Dong Y, Ma H, et al. Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In: Proc. ACM Int. Conf. Web Search Data Min. Marina Del Rey, CA, USA: ACM, 2018, 459–67.
19. Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. In: Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min. New York, NY, USA: ACM, 2014, 701–10.
20. Pahikkala T, Airola A, Pietilä S, et al. Toward more realistic drug-target interaction predictions. Brief Bioinform 2015;16(2):325–37.
21. Yamanishi Y, Araki M, Gutteridge A, et al. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 2008;24(13):i232–40.
22. Kanehisa M, Furumichi M, Tanabe M, et al. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res 2017;45(D1):D353–61.
23. Wishart DS, Feunang YD, Guo AC, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res 2018;46(D1):D1074–82.
24. Mendez D, Gaulton A, Bento AP, et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 2019;47(D1):D930–40.
25. Hattori M, Okuno Y, Goto S, et al. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J Am Chem Soc 2003;125(39):11853–65.
26. Takarabe M, Kotera M, Nishimura Y, et al. Drug target prediction using adverse event report systems: a pharmacogenomic approach. Bioinformatics 2012;28(18):i611–8.
27. Kuhn M, Letunic I, Jensen LJ, et al. The SIDER database of drugs and side effects. Nucleic Acids Res 2016;44(D1):D1075–9.
28. Nascimento ACA, Prudêncio RBC, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC Bioinform 2016;17(1):1–16.
29. Prasad TSK, Goel R, Kandasamy K, et al. Human protein reference database–2009 update. Nucleic Acids Res 2009;37(suppl_1):D767–72.
30. Davis AP, Murphy CG, Johnson R, et al. The comparative toxicogenomics database: update 2013. Nucleic Acids Res 2013;41(D1):D1104–14.
31. Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 2011;12:2121–59.
32. Revaud J, Almazan J, Rezende R, et al. Learning with average precision: training image retrieval with a listwise loss. In: Proc. IEEE Int. Conf. Comput. Vis. Seoul, South Korea: IEEE, 2019, 5106–15.
33. Zhang ZC, Zhang XF, Wu M, et al. A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks. Bioinformatics 2020;36(11):3474–81.
34. Teng X, Liu J, Li L. A synchronous feature learning method for multiplex network embedding. Inform Sci 2021;574:176–91.
35. Benavoli A, Corani G, Mangili F. Should we really use post-hoc tests based on mean-ranks? J Mach Learn Res 2016;17(1):1–10.
36. Avram S, Bologa CG, Holmes J, et al. DrugCentral 2021 supports drug discovery and repositioning. Nucleic Acids Res 2021;49(D1):D1160–9.

