Abstract

Motivation

Drug–drug interactions (DDIs) can cause unexpected adverse drug reactions, affecting treatment efficacy and patient safety. The demand for computational methods to predict DDIs has been growing, since the potential risks of drug combinations must be identified in advance. Although several deep learning methods have recently been proposed to predict DDIs, many overlook feature learning based on interactions between the substructures of drug pairs.

Results

In this work, we introduce a molecular Substructure-based Dual Attention Feature Learning framework (MSDAFL), designed to fully utilize the information between substructures of drug pairs to enhance the performance of DDI prediction. We employ a self-attention module to obtain a set number of self-attention vectors, which are associated with various substructural patterns of the drug molecule itself, while also extracting interaction vectors representing inter-substructure interactions between drugs through an interactive attention module. Subsequently, an interaction module based on cosine similarity is used to further capture the interactive characteristics between the self-attention vectors of drug pairs. We also perform normalization after the interaction feature extraction to mitigate overfitting. Under three-fold cross-validation, the MSDAFL model achieved average precision scores of 0.9707, 0.9991, and 0.9987, and area under the receiver operating characteristic curve scores of 0.9874, 0.9934, and 0.9974 on three datasets, respectively. In addition, the results of five-fold cross-validation and cross-dataset experiments also indicate that MSDAFL performs well in predicting DDIs.

Availability and implementation

Data and source codes are available at https://github.com/27167199/MSDAFL.

1 Introduction

Drug–drug interactions (DDIs) can cause unexpected adverse drug reactions, affecting treatment efficacy and patient safety (Vilar et al. 2014). DDIs refer to interactions that arise when two or more drugs are administered together, including changes in drug properties and the occurrence of toxic side effects (Sun et al. 2016). Therefore, research on DDI prediction is of great practical importance. However, traditional biological and pharmacological methods are costly, time-consuming, and labor-intensive (Shao and Zhang 2013).

Machine learning offers a fresh avenue for accurately predicting DDIs (Mei and Zhang 2021). Methods based on feature similarity posit that drugs sharing similar attributes often exhibit comparable reaction patterns, relying largely on drug properties such as fingerprints (Vilar et al. 2013), chemical structures (Takeda et al. 2017), pharmacological phenotypes (Li et al. 2015), and RNA profiles (Li et al. 2022). Enhancements in model efficacy are achieved by integrating various features. For instance, the DDI-IS-SL model forecasts DDIs through a blend of integrated similarity measures and semi-supervised learning techniques (Yan et al. 2020). Despite their advancements, these feature similarity-based methods often overlook the structural details of drugs, and their feature selection heavily depends on specialized knowledge and experience.

Graph neural networks (GNNs) have been widely applied to analyze the chemical structures of drugs and forecast DDIs. Contemporary GNN methodologies fall into two main types. The first type embeds features directly from the molecular graphs of drugs, providing a straightforward way to encapsulate graph-based data (Gilmer et al. 2017). In this approach, atoms within the molecular graph are treated as nodes, with chemical bonds serving as the connecting edges. This setup allows the molecular graph to be embedded by learning features of individual atoms and the interactions conveyed through the chemical bonds. For instance, SSI-DDI deconstructs the DDI prediction task between two drugs to pinpoint pairwise interactions among their respective substructures (Nyamabo et al. 2021). DSN-DDI is a dual-view drug representation learning network specifically engineered to concurrently learn drug substructures from individual drugs and drug pairs (Li et al. 2023). The second type leverages existing drug interaction networks, where drugs are nodes and their interactions are edges, treating DDI prediction as link prediction within these networks. KGNN applies a knowledge-based GNN approach to extract relational data from knowledge graphs to enhance DDI prediction (Lin et al. 2020). MIRACLE employs multi-view graph contrastive representation learning to simultaneously capture the structural interplay and interactions within and between molecules, enhancing the prediction of DDIs (Wang et al. 2021). Lastly, HTCL-DDI applies a hierarchical triple-view contrastive learning framework for predicting DDIs (Zhang et al. 2023). However, this second type exhibits a common limitation: such methods lack inductive capability, cannot accommodate novel drugs absent from the interaction network, and struggle to maintain diverse types of associations between entities.
These diverse approaches illustrate the adaptive use of GNNs in addressing the complexities of predicting drug interactions, combining structural and relational data for improved predictive accuracy.

While current deep learning approaches have demonstrated promising results in predicting DDIs, there remains considerable potential for further enhancement. Firstly, methods that rely solely on a single self-attention mechanism for feature extraction may not comprehensively characterize drug information, potentially missing complex interactions between different substructures. Additionally, integrating multiple sources of feature information could introduce redundant features and noise, unnecessarily complicating the model. Secondly, overfitting during model training significantly impacts prediction results, often resulting in biased predictions. The main contributions of this work are outlined as follows:

  • We designed a new Molecular Substructure-based Dual Attention Feature Learning framework for predicting DDIs (MSDAFL). This framework leverages both self-attention and interactive attention mechanisms to effectively extract and process interaction information between drug substructures, enhancing the accuracy of DDI predictions.

  • To uncover the hidden features of interactions between drug substructures, we computed the cosine similarity matrix. This approach has shown that these similarity vectors significantly contribute to the accuracy of predicting DDIs.

  • Additionally, to reduce overfitting during model training, we adopted a normalization strategy. This not only retains the essential interaction features but also improves the predictability and reliability of DDI outcomes.

2 Materials and methods

2.1 Dataset

To evaluate the scalability and robustness of MSDAFL, we test our model on three public datasets, which vary in scale and density and have been widely used in previous studies. The scale of a dataset is determined by the number of drugs it includes. Following previous studies, we treat the observed DDIs as positive samples and randomly sample non-existing drug pairs to generate negative samples. We perform stratified splitting to divide all drug pairs into a training set, a validation set, and a testing set in a ratio of 6:2:2 (three-fold cross-validation) or 8:1:1 (five-fold cross-validation), and run the experiments on three and five random folds, respectively. As shown in Supplementary Table S1, the statistics of the preprocessed datasets are as follows:

  • ZhangDDI dataset (Zhang et al. 2017) is of small-scale, consisting of 544 drugs and 45 720 pairwise DDIs.

  • ChCh-Miner dataset (Ma et al. 2018) is of medium-scale, consisting of 997 drugs and 21 486 pairwise DDIs.

  • DeepDDI dataset (Ryu et al. 2018) is of large-scale, consisting of 1704 drugs and 191 870 pairwise DDIs.
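The negative-sampling scheme described above can be sketched as follows; this is a minimal illustration in which the drug IDs and the helper name are hypothetical, not part of the released code.

```python
import random

def sample_negatives(drugs, positive_pairs, n_negatives, seed=0):
    """Randomly sample drug pairs that are not observed DDIs (negative samples)."""
    rng = random.Random(seed)
    positives = {frozenset(p) for p in positive_pairs}
    negatives = set()
    while len(negatives) < n_negatives:
        d1, d2 = rng.sample(drugs, 2)       # draw a random unordered drug pair
        pair = frozenset((d1, d2))
        if pair not in positives:           # keep it only if it is not a known DDI
            negatives.add(pair)
    return [tuple(sorted(p)) for p in negatives]

drugs = ["DB0001", "DB0002", "DB0003", "DB0004"]
positives = [("DB0001", "DB0002"), ("DB0003", "DB0004")]
negs = sample_negatives(drugs, positives, n_negatives=2)
```

In practice the sampled negatives would then be pooled with the positives before the stratified 6:2:2 or 8:1:1 split.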

2.2 Problem formulation

The DDI prediction task is formulated as a binary classification problem aimed at discerning the existence of interactions between pairs of drugs. A drug’s molecular structure can be abstractly represented by a graph G, where X ∈ R^{N×d} denotes the node feature matrix and A ∈ R^{N×N} denotes the adjacency matrix. Within molecular graphs, nodes correspond to atoms, and edges represent chemical bonds between them. GNNs primarily employ the Message Passing mechanism, which integrates information from neighboring nodes to update their representations (Gilmer et al. 2017). Prominent variants of GNNs include Graph Convolutional Networks (GCNs) (Kipf and Welling 2016), Graph Attention Networks (GATs) (Velickovic et al. 2017), and Graph Isomorphism Networks (GINs) (Xu et al. 2018). In this study, we adopt GIN as the foundational architecture for our model. A detailed description of how to construct the node feature matrix X is given in Supplementary Section S2. In the context of DDI prediction, given the adjacency matrix A representing the molecular structure graph G and the node feature matrix X, the objective is to derive a predictive function f(d1, d2) → [0, 1] that estimates the likelihood of interaction between any pair of drugs d1 and d2.

2.3 Overview of MSDAFL

The framework of our model is depicted in Fig. 1. In the GNN encoder, we use RDKit to convert drug SMILES sequences into molecular graphs, which are then encoded using GIN. Subsequently, we employ two Transformer-like encoders: a self-attention mechanism encoder and an interaction attention mechanism encoder, both equipped with learnable pattern vectors, to compress the graphs into M representative vectors. Within the GSAT encoder, cosine similarity is computed for each pair of representative vectors from two drugs, resulting in an M×M similarity matrix. After flattening the similarity matrix, the resulting vectors encapsulate rich features of drug interactions. Simultaneously, in the interaction attention mechanism encoder GIAT, drug pair features are obtained after the interaction attention mechanism to derive feature matrices O1 and O2. These matrices undergo average pooling and standardization, yielding vector pairs containing interaction features of substructures between drug pairs. Finally, the vectors obtained from the GSAT and GIAT encoders are concatenated and input into an MLP (Multilayer Perceptron) layer to produce the final prediction.
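The data flow above can be traced in terms of tensor shapes. The sketch below uses the reported dimensions (d = 128 embedding size, M = 60 representative vectors) but random placeholders instead of the learned encoders, so it only shows how the pieces fit together, not the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
N1, N2 = 20, 30   # number of atoms in drug 1 and drug 2 (illustrative)
d, M = 128, 60    # embedding dimension and number of pattern vectors

# GNN encoder output: one d-dimensional vector per atom
h1 = rng.standard_normal((N1, d))
h2 = rng.standard_normal((N2, d))

# GSAT stage: each drug is compressed into M representative vectors
# (random placeholders here; the real values come from the attention encoder)
rep1 = rng.standard_normal((M, d))
rep2 = rng.standard_normal((M, d))

# cosine-similarity matrix between representative vectors, then flattened
S = (rep1 / np.linalg.norm(rep1, axis=1, keepdims=True)) @ \
    (rep2 / np.linalg.norm(rep2, axis=1, keepdims=True)).T
s = S.reshape(-1)                       # (M*M,)

# GIAT stage: one interaction vector per drug after pooling and normalization
inter1 = rng.standard_normal(d)
inter2 = rng.standard_normal(d)

# MLP input: concatenation of similarity vector and interaction vectors
mlp_input = np.concatenate([s, inter1, inter2])   # (M*M + 2*d,) = (3856,)
```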

Figure 1.

Overview of the proposed MSDAFL framework. The overall framework includes a GNN encoder, a self-attention mechanism encoder (GSAT encoder), and an interactive attention mechanism encoder (GIAT encoder). The features from GSAT and GIAT are concatenated and fed into an MLP for prediction, yielding the final drug–drug interaction prediction results.

2.4 GSAT encoder

As illustrated in Fig. 2, the GSAT module utilizes a self-attention mechanism to derive self-attention scores between individual queries and keys, subsequently using these scores to distill information from the corresponding values. The formulation can be articulated as follows:
Q = Q0WQ, K = K0WK, V = V0WV, (1)
Attention(Q, K, V) = softmax(QK^T / √d)V, (2)
O = ReLU(Attention(Q, K, V)W0), (3)
where Q, K, and V denote the queries, keys, and values, respectively, with d representing the embedding dimension. The function ReLU(·) denotes the Rectified Linear Unit activation function. WQ, WK, WV, and W0 denote learnable weights. The M learnable queries (patterns) Q0 ∈ R^{M×d} are randomly initialized. We use the GNN-encoded node representations as our keys and values, so K0 and V0 are given by the following formulation:
K0 = V0 = [h1^(L); h2^(L); …; hN^(L)], (4)
where L denotes the number of GNN layers, hi(L) represents the representation of node i at the L-th layer, and N denotes the number of nodes. Ultimately, we obtain M representative vectors for each drug from Equation (3), corresponding to M substructure patterns.
O = [o1; o2; …; oM] ∈ R^{M×d}, (5)
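As a concrete illustration, the pattern-based self-attention pooling described above can be sketched in numpy. The standard scaled dot-product form is assumed here; the exact placement of the ReLU and projections in MSDAFL may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gsat_pool(H, Q0, WQ, WK, WV, W0):
    """Compress N node embeddings H (N, d) into M representative vectors (M, d)."""
    d = H.shape[1]
    Q = Q0 @ WQ                           # (M, d) learnable pattern queries
    K = H @ WK                            # (N, d) keys from GNN node representations
    V = H @ WV                            # (N, d) values from GNN node representations
    att = softmax(Q @ K.T / np.sqrt(d))   # (M, N) attention over atoms
    return np.maximum(att @ V @ W0, 0.0)  # ReLU, (M, d)

rng = np.random.default_rng(0)
N, d, M = 25, 16, 4                       # toy sizes, not the paper's hyperparameters
H = rng.standard_normal((N, d))
Q0 = rng.standard_normal((M, d))
WQ, WK, WV, W0 = (rng.standard_normal((d, d)) for _ in range(4))
O = gsat_pool(H, Q0, WQ, WK, WV, W0)      # M representative vectors
```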
Figure 2.

Details of the GSAT encoder. The drug features obtained through the GNN encoder are processed by the self-attention mechanism to generate different feature matrices for the two drugs. The cosine similarity matrix S is then computed and, after flattening, yields the vector s containing the interaction features.

Once we have obtained the representative vectors, cosine similarity is used to measure each pair of representative vectors from the two drugs, thereby generating a similarity matrix S ∈ R^{M×M}. The cosine similarity is computed as follows:
Sij = (oi^(1) · oj^(2)) / (‖oi^(1)‖ ‖oj^(2)‖), (6)
where oi^(1) and oj^(2) denote the i-th and j-th representative vectors of the first and second drug, respectively.

The similarity module non-parametrically characterizes interactions among the substructures of the two drugs. The elements of S denote the strength of interaction, thereby enhancing the interpretability of prediction outcomes.
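Equation (6) and the flattening step can be computed directly; the argmax at the end illustrates the interpretability claim, since the largest entry of S points to the most strongly interacting substructure pair (the toy sizes below are illustrative).

```python
import numpy as np

def cosine_similarity_matrix(O1, O2, eps=1e-8):
    """S[i, j] = cosine similarity between representative vector i of drug 1
    and representative vector j of drug 2; inputs are (M, d) matrices."""
    O1n = O1 / (np.linalg.norm(O1, axis=1, keepdims=True) + eps)
    O2n = O2 / (np.linalg.norm(O2, axis=1, keepdims=True) + eps)
    return O1n @ O2n.T                              # (M, M)

rng = np.random.default_rng(1)
O1 = rng.standard_normal((5, 8))
O2 = rng.standard_normal((5, 8))
S = cosine_similarity_matrix(O1, O2)
s = S.flatten()                                     # interaction feature vector, length M*M
i, j = np.unravel_index(np.argmax(S), S.shape)      # strongest substructure pair
```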

2.5 GIAT encoder

To more closely examine the substructural effects in drug pairs, we use an encoder with a cross-attention mechanism to learn the interaction patterns between drug pairs, as shown in Fig. 3.

Figure 3.

Details of the GIAT encoder. A cross-attention mechanism processes the drug matrices derived from the GNN encoder, producing distinct feature matrices O1 and O2 for each drug pair. Average pooling and normalization then yield the interaction vectors inter1 and inter2 for these drug pairs.

The representations x1 and x2 are derived from the GIN-encoded representations:
x1 = GIN(G1), x2 = GIN(G2), (7)
where G1 and G2 denote the molecular graphs of the respective drugs. The GIN layer is formulated as:
hi^(l) = MLP((1 + ε) · hi^(l−1) + Σ_{j∈N(i)} hj^(l−1)), (8)
where N(i) denotes the neighbors of node i, MLP represents a multi-layer perceptron, and ε is a learnable scalar.
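A minimal dense-adjacency sketch of one GIN update as in Equation (8); real implementations use sparse message passing and a multi-layer MLP, both simplified here to keep the example short.

```python
import numpy as np

def gin_layer(H, A, mlp, eps=0.0):
    """One GIN update: h_i <- MLP((1 + eps) * h_i + sum over neighbors of h_j)."""
    agg = (1.0 + eps) * H + A @ H        # A is the (N, N) adjacency matrix
    return mlp(agg)

rng = np.random.default_rng(0)
N, d = 5, 8
H = rng.standard_normal((N, d))          # initial node (atom) features
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:   # a simple path graph
    A[i, j] = A[j, i] = 1.0
W = rng.standard_normal((d, d))
mlp = lambda X: np.maximum(X @ W, 0.0)   # stand-in one-layer MLP with ReLU
H_next = gin_layer(H, A, mlp)            # updated node representations
```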
We first compute the keys and values of each drug. K1 and V1 are converted into dense batch format, incorporating the batch indices batch1 for alignment; K2 and V2 are processed likewise with batch2.
K1 = DenseBatch(x1W1K, batch1), V1 = DenseBatch(x1W1V, batch1), (9)
K2 = DenseBatch(x2W2K, batch2), V2 = DenseBatch(x2W2V, batch2), (10)
Queries for each drug are generated by tiling a set of learnable query patterns and transforming them through a weight matrix WQ:
Q1 = Tile(Q, K1.size(0))WQ, (11)
Q2 = Tile(Q, K2.size(0))WQ, (12)
where K1 and V1 are the keys and values of the first drug, obtained by applying the linear transformations W1K and W1V to the input x1 and converting the result into dense batch format using the batch indices batch1; K2 and V2 are obtained analogously from x2 with batch2. Tile(Q, ·) duplicates the learnable query matrix Q along its first dimension to match the batch size of the corresponding keys, and Q1 and Q2 denote the resulting query matrices after the linear transformation WQ.
Attention scores are computed between the queries of one drug and the keys of the other drug, and vice versa. The formula for computing attention scores is as follows:
A1 = softmax(Q1K2^T / √d), A2 = softmax(Q2K1^T / √d), (13)
where d is the dimensionality of the key vectors. Furthermore, a threshold parameter λ is applied to select the top attention values, filtering out less relevant interactions:
Ã1 = A1 ⊙ I(A1 ≥ λ), Ã2 = A2 ⊙ I(A2 ≥ λ), (14)
where I(·) denotes the element-wise indicator function and ⊙ denotes element-wise multiplication.
The final outputs are calculated by multiplying the filtered attention matrices by the value matrices of the opposite drug and applying a linear transformation followed by a ReLU activation:
O1 = ReLU(Ã1V2W0), (15)
O2 = ReLU(Ã2V1W0), (16)
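The cross-attention and thresholding steps of Equations (13)–(16) can be sketched as follows; the λ value is illustrative and the dense-batch bookkeeping is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(Q1, K2, V2, W0, lam=0.02):
    """Queries of drug 1 attend to keys/values of drug 2; attention weights
    below the threshold lam are zeroed out before aggregation."""
    d = K2.shape[1]
    att = softmax(Q1 @ K2.T / np.sqrt(d))      # (M, N2) cross-attention scores
    att = np.where(att >= lam, att, 0.0)       # filter weak interactions
    return np.maximum(att @ V2 @ W0, 0.0)      # ReLU, (M, d)

rng = np.random.default_rng(0)
M, N2, d = 4, 30, 16                           # toy sizes
Q1 = rng.standard_normal((M, d))               # tiled, projected queries of drug 1
K2 = rng.standard_normal((N2, d))              # keys of drug 2
V2 = rng.standard_normal((N2, d))              # values of drug 2
W0 = rng.standard_normal((d, d))
O1 = cross_attention(Q1, K2, V2, W0)           # feature matrix for drug 1
```

Swapping the roles of the two drugs in the same function yields O2.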

2.6 Normalization module

To mitigate overfitting after obtaining O1 and O2, we perform pooling and normalization on O1 and O2. The computations are as follows:
O¯1 = AvgPool(O1), O¯2 = AvgPool(O2), (17)
where AvgPool(·) denotes the average pooling operation.
Then, we apply normalization to O¯1 and O¯2. The computations are as follows:
inter1 = (O¯1 − μO¯1) / σO¯1, inter2 = (O¯2 − μO¯2) / σO¯2, (18)

The variables inter1 and inter2 represent the interaction vector representations of different drugs. The symbols μO¯1 and μO¯2 represent the mean values of O¯1 and O¯2, respectively. The symbols σO¯1 and σO¯2 represent the standard deviations of O¯1 and O¯2, respectively.
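Equations (17) and (18) amount to a mean over the representative vectors followed by standardization; pooling over the substructure dimension is an assumption in this sketch.

```python
import numpy as np

def pool_and_normalize(O, eps=1e-8):
    """Average-pool an (M, d) interaction matrix over its M rows, then
    standardize the pooled vector to zero mean and unit variance."""
    pooled = O.mean(axis=0)                            # (d,)
    return (pooled - pooled.mean()) / (pooled.std() + eps)

rng = np.random.default_rng(0)
O1 = rng.standard_normal((60, 128))                    # M = 60, d = 128
inter1 = pool_and_normalize(O1)                        # standardized interaction vector
```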

2.7 MLP layer

Finally, the similarity matrix S is flattened, concatenated with the representations of the two drugs, and fed into an MLP prediction layer.
s = Flatten(S), (19)
ŷ = MLP(s ∥ inter1 ∥ inter2), (20)
where ∥ denotes concatenation, and inter1 and inter2 denote the representations of drug pairs obtained from the GIAT encoder. We utilize Binary Cross Entropy loss as our loss function, formulated as follows:
L = −(1/n) Σ_{i=1}^{n} [yi log σ(ŷi) + (1 − yi) log(1 − σ(ŷi))], (21)
where ŷi is the output of the i-th drug pair, yi ∈ {0,1} is the label of the i-th drug pair, σ(·) is the sigmoid function, and n is the number of drug pairs.
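The loss of Equation (21) is standard binary cross-entropy over sigmoid-transformed outputs, sketched below with a small numerical-stability epsilon added for the logs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(logits, labels, eps=1e-12):
    """Binary cross-entropy averaged over n drug pairs."""
    p = sigmoid(np.asarray(logits, dtype=float))       # predicted probabilities
    y = np.asarray(labels, dtype=float)                # 0/1 ground-truth labels
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

loss = bce_loss([2.0, -1.5, 0.3], [1, 0, 1])           # ~0.2942
```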

3 Experimental results

3.1 Evaluation metrics and experimental setup

In our experimental evaluation, we chose four metrics: area under the receiver operating characteristic curve (AUROC), average precision (AP), F1-score (F1), and accuracy (ACC), to comprehensively assess the model’s performance in predicting DDIs. To ensure the reliability of our results and mitigate the impact of random variability, each experiment is conducted five times, and we report the mean values of these metrics. For a detailed description of the four metrics, please refer to Supplementary Section S3.
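AUROC, ACC, and F1 can be computed directly from scores and labels (AP is usually taken from a library such as scikit-learn); a self-contained numpy sketch with a toy example:

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) formulation."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    # count positive/negative pairs where the positive outranks the negative
    wins = (pos[:, None] > neg[None, :]).sum() \
         + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def accuracy(scores, labels, threshold=0.5):
    return float(np.mean((np.asarray(scores) >= threshold) == np.asarray(labels)))

def f1(scores, labels, threshold=0.5):
    pred = np.asarray(scores) >= threshold
    y = np.asarray(labels).astype(bool)
    tp = np.sum(pred & y)
    fp = np.sum(pred & ~y)
    fn = np.sum(~pred & y)
    return 2 * tp / (2 * tp + fp + fn)

scores = [0.9, 0.8, 0.3, 0.2]   # predicted interaction probabilities
labels = [1, 1, 0, 1]           # ground-truth DDI labels
```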

We conducted all experiments on an Ubuntu release 20.04 system utilizing an NVIDIA A40-PCIE GPU card with 48 GB of memory. To ensure equitable performance comparisons, all models were implemented in PyTorch. Our model was trained for 300 epochs, with a learning rate of 0.001 for the first 150 epochs and 0.0001 for the subsequent 150 epochs. The batch size was set to 512, and the node embedding dimension was fixed at 128. We set the number of representative vectors M to 60 and employed five layers in the GIN structure. Additionally, we initialized the model parameters using Xavier initialization (Glorot and Bengio 2010) and optimized them using the Adam optimizer (Kingma and Ba 2014).
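The two-stage learning-rate schedule is simple enough to express directly; in an actual training loop this value would be assigned to the optimizer each epoch.

```python
def learning_rate(epoch, high=1e-3, low=1e-4, switch=150):
    """Step schedule: 0.001 for the first 150 epochs, 0.0001 afterwards."""
    return high if epoch < switch else low

# learning rate used at each of the 300 training epochs
schedule = [learning_rate(e) for e in range(300)]
```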

3.2 Comparison model description

In comparative experiments, we compare ten state-of-the-art methods: MR-GNN, GCN-BMP, EPGCN-DS, DeepDrug, MIRACLE, SSI-DDI, CSGNN, DeepDDS, DSN-DDI, and HTCL-DDI. Brief introductions follow:

  • MR-GNN (Xu et al. 2019) employs a multi-resolution architecture to capture local features of each graph and extract interaction features between pairwise graphs.

  • GCN-BMP (Chen et al. 2020) leverages GNN for DDI prediction by employing an end-to-end graph representation learning framework.

  • EPGCN-DS (Sun et al. 2020) detects DDIs from molecular structures using an encoder with expressive GCN layers and a decoder that outputs the probability of DDI.

  • DeepDrug (Cao et al. 2020) employs residual graph convolutional networks (RGCNs) along with convolutional networks (CNNs) to enhance the accuracy of DDI prediction.

  • MIRACLE (Wang et al. 2021) offers a multi-view framework that simultaneously captures the intra-view molecular structure and the inter-view DDIs between molecules.

  • SSI-DDI (Nyamabo et al. 2021) deconstructs the DDI prediction task between two drugs to pinpoint pairwise interactions among their respective substructures.

  • CSGNN (Zhao et al. 2021) incorporates a mix-hop neighborhood aggregator into a GNN to capture high-order dependencies in DDI networks and utilizes a contrastive self-supervised learning task as a regularizer.

  • DeepDDS (Wang et al. 2022) is a deep learning model that employs GNNs and an attention mechanism to identify synergistic drug combinations.

  • DSN-DDI (Li et al. 2023) is a dual-view drug representation learning network designed to learn drug substructures from individual drugs and drug pairs simultaneously.

  • HTCL-DDI (Zhang et al. 2023) is a hierarchical triple-view contrastive learning framework for predicting DDIs.

3.3 Model performance comparison

In our study, we compared MSDAFL with 10 competitive DDI prediction models across three datasets of varying scales, using four widely adopted evaluation metrics (AUROC, AP, F1, and ACC) to assess their predictive performance. Table 1 summarizes the experimental outcomes of MSDAFL and the baseline methods across these datasets with training, validation, and testing sets in a ratio of 6:2:2. MSDAFL consistently demonstrated superior performance across most evaluation metrics and datasets. On the ZhangDDI dataset, HTCL-DDI achieved superior results in the F1 and ACC metrics by leveraging diverse view relationships and integrating multi-view features for DDI prediction. On the ChCh-Miner and DeepDDI datasets, MSDAFL showed substantial improvements over HTCL-DDI. For instance, on DeepDDI, MSDAFL improved AUROC by approximately 5% and ACC by around 6%. SSI-DDI deconstructs the DDI prediction task between drug pairs to identify pairwise interactions among their respective substructures, while DSN-DDI is a dual-view drug representation learning network designed to simultaneously learn drug substructures from individual drugs and drug pairs. MSDAFL comprehensively outperforms SSI-DDI and DSN-DDI across all metrics on the three datasets. As a multi-attention network framework, MSDAFL effectively applies various attention mechanisms to predict DDIs, processing drugs from different interaction perspectives to achieve robust and diverse drug representations. Its outstanding performance across these datasets underscores the potential of interactive attention mechanisms in capturing critical features relevant to DDIs.

Table 1.

Comparison of MSDAFL with other DDI prediction methods on training, validation, and testing sets in a ratio of 6:2:2.a

Dataset | Metric | MR-GNN | GCN-BMP | EPGCN-DS | DeepDrug | MIRACLE | SSI-DDI | CSGNN | DeepDDS | DSN-DDI | HTCL-DDI | MSDAFL
ZhangDDI | AUROC | 0.9618 ± 0.0025 | 0.8442 ± 0.0121 | 0.9083 ± 0.0066 | 0.9535 ± 0.0020 | 0.9644 ± 0.0035 | 0.9314 ± 0.0029 | 0.9171 ± 0.0009 | 0.9320 ± 0.0023 | 0.9113 ± 0.0015 | 0.9858 ± 0.0021 | 0.9874 ± 0.0011
ZhangDDI | AP | 0.9263 ± 0.0030 | 0.8020 ± 0.0157 | 0.8896 ± 0.0088 | 0.9233 ± 0.0023 | 0.9309 ± 0.0053 | 0.9209 ± 0.0039 | 0.8902 ± 0.0073 | 0.9208 ± 0.0031 | 0.8642 ± 0.0030 | 0.9706 ± 0.0038 | 0.9707 ± 0.0018
ZhangDDI | F1 | 0.8293 ± 0.0081 | 0.7186 ± 0.0271 | 0.8007 ± 0.0086 | 0.8289 ± 0.0027 | 0.8516 ± 0.0027 | 0.8196 ± 0.0124 | 0.8360 ± 0.0073 | 0.8279 ± 0.0042 | 0.8768 ± 0.0040 | 0.9219 ± 0.0056 | 0.9005 ± 0.0014
ZhangDDI | ACC | 0.9190 ± 0.0050 | 0.7578 ± 0.0107 | 0.8240 ± 0.0104 | 0.8567 ± 0.0033 | 0.9316 ± 0.0016 | 0.8535 ± 0.0050 | 0.8414 ± 0.0045 | 0.8563 ± 0.0028 | 0.8665 ± 0.0046 | 0.9659 ± 0.0024 | 0.9533 ± 0.0026
ChCh-Miner | AUROC | 0.9311 ± 0.0036 | 0.7865 ± 0.0056 | 0.9423 ± 0.0071 | 0.9838 ± 0.0010 | 0.9620 ± 0.0079 | 0.9809 ± 0.0014 | 0.9768 ± 0.0010 | 0.9710 ± 0.0018 | 0.9669 ± 0.0020 | 0.9906 ± 0.0015 | 0.9934 ± 0.0031
ChCh-Miner | AP | 0.9595 ± 0.0019 | 0.8631 ± 0.0054 | 0.9680 ± 0.0040 | 0.9916 ± 0.0005 | 0.9950 ± 0.0011 | 0.9897 ± 0.0006 | 0.9756 ± 0.0016 | 0.9851 ± 0.0008 | 0.9634 ± 0.0027 | 0.9987 ± 0.0002 | 0.9991 ± 0.0007
ChCh-Miner | F1 | 0.8813 ± 0.0072 | 0.8087 ± 0.0092 | 0.8941 ± 0.0066 | 0.9467 ± 0.0026 | 0.9455 ± 0.0066 | 0.9398 ± 0.0034 | 0.9247 ± 0.0022 | 0.9221 ± 0.0063 | 0.8812 ± 0.0064 | 0.9748 ± 0.0019 | 0.9932 ± 0.0044
ChCh-Miner | ACC | 0.8503 ± 0.0062 | 0.7307 ± 0.0080 | 0.8664 ± 0.0098 | 0.9318 ± 0.0035 | 0.9077 ± 0.0011 | 0.9219 ± 0.0048 | 0.9254 ± 0.0017 | 0.9038 ± 0.0064 | 0.8889 ± 0.0042 | 0.9561 ± 0.0032 | 0.9830 ± 0.0012
DeepDDI | AUROC | 0.9335 ± 0.0017 | 0.7719 ± 0.0063 | 0.8593 ± 0.0024 | 0.9174 ± 0.0014 | 0.9276 ± 0.0038 | 0.9179 ± 0.0048 | 0.9401 ± 0.0025 | 0.9438 ± 0.0063 | 0.9322 ± 0.0010 | 0.9449 ± 0.0020 | 0.9974 ± 0.0011
DeepDDI | AP | 0.9456 ± 0.0009 | 0.8170 ± 0.0060 | 0.8872 ± 0.0012 | 0.9299 ± 0.0018 | 0.9677 ± 0.0018 | 0.9347 ± 0.0044 | 0.9417 ± 0.0030 | 0.9568 ± 0.0056 | 0.9287 ± 0.0015 | 0.9741 ± 0.0010 | 0.9987 ± 0.0012
DeepDDI | F1 | 0.9007 ± 0.0049 | 0.8010 ± 0.0026 | 0.8486 ± 0.0038 | 0.8939 ± 0.0009 | 0.9354 ± 0.0070 | 0.8823 ± 0.0049 | 0.8601 ± 0.0063 | 0.9127 ± 0.0054 | 0.8560 ± 0.0015 | 0.9478 ± 0.0027 | 0.9911 ± 0.0043
DeepDDI | ACC | 0.8754 ± 0.0043 | 0.7294 ± 0.0049 | 0.8022 ± 0.0039 | 0.8628 ± 0.0012 | 0.9033 ± 0.0098 | 0.8538 ± 0.0059 | 0.8633 ± 0.0036 | 0.8887 ± 0.0068 | 0.8541 ± 0.0009 | 0.9208 ± 0.0037 | 0.9866 ± 0.0024
a

The superior results are emphasized in bold, while the second-best results are underlined.


In addition, we compare MSDAFL and the baseline methods on the three datasets with training, validation, and testing sets in a ratio of 8:1:1. The results are shown in Table 2. The MSDAFL model demonstrates improved performance on the ZhangDDI dataset, while for the ChCh-Miner and DeepDDI datasets its performance remains stable across all four metrics. This stability highlights the strength of our model’s self-attention and interactive attention mechanisms. In contrast, the HTCL-DDI model shows a decline in performance across all datasets, particularly on DeepDDI. This decline underscores the limitation of approaches that rely on existing drug interaction networks, in which drugs are nodes and their interactions are edges, and which can negatively impact drug interaction prediction. Our model outperforms the others across all three datasets, indicating superior predictive performance. Furthermore, to assess the generalization performance of MSDAFL, we conduct cross-dataset experiments on the three datasets: ZhangDDI, DeepDDI, and ChCh-Miner. The experimental results of MSDAFL and HTCL-DDI are presented in Supplementary Tables S2 and S3, respectively. Compared with the previous experiments on datasets of varying scales, the prediction performance of both MSDAFL and HTCL-DDI declines. Notably, when DeepDDI is used as the training set and ChCh-Miner as the test set, the AUROC reaches 0.8227, indicating that the model retains reasonable generalization ability.

Table 2.

Comparison of MSDAFL with other DDI prediction methods on training, validation, and testing sets in a ratio of 8:1:1.a

| Dataset | Metric | MR-GNN | GCN-BMP | EPGCN-DS | DeepDrug | MIRACLE | SSI-DDI | CSGNN | DeepDDS | DSN-DDI | HTCL-DDI | MSDAFL |
| ZhangDDI | AUROC | 0.9434 ± 0.0015 | 0.8512 ± 0.0159 | 0.9043 ± 0.0018 | 0.9477 ± 0.0009 | 0.8914 ± 0.0021 | 0.9279 ± 0.0025 | 0.9871 ± 0.0012 | 0.9212 ± 0.0034 | 0.7113 ± 0.0035 | 0.9882 ± 0.0031 | 0.9912 ± 0.0012 |
| | AP | 0.9133 ± 0.0052 | 0.8220 ± 0.0201 | 0.8996 ± 0.0065 | 0.9351 ± 0.0042 | 0.9312 ± 0.0043 | 0.8913 ± 0.0074 | 0.9712 ± 0.0087 | 0.9031 ± 0.0012 | 0.6756 ± 0.0045 | 0.9514 ± 0.0048 | 0.9907 ± 0.0033 |
| | F1 | 0.8463 ± 0.0074 | 0.7086 ± 0.0154 | 0.7958 ± 0.0046 | 0.8473 ± 0.0012 | 0.9282 ± 0.0017 | 0.8412 ± 0.0034 | 0.8731 ± 0.0034 | 0.8412 ± 0.0075 | 0.6712 ± 0.0031 | 0.9219 ± 0.0056 | 0.9405 ± 0.0014 |
| | ACC | 0.8884 ± 0.0099 | 0.7675 ± 0.0098 | 0.8212 ± 0.0064 | 0.8753 ± 0.0055 | 0.9391 ± 0.0014 | 0.8132 ± 0.0062 | 0.8992 ± 0.0014 | 0.8812 ± 0.0034 | 0.6513 ± 0.0034 | 0.9568 ± 0.0031 | 0.9553 ± 0.0012 |
| ChCh-Miner | AUROC | 0.9451 ± 0.0009 | 0.7762 ± 0.0081 | 0.9015 ± 0.0064 | 0.9902 ± 0.0020 | 0.9540 ± 0.0012 | 0.9809 ± 0.0014 | 0.9912 ± 0.0035 | 0.9717 ± 0.0073 | 0.9218 ± 0.0032 | 0.9836 ± 0.0021 | 0.9964 ± 0.0021 |
| | AP | 0.9605 ± 0.0069 | 0.8351 ± 0.0071 | 0.9590 ± 0.0013 | 0.9874 ± 0.0015 | 0.9810 ± 0.0023 | 0.9897 ± 0.0006 | 0.9831 ± 0.0034 | 0.9881 ± 0.0012 | 0.9112 ± 0.0032 | 0.9931 ± 0.0012 | 0.9988 ± 0.0009 |
| | F1 | 0.9023 ± 0.0122 | 0.7187 ± 0.0054 | 0.8741 ± 0.0046 | 0.9323 ± 0.0051 | 0.9712 ± 0.0062 | 0.9398 ± 0.0034 | 0.9271 ± 0.0022 | 0.9331 ± 0.0051 | 0.8432 ± 0.0014 | 0.9701 ± 0.0021 | 0.9911 ± 0.0021 |
| | ACC | 0.8653 ± 0.0058 | 0.7543 ± 0.0019 | 0.8061 ± 0.0074 | 0.9216 ± 0.0071 | 0.9534 ± 0.0032 | 0.9219 ± 0.0048 | 0.8912 ± 0.0064 | 0.9151 ± 0.0043 | 0.8465 ± 0.0012 | 0.9513 ± 0.0013 | 0.9850 ± 0.0022 |
| DeepDDI | AUROC | 0.9402 ± 0.0041 | 0.7412 ± 0.0085 | 0.8393 ± 0.0054 | 0.9062 ± 0.0043 | 0.8965 ± 0.0023 | 0.9429 ± 0.0025 | 0.9531 ± 0.0035 | 0.9056 ± 0.0019 | 0.7412 ± 0.0031 | 0.9152 ± 0.0022 | 0.9954 ± 0.0014 |
| | AP | 0.9514 ± 0.0065 | 0.8023 ± 0.0054 | 0.8566 ± 0.0066 | 0.9444 ± 0.0045 | 0.9471 ± 0.0044 | 0.9213 ± 0.0074 | 0.9411 ± 0.0030 | 0.9217 ± 0.0085 | 0.7213 ± 0.0041 | 0.8921 ± 0.0014 | 0.9937 ± 0.0022 |
| | F1 | 0.9052 ± 0.0053 | 0.7745 ± 0.0056 | 0.8214 ± 0.0061 | 0.8639 ± 0.0036 | 0.9352 ± 0.0065 | 0.8712 ± 0.0034 | 0.8355 ± 0.0019 | 0.8951 ± 0.0064 | 0.6060 ± 0.0015 | 0.8828 ± 0.0008 | 0.9821 ± 0.0023 |
| | ACC | 0.8873 ± 0.0074 | 0.6444 ± 0.0056 | 0.7023 ± 0.0059 | 0.8021 ± 0.0058 | 0.9042 ± 0.0086 | 0.8632 ± 0.0062 | 0.8413 ± 0.0023 | 0.8552 ± 0.0012 | 0.6634 ± 0.0017 | 0.8694 ± 0.0043 | 0.9812 ± 0.0024 |
a The superior results are emphasized in bold, while the second-best results are underlined.


3.4 Ablation experiment

The outstanding performance of MSDAFL stems from three carefully designed strategies: the cross-attention mechanism between drug pairs, normalization of the interaction matrix, and the self-attention mechanism with cosine similarity. To ascertain the efficacy of each component, we conducted ablation experiments on the ZhangDDI dataset across these three configurations, using training, validation, and testing sets in a ratio of 6:2:2. The results, shown in Supplementary Fig. S1, demonstrate the effectiveness of the proposed modules.
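As a minimal, hypothetical sketch of the cosine-similarity component named above (not the authors' implementation), the interaction matrix between the self-attention outputs of a drug pair can be computed by row-normalizing both feature matrices and taking their inner products:

```python
import numpy as np

def cosine_interaction(hx, hy, eps=1e-8):
    """Pairwise cosine similarity between the substructure feature
    matrices of a drug pair (each row is one substructure vector)."""
    hx = hx / (np.linalg.norm(hx, axis=1, keepdims=True) + eps)
    hy = hy / (np.linalg.norm(hy, axis=1, keepdims=True) + eps)
    return hx @ hy.T  # shape: (n_substructures_x, n_substructures_y)

# Toy drug pair: 3 and 4 substructure vectors of dimension 8.
rng = np.random.default_rng(0)
sim = cosine_interaction(rng.normal(size=(3, 8)), rng.normal(size=(4, 8)))
```

Each entry of `sim` lies in [-1, 1] and scores how strongly one substructure of the first drug aligns with one substructure of the second, which is the kind of signal the ablated cosine-similarity strategy contributes.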

3.5 Parameter sensitivity

To investigate the influence of crucial parameters on prediction performance, we systematically vary them and assess their impact on the MSDAFL model's efficacy using the ZhangDDI dataset, with training, validation, and testing sets in a ratio of 6:2:2. We analyze the training batch size, the parameter λ, and the number of GIN layers. Holding all other parameters constant, we explore how each setting affects performance, as illustrated in Supplementary Fig. S2. The model performs optimally when the batch size is set to 512, λ to 0.75, and the number of GIN layers to 5.
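Such a sweep can be organized as a plain grid search over the three parameters. The sketch below is illustrative only: the grid values and the surrogate objective are hypothetical stand-ins (the surrogate simply peaks at the reported optimum of batch size 512, λ = 0.75, and 5 GIN layers), whereas a real sweep would train and validate MSDAFL for each configuration.

```python
import math
from itertools import product

# Hypothetical search grid for the three parameters under study.
batch_sizes = [128, 256, 512, 1024]
lambdas = [0.25, 0.50, 0.75, 1.00]
gin_layers = [3, 4, 5, 6]

def surrogate_auroc(batch_size, lam, n_layers):
    # Stand-in for a full training/validation run; constructed to
    # peak at the optimum reported in the paper (512, 0.75, 5).
    return (1.0
            - abs(math.log2(batch_size) - 9) / 10
            - abs(lam - 0.75)
            - abs(n_layers - 5) / 10)

# Evaluate every configuration and keep the best one.
best_score, best_cfg = max(
    (surrogate_auroc(bs, lam, nl), (bs, lam, nl))
    for bs, lam, nl in product(batch_sizes, lambdas, gin_layers)
)
```

Varying one parameter at a time while fixing the others, as done in the paper, corresponds to evaluating only the axis-aligned slices of this grid through the best configuration.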

3.6 Case study

To assess the practical utility of MSDAFL in real-world scenarios, we analyze clinical studies of four drug pairs and evaluate MSDAFL's prediction outcomes for them on the ZhangDDI testing set, as depicted in Supplementary Fig. S3. The analysis of these four drug pairs confirms the strong performance of the MSDAFL model in predicting DDIs.

4 Discussion and conclusion

In this work, we introduce a molecular substructure-based dual attention feature learning framework for predicting DDIs. The framework integrates multiple attention mechanisms: a self-attention encoder extracts substructure representations from individual drugs, and a cosine similarity matrix is computed between the feature matrices of drug pairs. In the interactive attention encoding stage, an interactive attention mechanism measures the strength of interactions between the substructures of drug pairs, followed by normalization of the interactive feature matrix. Extensive experiments on three public datasets evaluate the efficacy of our MSDAFL model and assess the contributions of its various modules. The findings establish MSDAFL as a robust and promising tool for predicting DDIs, contributing to medication safety and drug side-effect research. Our study can be further advanced in three key directions: (i) integrating heterogeneous biomedical information to augment representation learning, (ii) extending MSDAFL to more complex and practical application scenarios, and (iii) validating selected DDI predictions with wet-lab experiments.

Supplementary data

Supplementary data are available at Bioinformatics online.

Conflict of interest

None declared.

Funding

This work was supported in part by the National Natural Science Foundation of China (62473149, 61962050, and 62072473), Natural Science Foundation of Hunan Province of China (2022JJ30428) and Excellent youth funding of Hunan Provincial Education Department (22B0372).

Data availability

Our code and data are available at: https://github.com/27167199/MSDAFL.

References

Cao X, Fan R, Zeng W. DeepDrug: a general graph-based deep learning framework for drug relation prediction. bioRxiv 2020, preprint: not peer reviewed.

Chen X, Liu X, Wu J. GCN-BMP: investigating graph representation learning for DDI prediction task. Methods 2020;179:47–54.

Gilmer J, Schoenholz SS, Riley PF et al. Neural message passing for quantum chemistry. In: International Conference on Machine Learning, Sydney, NSW, Australia: PMLR, 2017, 1263–1272.

Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Chia Laguna Resort, Sardinia, Italy, JMLR Workshop and Conference Proceedings, 2010, 249–256.

Kingma DP, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, arXiv:1412.6980, 2014, preprint: not peer reviewed.

Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907, 2016, preprint: not peer reviewed.

Li J, Miao B, Wang S et al.; Hiplot Consortium. Hiplot: a comprehensive and easy-to-use web service for boosting publication-ready biomedical data visualization. Brief Bioinform 2022;23:bbac261.

Li P, Huang C, Fu Y et al. Large-scale exploration and analysis of drug combinations. Bioinformatics 2015;31:2007–16.

Li Z, Zhu S, Shao B et al. DSN-DDI: an accurate and generalized framework for drug–drug interaction prediction by dual-view representation learning. Brief Bioinform 2023;24:bbac597.

Lin X, Quan Z, Wang Z-J et al. KGNN: knowledge graph neural network for drug-drug interaction prediction. IJCAI 2020;380:2739–2745.

Ma T, Xiao C, Zhou J et al. Drug similarity integration through attentive multi-view graph auto-encoders. arXiv:1804.10850, 2018, preprint: not peer reviewed.

Mei S, Zhang K. A machine learning framework for predicting drug–drug interactions. Sci Rep 2021;11:17619.

Nyamabo AK, Yu H, Shi J-Y. SSI–DDI: substructure–substructure interactions for drug–drug interaction prediction. Brief Bioinform 2021;22:bbab133.

Shao L, Zhang B. Traditional Chinese medicine network pharmacology: theory, methodology and application. Chin J Nat Med 2013;11:110–20.

Sun M, Wang F, Elemento O et al. Structure-based drug-drug interaction detection via expressive graph convolutional networks and deep sets (student abstract). AAAI 2020;34:13927–8.

Sun W, Sanderson PE, Zheng W. Drug combination therapy increases successful drug repositioning. Drug Discov Today 2016;21:1189–95.

Takeda T, Hao M, Cheng T et al. Predicting drug–drug interactions through drug structural similarities and interaction networks incorporating pharmacokinetics and pharmacodynamics knowledge. J Cheminform 2017;9:16.

Velickovic P, Cucurull G, Casanova A et al. Graph attention networks. Stat 2017;1050:10.48550.

Vilar S, Uriarte E, Santana L et al. Detection of drug-drug interactions by modeling interaction profile fingerprints. PLoS One 2013;8:e58321.

Vilar S, Uriarte E, Santana L et al. Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat Protoc 2014;9:2147–63.

Wang J, Liu X, Shen S et al. DeepDDS: deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief Bioinform 2022;23:bbab390.

Wang Y, Min Y, Chen X et al. Multi-view graph contrastive representation learning for drug-drug interaction prediction. In: Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 2021, 2921–2933.

Xu K, Hu W, Leskovec J et al. How powerful are graph neural networks? arXiv:1810.00826, 2018, preprint: not peer reviewed.

Xu N, Wang P, Chen L et al. MR-GNN: multi-resolution and dual graph neural network for predicting structured entity interactions. arXiv:1905.09558, 2019, preprint: not peer reviewed.

Yan C, Duan G, Zhang Y et al. Predicting drug-drug interactions based on integrated similarity and semi-supervised learning. IEEE/ACM Trans Comput Biol Bioinform 2020;19:168–79.

Zhang R, Wang X, Wang P et al. HTCL-DDI: a hierarchical triple-view contrastive learning framework for drug–drug interaction prediction. Brief Bioinform 2023;24:bbad324.

Zhang W, Chen Y, Liu F et al. Predicting potential drug-drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinformatics 2017;18:18.

Zhao C, Liu S, Huang F et al. CSGNN: contrastive self-supervised graph neural network for molecular interaction prediction. In: Proceedings of the 30th International Joint Conference on Artificial Intelligence, IJCAI, Montreal, Canada, 2021, 3756–63.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Jianlin Cheng