iGRLDTI: an improved graph representation learning method for predicting drug–target interactions over heterogeneous biological information network

Table 1.

Comparison with state-of-the-art models on the benchmark dataset.

Metrics	DTINet	EEG-DTI	NeoDTI	IMCHGAN	MultiDTI	iGRLDTI
AUC	0.916	0.953	0.957	0.957	0.961	0.965
AUPR	0.939	0.964	0.945	0.904	0.947	0.967
F1-score	0.091	0.828	0.810	0.892	0.868	0.899

Two factors account for the promising performance of iGRLDTI. On the one hand, it employs the biological knowledge of drugs and proteins to enrich the content of the HBIN, and learns their feature vectors from this knowledge as the initial representations. On the other hand, it adopts the NDLS strategy to decide a node-specific propagation depth during representation learning, thus alleviating the impact of the over-smoothing issue. The compared algorithms, by contrast, cannot easily adjust the depth of neighbor information aggregation for each node, and accordingly the representations of drugs and targets learned by their GNN models are less discriminative.

Another point worth noting is that some of the compared algorithms rank differently in terms of AUC and AUPR. After an in-depth analysis, we attribute this mainly to the introduction of more heterogeneous information. Take NeoDTI as an example, whose AUC ranks third. iGRLDTI only makes use of the chemical structures of drugs, the protein sequences of targets, and their interactions to compose an HBIN as its input, whereas NeoDTI integrates the structural similarity network of drugs, the sequence similarity network of targets, and different kinds of associations, such as drug–drug interactions, protein–protein interactions, and drug–disease associations, to construct a heterogeneous network as its input. Hence, the main difference between the inputs of iGRLDTI and NeoDTI is that DTIs are the only kind of association considered by iGRLDTI. Nevertheless, iGRLDTI still outperforms NeoDTI by 0.8% in AUC, 2.2% in AUPR, and 8.9% in F1-score, suggesting that it may not be necessary to include so many kinds of associations, as the heterogeneous information they provide could degrade the performance by confusing the classifiers to a certain extent. Consequently, we have reason to believe that, rather than incorporating more kinds of associations, our work provides an alternative way to improve the accuracy of DTI prediction by alleviating the over-smoothing issue.

Moreover, in practice the DTI prediction problem is more reasonably formulated as an imbalanced classification problem. For this reason, additional experiments have been conducted to evaluate the performance of iGRLDTI on an imbalanced dataset, where the ratio between positive and negative samples is set to 1:10 following Wan et al. (2019). The experimental results of 10-fold CV are presented in Table 2. iGRLDTI yields the best performance in terms of AUC and AUPR. Its F1-score ranks second and is only 0.7% lower than that of MultiDTI. Among all evaluation metrics, however, particular attention should be paid to AUPR, which is a suitable indicator for imbalanced datasets (Johnson et al. 2012). In terms of AUPR, iGRLDTI performs better by 14.1%, 25.5%, 5.9%, 3.8%, and 1.0% than DTINet, EEG-DTI, NeoDTI, IMCHGAN, and MultiDTI, respectively. Hence, we have reason to believe that iGRLDTI is a promising DTI prediction tool when applied to imbalanced datasets in real cases.
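The 1:10 imbalanced setting can be illustrated with a minimal sketch of negative sampling. It is not the authors' code: the function name `sample_negatives`, the toy identifiers, and the assumption that every unlabeled drug–target pair is a valid negative candidate are ours.

```python
import random

def sample_negatives(drugs, targets, positives, ratio=10, seed=0):
    """Draw ratio * len(positives) drug-target pairs that are not known DTIs."""
    positive_set = set(positives)
    # Assumption: every pair without a known interaction is a candidate negative.
    candidates = [(d, t) for d in drugs for t in targets
                  if (d, t) not in positive_set]
    n_needed = ratio * len(positive_set)
    if n_needed > len(candidates):
        raise ValueError("not enough unlabeled pairs for the requested ratio")
    rng = random.Random(seed)          # fixed seed for reproducible sampling
    return rng.sample(candidates, n_needed)

# Toy identifiers (placeholders, not real DrugBank or UniProt entries).
drugs = [f"D{i}" for i in range(6)]
targets = [f"T{j}" for j in range(6)]
positives = [("D0", "T0"), ("D1", "T3"), ("D4", "T2")]

negatives = sample_negatives(drugs, targets, positives, ratio=10)
print(len(positives), len(negatives))
```

With three positives, the sketch returns thirty distinct unlabeled pairs, which gives the 1:10 ratio used in this experiment.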

Table 2.

Comparison with state-of-the-art models under imbalanced samples.

Metrics	DTINet	EEG-DTI	NeoDTI	IMCHGAN	MultiDTI	iGRLDTI
AUC	0.916	0.953	0.946	0.929	0.967	0.986
AUPR	0.786	0.672	0.854	0.603	0.917	0.927
F1-score	0.093	0.813	0.772	0.754	0.828	0.821

3.3 Ablation study

To study the impacts of biological knowledge and the over-smoothing issue on the performance of iGRLDTI, we develop two variants of iGRLDTI, i.e. iGRLDTI-A and iGRLDTI-G. Specifically, iGRLDTI-A only takes into account the biological knowledge of drugs and targets, i.e. drug molecule structures and protein sequences, while iGRLDTI-G learns the feature representations of drugs and targets with a classical GNN model as described by Equation (1). Both variants use the GBDT classifier with the same hyper-parameter setting to predict novel DTIs. The experimental results of 10-fold CV are presented in Fig. 2A, and the ROC and PR curves of iGRLDTI-A, iGRLDTI-G, and iGRLDTI are presented in Fig. 2B and C, from which several observations can be made.

Figure 2.

(A) Experimental results of iGRLDTI-A, iGRLDTI-G, and iGRLDTI. (B) The ROC curves are obtained by two variants of iGRLDTI over the benchmark datasets in the ablation study. (C) The PR curves are obtained by two variants of iGRLDTI over the benchmark datasets in the ablation study

First, iGRLDTI-A achieves the worst performance when compared with iGRLDTI-G and iGRLDTI. In this regard, it is difficult to build an accurate prediction model for discovering novel DTIs by considering only the biological knowledge of drugs and targets. Second, iGRLDTI-G performs better than iGRLDTI-A. Under 10-fold CV, iGRLDTI-G achieves an average relative gain of 9.3% in AUC and 8.4% in AUPR on the benchmark dataset when compared with iGRLDTI-A. Hence, the aggregation of neighborhood information through the topological structure of the HBIN enhances the expressiveness of X, the representation matrix of drugs and targets. Last, Fig. 2A shows that iGRLDTI outperforms iGRLDTI-G by 3.4%, 3.1%, and 4.6% in terms of AUC, AUPR, and F1-score, respectively; this further improvement is obtained by addressing the over-smoothing issue. Accordingly, the resulting representations of drugs and targets are more distinguishable than those learned by iGRLDTI-A and iGRLDTI-G.

To investigate the impact of such information loss, a new variant of iGRLDTI, i.e. iGRLDTI-D, is implemented. The only difference between iGRLDTI and iGRLDTI-D is that iGRLDTI-D simply uses the one-hot encoding of amino acids, rather than their categories, to compose the feature vectors of 3-mers. The performance of iGRLDTI-D is presented in Fig. 2B and C, and we note that iGRLDTI yields relative improvements of 1.7% and 1.3% in terms of AUC and AUPR, respectively, when compared with iGRLDTI-D. The use of amino acid categories thus allows iGRLDTI to compose the feature vectors of 3-mers in a more compact manner without much redundant information, and accordingly, iGRLDTI performs better than iGRLDTI-D.
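To make the compactness argument concrete, the following sketch compares the dimensionality of 3-mer features built over amino acid categories with that of 3-mer features built over the 20 individual residues. The four-group partition by side-chain polarity, the function names, and the toy sequence are assumptions for illustration; this section does not state the grouping actually used by iGRLDTI, and the residue-level encoder is only a stand-in for the iGRLDTI-D setting.

```python
from itertools import product

# Hypothetical four-group partition by side-chain polarity (an assumption;
# the grouping used by iGRLDTI is not given in this section).
GROUPS = {
    "A": "AVLIMFWP",   # non-polar
    "B": "GSTCNQY",    # polar, uncharged
    "C": "RKH",        # positively charged
    "D": "DE",         # negatively charged
}
AA_TO_GROUP = {aa: g for g, members in GROUPS.items() for aa in members}

def kmer_frequencies(sequence, alphabet, k=3):
    """Normalized k-mer counts over `alphabet`, ordered lexicographically."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    total = 0
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:              # skip k-mers with unknown symbols
            counts[index[kmer]] += 1
            total += 1
    return [c / total for c in counts] if total else counts

def encode_by_category(protein):
    """3-mer features over the reduced category alphabet (4^3 dimensions)."""
    reduced = "".join(AA_TO_GROUP.get(aa, "X") for aa in protein)
    return kmer_frequencies(reduced, "ABCD")

def encode_by_residue(protein):
    """3-mer features over the 20 individual residues (20^3 dimensions)."""
    return kmer_frequencies(protein, "ACDEFGHIKLMNPQRSTVWY")

seq = "MKTAYIAKQRQISFVKSHFSRQ"   # toy sequence, not a real target
print(len(encode_by_category(seq)), len(encode_by_residue(seq)))
```

Under these assumptions the category-based vector has 64 entries while the residue-level vector has 8000, most of which are zero for a single protein, which is one way to read the "redundant information" mentioned above.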

3.4 Over-smoothing analysis

In the context of deep learning, smoothness is normally used to indicate the similarity across the embedding vectors of nodes. The more similar the embedding vectors of nodes are, the less discriminative the features extracted from them. As the number of GNN layers increases, node representations become more similar, which leads to the over-smoothing issue. An HBIN is no exception, and the over-smoothing issue could degrade the performance of DTI prediction. To quantitatively measure the degree of over-smoothing, we additionally adopt a frequently used evaluation metric, i.e. Mean Average Distance (MAD) (Chen et al. 2020), which computes the average distance between node representations and takes values within the range [0, 1]. The MAD values are reported in Table 3, so that the degree of over-smoothing discussed above can be measured quantitatively and reproduced. A smaller MAD score indicates that a model encounters a more severe over-smoothing issue, whereas a higher MAD score indicates that the learned representations are more discriminative. Table 3 shows that the MAD score of iGRLDTI is considerably larger than that of iGRLDTI-G. This could be a strong indicator that the representations of drugs and targets learned by iGRLDTI exhibit more discriminative features, and that the observation in Fig. 3B holds broadly across drug–target pairs. Since iGRLDTI is able to adaptively adjust the propagation depth for each node during representation learning, the impact of the over-smoothing issue is substantially weakened, thus improving the accuracy of DTI prediction.
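As an illustration of the metric, the sketch below computes a global MAD score with cosine distance over all node pairs, averaging the non-zero distances per node and then over nodes. The function names and the toy matrices are ours, and the handling of zero vectors is an assumed convention; the exact masking used for Table 3 is not specified in this section.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as distance 0 (assumed)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_average_distance(H, tol=1e-12):
    """Average the non-zero pairwise distances per node, then over nodes."""
    node_means = []
    for i, h_i in enumerate(H):
        dists = [cosine_distance(h_i, h_j) for j, h_j in enumerate(H) if j != i]
        nonzero = [d for d in dists if d > tol]   # tol absorbs rounding error
        if nonzero:
            node_means.append(sum(nonzero) / len(nonzero))
    return sum(node_means) / len(node_means) if node_means else 0.0

# Toy representation matrices (one row per node).
smoothed = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]   # parallel rows
distinct = [[1.0, 0.0], [0.0, 1.0]]               # orthogonal rows
print(mean_average_distance(smoothed), mean_average_distance(distinct))
```

Parallel rows, which mimic fully over-smoothed representations, give a score at the bottom of the range, whereas orthogonal rows give the top of the [0, 1] range that applies to non-negative representations.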

Figure 3.

(A) The values of AUC, AUPR, and F1-score obtained by iGRLDTI under 100 rounds of 10-fold cross-validation. (B) Distributions of the DB01110 and P54284 representation vectors learned from iGRLDTI and iGRLDTI-G.

Table 3.

Comparison of MAD values between iGRLDTI and iGRLDTI-G with different propagation depths.

Model	iGRLDTI	iGRLDTI-G	iGRLDTI-G	iGRLDTI-G	iGRLDTI-G	iGRLDTI-G	iGRLDTI-G
Propagation depth k	node-specific	1	2	4	10	100	200
MAD ($10^{-2}$)	0.399	0.161	0.103	0.102	0.09	0.08	0.06

All numerical values in Table 3 are in the order of $10^{-2}$.

Regarding the propagation depth k, we observe that the MAD values of iGRLDTI-G become smaller as k increases, i.e. the over-smoothing phenomenon becomes more serious. In Table 3, the MAD value of iGRLDTI is higher than that of iGRLDTI-G even with k = 1. In other words, node representations learned by iGRLDTI-G with k = 1 are smoother than those learned by iGRLDTI, which adopts the NDLS strategy to alleviate the over-smoothing issue. The smaller MAD value obtained by iGRLDTI-G with k = 1 is caused by insufficient information transfer during message propagation. iGRLDTI, in contrast, alleviates the over-smoothing issue by adaptively adjusting the propagation depth for each node during representation learning. In doing so, iGRLDTI can reach a local-smoothing state of the node features within the HBIN, and thereby improve the ability of the model in the task of DTI prediction.
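The idea of a node-specific propagation depth can be sketched on a toy graph as follows. This is a simplified illustration loosely following node-dependent local smoothing, not the authors' implementation: the symmetric normalization with self-loops, the threshold `eps`, the cap `max_k`, the uniform averaging over depths 0 to k_i, and all function names are assumptions.

```python
import math

def normalized_adjacency(edges, n):
    """Symmetric normalization of A + I for an undirected graph on n nodes."""
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1.0
    deg = [sum(row) for row in A]
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    return A_hat, deg

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def node_specific_depths(edges, n, eps=0.05, max_k=50):
    """Smallest k per node at which row i of A_hat^k is within eps of its limit."""
    A_hat, deg = normalized_adjacency(edges, n)
    total = sum(deg)
    # Limit of A_hat^k on a connected graph: sqrt(d_i * d_j) / sum(d).
    limit = [[math.sqrt(deg[i] * deg[j]) / total for j in range(n)]
             for i in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    depths = [None] * n
    for k in range(max_k + 1):
        for i in range(n):
            if depths[i] is None and math.dist(power[i], limit[i]) < eps:
                depths[i] = k
        if all(d is not None for d in depths):
            break
        power = matmul(power, A_hat)
    return [max_k if d is None else d for d in depths]

def smooth_features(edges, X, depths):
    """Average X_i^(0), ..., X_i^(k_i), where X^(k) = A_hat^k X."""
    n = len(X)
    A_hat, _ = normalized_adjacency(edges, n)
    sums = [list(row) for row in X]
    current = X
    for k in range(1, max(depths) + 1):
        current = matmul(A_hat, current)
        for i in range(n):
            if k <= depths[i]:          # stop accumulating past the node's depth
                sums[i] = [s + c for s, c in zip(sums[i], current[i])]
    return [[s / (depths[i] + 1) for s in sums[i]] for i in range(n)]

# Toy connected graph and features (placeholders, not an HBIN).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 3)]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]
depths = node_specific_depths(edges, n=5)
print(depths, smooth_features(edges, X, depths)[0])
```

In this sketch each node stops aggregating once its own row of the propagation matrix is close to the stationary state, so different nodes may end up with different depths. A classical GNN with a fixed k, as in iGRLDTI-G, would instead apply the same depth to every node.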

3.5 Robustness analysis

To evaluate the robustness of iGRLDTI, we repeat 10-fold CV for 100 rounds and present the average results of AUC, AUPR, and F1-score obtained by iGRLDTI in Table 1, where iGRLDTI still yields the best performance on the benchmark dataset. Moreover, we draw box plots in Fig. 3A to show both the summary statistics and the distributions of AUC, AUPR, and F1-score over the 100 rounds. Since the variances of AUC, AUPR, and F1-score are as small as 3.72E−06, 4.84E−06, and 2.15E−05, respectively, iGRLDTI also demonstrates promising robustness.

Moreover, we conduct statistical hypothesis tests to assess whether the differences in AUC, AUPR, and F1-score are significant. In particular, we perform the paired Wilcoxon test by comparing iGRLDTI with the other prediction models in terms of AUC, AUPR, and F1-score, and present the results in Table 4. iGRLDTI significantly outperforms the other prediction models at a confidence level of 95% (P-value < .05). This again indicates the advantage of iGRLDTI in DTI prediction.

Table 4.

Comparison of the Paired Wilcoxon test by comparing iGRLDTI with other prediction models.

iGRLDTI versus	DTINet	EEG-DTI	NeoDTI	IMCHGAN	MultiDTI
P-value	0.03662	0.02852	0.03125	0.01242	0.01618

P-value <.05 signifies that the results are statistically significant.
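For readers who wish to run a similar comparison, the following is a minimal sketch of an exact two-sided paired Wilcoxon signed-rank test for small samples. It is not the procedure used to produce Table 4: the function name, the treatment of zero differences (dropped), the average ranking of ties, and the toy scores are assumptions.

```python
from itertools import product

def wilcoxon_signed_rank_exact(x, y):
    """Two-sided exact p-value of the paired Wilcoxon signed-rank test.

    Enumerates all 2^n sign patterns, so it is only practical for small n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank |d| in ascending order; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while (end + 1 < n and
               abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = avg_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis every sign pattern is equally likely.
    null = [sum(r for r, s in zip(ranks, signs) if s)
            for signs in product((0, 1), repeat=n)]
    p_low = sum(w <= w_plus + 1e-12 for w in null) / len(null)
    p_high = sum(w >= w_plus - 1e-12 for w in null) / len(null)
    return min(1.0, 2 * min(p_low, p_high))

# Toy paired scores for two hypothetical models (not results from this study).
model_a = [6, 7, 8, 9, 10, 11]
model_b = [5, 5, 5, 5, 5, 5]
print(wilcoxon_signed_rank_exact(model_a, model_b))
```

The exact enumeration keeps the sketch dependency-free; for larger samples a normal approximation or a library routine would be the more practical choice.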

3.6 Case study

The purpose of our case study is to assess the practical ability of iGRLDTI to identify unknown DTIs. In the case study, all known DTIs in the benchmark dataset, which are collected from DrugBank V3.0, are first taken as positive samples to compose the training dataset. Regarding the negative samples, we randomly pair up drugs and targets whose interactions are not found in the positive samples, such that the number of negative samples in the training dataset is the same as that of positive samples. After that, all drug–target pairs that are not found in the training dataset constitute the testing dataset. The cutoff is set to 0.5, i.e. a drug–target pair is predicted to interact if its prediction score is greater than 0.5. The top 20 pairs in the testing dataset in terms of prediction scores are then selected for further validation, and each of them is checked against the latest version of DrugBank, i.e. V5.0 (Wishart et al. 2018). In other words, the verified drug–target pairs do not exist in DrugBank V3.0, but were later added to DrugBank V5.0 as the database was updated. Following the same procedure, the top 20 drug–target pairs predicted by each compared model are also selected for investigation. The top 20 pairs predicted by iGRLDTI are presented in Table 5. It is worth noting that all of them can be verified by the latest version of the DrugBank database: although these drug–target pairs are not connected when training the iGRLDTI model, they are predicted by iGRLDTI as candidate DTIs and confirmed by DrugBank. Consequently, iGRLDTI performs better than the compared algorithms in discovering unknown DTIs. Taking MultiDTI as an example, only three out of its top 20 pairs have been verified by the DrugBank database, and none of these three verified pairs is ranked in the top 5.

Besides, we also analyze the performance of iGRLDTI and MultiDTI on the task of discovering DTIs for Zonisamide (ID: DB00909), a drug recommended for treating partial seizures (Wilfong and Willmore 2006). For iGRLDTI, all five targets predicted to interact with Zonisamide have been verified by the DrugBank database. For MultiDTI, there are a total of three predicted targets, and none of them could be verified. Hence, this could be a strong indicator that iGRLDTI has a promising performance in discovering novel DTIs when compared with state-of-the-art DTI prediction algorithms.
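The selection and verification steps of the case study can be sketched as follows. The function names, the toy scores, and the placeholder identifiers are ours; the sketch only assumes that prediction scores are available for all test pairs and that a set of later-verified pairs can be looked up.

```python
def top_k_predictions(scores, k=20, cutoff=0.5):
    """Return the k highest-scoring pairs whose score exceeds the cutoff."""
    predicted = [(pair, s) for pair, s in scores.items() if s > cutoff]
    predicted.sort(key=lambda item: item[1], reverse=True)
    return predicted[:k]

def verified_fraction(ranked, verified_pairs):
    """Share of the ranked pairs found in an independent set of verified pairs."""
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in verified_pairs)
    return hits / len(ranked)

# Toy scores; identifiers are placeholders, not real DrugBank or UniProt IDs.
scores = {
    ("drug_a", "prot_x"): 0.98,
    ("drug_b", "prot_y"): 0.91,
    ("drug_a", "prot_z"): 0.75,
    ("drug_c", "prot_x"): 0.52,
    ("drug_c", "prot_y"): 0.39,   # below the cutoff, never reported
}
verified = {("drug_a", "prot_x"), ("drug_a", "prot_z"), ("drug_c", "prot_y")}

ranked = top_k_predictions(scores, k=3)
print(ranked, verified_fraction(ranked, verified))
```

Applying the same two functions to the scores of each compared model would yield the kind of comparison reported in this case study, e.g. the number of verified pairs among the top 20.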

Table 5.

Top 20 predicted results by iGRLDTI.

Rank	Drug ID	Protein ID	Evidence	Rank	Drug ID	Protein ID	Evidence
1	DB00909	O43570	DrugBank	11	DB01268	P17948	DrugBank
2	DB01110	Q14500	DrugBank	12	DB00909	P00918	DrugBank
3	DB00909	Q99250	DrugBank	13	DB00398	P17948	DrugBank
4	DB00594	P19634	DrugBank	14	DB01224	P28335	DrugBank
5	DB01159	P48051	DrugBank	15	DB01268	P09619	DrugBank
6	DB00661	O95180	DrugBank	16	DB00398	P15056	DrugBank
7	DB00594	P19801	DrugBank	17	DB01268	P36888	DrugBank
8	DB01159	O60391	DrugBank	18	DB00398	P04049	DrugBank
9	DB01159	P48549	DrugBank	19	DB01110	Q13936	DrugBank
10	DB00909	O43497	DrugBank	20	DB00909	P21397	DrugBank

Another case study is given to further analyze how iGRLDTI avoids over-smoothing by comparing its performance with that of iGRLDTI-G. As mentioned in the ablation study, iGRLDTI-G is a variant of iGRLDTI that uses a classical GNN model, and hence it is prone to the over-smoothing issue during representation learning. In particular, we note that the interaction between the drug DB01110 and the target protein P54284 is successfully identified by iGRLDTI, but not by iGRLDTI-G, where DB01110 is the DrugBank ID of Miconazole, and P54284 is the UniProt ID of Voltage-dependent L-type calcium channel subunit beta-3. We therefore investigate the prediction scores yielded by iGRLDTI and iGRLDTI-G for this drug–target pair, and find that the prediction score of iGRLDTI, i.e. 0.98, is much larger than that of iGRLDTI-G, i.e. 0.39. In other words, iGRLDTI confidently indicates the interaction between DB01110 and P54284, whereas iGRLDTI-G fails to identify it, as its prediction score is below the cutoff of 0.5.

Assuming that H and $H_{G}$ are the concatenated representation vectors of DB01110 and P54284 learned by iGRLDTI and iGRLDTI-G, respectively, we present their boxplots in Fig. 3B to visualize the difference between H and $H_{G}$ from the distribution perspective. The height of a boxplot, to some extent, indicates the difference among the elements in the corresponding vector. In particular, a short boxplot means that all the elements in a vector are similar to each other, whereas a tall boxplot hints at the differentiation within the vector elements. It is observed from Fig. 3B that the difference among the elements of $H_{G}$ is much smaller than that of H. This could be a strong indicator that the representation vectors learned by iGRLDTI-G still suffer from the over-smoothing issue, and it is for this reason that iGRLDTI-G fails to predict the DTI between DB01110 and P54284. Since iGRLDTI is able to learn more distinguishable representations by alleviating over-smoothing from an alternative view, the accuracy of DTI prediction can thus be improved.

To validate the robustness of the observation in Fig. 3B, we additionally employ interquartile ranges (IQRs) as a measure of dispersion within the vector elements. A smaller IQR corresponds to a shorter boxplot, indicating a higher level of similarity among the vector elements, whereas a larger IQR corresponds to a taller boxplot and hints at the differentiation within the vector elements. We calculate the average IQR of the feature representations learned for all unknown DTIs by iGRLDTI and iGRLDTI-G, obtaining values of 1.11 ± 0.015 and 0.93 ± 0.019, respectively. Consequently, the observation depicted in Fig. 3B is not an isolated incident but rather a common occurrence.
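The interquartile-range comparison used in this case study can be sketched with the Python standard library. The function names and toy vectors are ours, and two details are assumptions: the inclusive quartile method, and reading the reported "±" as a standard deviation over pairs.

```python
import statistics

def interquartile_range(vector):
    """Q3 - Q1 of the vector elements (inclusive quartile method, assumed)."""
    q1, _, q3 = statistics.quantiles(vector, n=4, method="inclusive")
    return q3 - q1

def mean_and_spread_of_iqr(representations):
    """Mean and sample standard deviation of the IQR over several vectors."""
    iqrs = [interquartile_range(v) for v in representations]
    spread = statistics.stdev(iqrs) if len(iqrs) > 1 else 0.0
    return statistics.mean(iqrs), spread

# Toy concatenated drug-target vectors (placeholders, not learned embeddings).
diverse = [[0.1, 0.9, 1.8, 2.7, 3.5], [0.0, 1.0, 2.0, 3.0, 4.0]]
smooth = [[1.0, 1.1, 1.2, 1.1, 1.0], [2.0, 2.0, 2.1, 2.0, 2.0]]

print(mean_and_spread_of_iqr(diverse))
print(mean_and_spread_of_iqr(smooth))
```

Vectors whose elements are nearly identical, as expected under over-smoothing, yield a small IQR and hence a short boxplot, while more varied vectors yield a larger one.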

In sum, these case studies again demonstrate the promising performance of iGRLDTI in discovering new DTIs with more distinguishable representations, and hence it is believed that iGRLDTI could be a useful tool to identify novel DTIs.

4 Conclusion

In this article, an improved graph representation learning method, namely iGRLDTI, is developed to discover novel DTIs over an HBIN. To this end, iGRLDTI first constructs an HBIN by integrating the biological knowledge of drugs and targets with their interactions. Then, iGRLDTI adopts an NDLS strategy to adaptively decide the propagation depth during representation learning, thus enhancing the discriminative ability of the representations of drugs and targets by alleviating over-smoothing. Finally, iGRLDTI employs the GBDT classifier to perform the DTI prediction task. Experimental results demonstrate that iGRLDTI yields a superior performance under 10-fold CV when compared with several state-of-the-art prediction algorithms. Furthermore, our case studies indicate that iGRLDTI is able to learn more distinguishable representations of drugs and targets, and that it is a useful tool to identify novel DTIs.

There are two reasons contributing to the superior performance of iGRLDTI. On the one hand, the construction of HBIN allows iGRLDTI to learn the representations of drugs and targets from multiple views. Due to the rich information carried by HBIN, the task of DTI prediction can be achieved by iGRLDTI in a more effective manner. On the other hand, with the NDLS strategy, iGRLDTI is able to determine the node-specific propagation depth for each biomolecule in HBIN. Consequently, it adaptively controls how much neighborhood information should be gathered to avoid over-smoothness during representation learning.

Besides, we also note two limitations of iGRLDTI. On the one hand, a simple weighted averaging method is applied to update $X_{i}$, and it is difficult for us to differentiate the significance of $X_{i}^{(k)}$ at the k-th layer. On the other hand, not all drugs and targets, especially newly discovered ones, come with the necessary biological knowledge, and hence the prediction performance of iGRLDTI is weakened for drugs and targets without sufficient biological knowledge.

Regarding future work, we would like to pursue four directions. First, we intend to improve the performance of iGRLDTI by proposing solutions to address its limitations. Second, we are interested in evaluating the generalization ability of iGRLDTI by applying it to other prediction problems, such as protein–protein interaction prediction and drug–drug interaction prediction. Third, we would like to investigate the performance of iGRLDTI when integrating more kinds of associations, as it is a challenging task to fully exploit heterogeneous information for improved performance of DTI prediction. Last, we would also like to explore the interpretability of iGRLDTI in order to provide interpretable prediction results (Schulte-Sasse et al. 2021).

Conflict of interest

The authors declare that the research has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding

This work was supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region [2021D01D05], in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences, and in part by the CAS Light of the West Multidisciplinary Team project [xbzg-zdsys-202114].

Data availability

The real data underlying this article are available from https://github.com/stevejobws/iGRLDTI.

References

Bagherian, Sabeti, Wang et al. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief Bioinform 2021;247–.

Ballesteros, Palczewski. G protein-coupled receptor drug discovery: implications from the crystal structure of rhodopsin. Curr Opin Drug Discov Devel 2001;561.

Chen, Lin et al. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. In: Proceedings of the AAAI Conference on Artificial Intelligence, New York, New York, USA, Vol. 34. 2020, 3438–3445.

Chen, Xiao. FastGCN: Fast learning with graph convolutional networks via importance sampling. In: International Conference on Learning Representations, Vancouver, BC, Canada. ICLR, 2018a.

Chen, Zhu, Song. Stochastic training of graph convolutional networks with variance reduction. In: International Conference on Machine Learning, Vancouver, BC, Canada. PMLR, 2018b, 942–950.

D’Souza, Prema, Balaji. Machine learning models for drug–target interactions: current knowledge and future directions. Drug Discov Today 2020;748–.

Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat 2001;1189–232.

Guo, Wen et al. Using support vector machine combined with auto covariance to predict protein–protein interactions from protein sequences. Nucleic Acids Res 2008;3025–.

Hauser, Attwood, Rask-Andersen et al. Trends in GPCR drug discovery: new agents, targets and indications. Nat Rev Drug Discov 2017;829–.

Yang, Luo et al. An algorithm of inductively identifying clusters from attributed graphs. IEEE Trans Big Data 2020;–534.

Huang, Rong et al. Tackling over-smoothing for general graph convolutional networks. arXiv, arXiv:2008.09864, 2020, preprint: not peer reviewed.

Johnson, Chawla, Hellmann JJ. Species distribution modeling and prediction: a class imbalance problem. In: 2012 Conference on Intelligent Data Understanding, Boulder, CO, United States. IEEE, 2012, –.

Keshava Prasad, Goel, Kandasamy et al. Human protein reference database-2009 update. Nucleic Acids Res 2009;D767–.

Kipf, Welling. Semi-supervised classification with graph convolutional networks. In: International Conference on Learning Representations, ICLR, Toulon, France, 2017.

Knox, Law, Jewison et al. DrugBank 3.0: a comprehensive resource for “omics” research on drugs. Nucleic Acids Res 2011;D1035–.

Landrum. Rdkit documentation. Release 2013;

Wang et al. IMCHGAN: inductive matrix completion with heterogeneous graph attention networks for drug-target interactions prediction. IEEE/ACM Trans Comput Biol Bioinform 2022;655–.

Han XM. Deeper insights into graph convolutional networks for semi-supervised learning. In: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, New Orleans, Louisiana, United States. 2018, 3538–3545.

Luo, Zhao, Zhou et al. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 2017;573–.

Mohamed, Ksantini, Kaabi. Convolutional dynamic auto-encoder: a clustering method for semantic images. Neural Comput Appl 2022;17087–105.

Pan et al. Identifying protein complexes from protein-protein interaction networks based on fuzzy clustering and go semantic information. IEEE/ACM Trans Comput Biol Bioinform 2022;2882–.

Peng, Wang, Guan et al. An end-to-end heterogeneous graph representation learning-based framework for drug-target interaction prediction. Brief Bioinform 2021;–.

Peska, Buza, Koller. Drug-target interaction prediction: a Bayesian ranking approach. Comput Methods Programs Biomed 2017;152–.

Phatak, Stephan, Cavasotto CN. High-throughput and in silico screenings in drug discovery. Expert Opin Drug Discov 2009;947–.

Schulte-Sasse, Budach, Hnisz et al. Integration of multiomics data with graph convolutional networks to identify new cancer genes and their associated molecular mechanisms. Nat Mach Intell 2021;513–.

Shen, Zhang, Luo et al. Predicting protein–protein interactions based only on sequences information. Proc Natl Acad Sci USA 2007;104:4337–.

Szklarczyk, Gable, Lyon et al. String v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 2019;D607–.

Veličković, Casanova, Liò et al. Graph attention networks. In: 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, Vancouver, BC, Canada. 2018, –.

Vincent, Larochelle, Bengio et al. Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on Machine learning, Helsinki, Finland. 2008, 1096–1103.

Wan, Hong, Xiao et al. Neodti: neural integration of neighbor information from a heterogeneous network for discovering new drug–target interactions. Bioinformatics 2019;104–.

Wang, Yang, Zhang et al. Drug repositioning by integrating target information through a heterogeneous network model. Bioinformatics 2014;2923–.

Weininger. Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. J Chem Inf Comput Sci 1988;–.