Abstract

Ribonucleic acids (RNAs) play crucial roles in living organisms and some of them, such as bacterial ribosomes and precursor messenger RNA, are targets of small molecule drugs, whereas others, e.g. bacterial riboswitches or viral RNA motifs are considered as potential therapeutic targets. Thus, the continuous discovery of new functional RNA increases the demand for developing compounds targeting them and for methods for analyzing RNA—small molecule interactions. We recently developed fingeRNAt—a software for detecting non-covalent bonds formed within complexes of nucleic acids with different types of ligands. The program detects several non-covalent interactions and encodes them as structural interaction fingerprint (SIFt). Here, we present the application of SIFts accompanied by machine learning methods for binding prediction of small molecules to RNA. We show that SIFt-based models outperform the classic, general-purpose scoring functions in virtual screening. We also employed Explainable Artificial Intelligence (XAI)—the SHapley Additive exPlanations, Local Interpretable Model-agnostic Explanations and other methods to help understand the decision-making process behind the predictive models. We conducted a case study in which we applied XAI on a predictive model of ligand binding to human immunodeficiency virus type 1 trans-activation response element RNA to distinguish between residues and interaction types important for binding. We also used XAI to indicate whether an interaction has a positive or negative effect on binding prediction and to quantify its impact. Our results obtained using all XAI methods were consistent with the literature data, demonstrating the utility and importance of XAI in medicinal chemistry and bioinformatics.

INTRODUCTION

Ribonucleic acids (RNA) are essential bioorganic molecules present in every living organism. Apart from messenger RNA (mRNA), many noncoding RNA (ncRNA) also play a crucial role in the cell, as they build large macromolecular machines, deliver amino acids to ribosomes or regulate different molecular processes, e.g. by acting as binding sites. Although the majority of drugs target proteins [1], only about 1.5% of the human genome consists of protein-coding sequences [2]. On the other hand, there is growing evidence of the functional role of ncRNA and its involvement in a plethora of diseases [3–5]. Also, many nucleic acids, or at least segments/domains within their sequence, possess the ability to adopt tertiary structures and have grooves capable of binding small molecules, making them attractive targets for small molecule drugs [6, 7].

As experimental approaches provide the most valuable data of biological activity for small molecule ligands, their high-throughput usage is restricted due to the tremendous cost, as well as the time- and labor-consuming nature of the process. For this reason, the computational virtual screening technique is widely used, as it allows us to score and compare thousands of potentially binding ligands much faster and cheaper than can be done experimentally. Thus, effective scoring, i.e. predicting their activity (in terms of binding, as biological activity can only be validated experimentally), is the ultimate challenge of rational drug discovery, as it allows for narrowing the number of compounds for further assays. Molecular docking is the classical method used in numerous studies to predict the favored geometric orientation of the receptor-ligand complex and to virtually screen a database of thousands of small molecule compounds to select those with the highest predicted binding activity toward the receptor for further optimization [8, 9].

Numerous computer programs exist for molecular docking. They share common steps of ligand sampling and scoring and have similar drawbacks arising from the fact that more nature-like approximations of receptor-ligand flexible conformation entail a high number of degrees of freedom, thus requiring exhaustive computations. Also, scoring functions used to assess molecular docking results have limited accuracy and may not correctly recognize binding ligands. Recently developed strategies based on machine learning (ML) may alleviate these obstacles and provide new frameworks for predicting active ligands [10, 11]. Also, the recent advance in the field of explainable artificial intelligence (XAI) may facilitate the understanding of decisions made by predictive ML models and support decisions made by medicinal chemists [12, 13].

Molecular fingerprints translate two- or three-dimensional structural information into one-dimensional vectors. These are machine-readable entities suitable for similarity calculation, modeling, data mining or ML (for a recent review on molecular fingerprints, see [14, 15], and for their applications in ML, see [16]). Structural interaction fingerprint (SIFt) is a molecular fingerprint type first described by Deng et al. for analyzing protein-ligand complexes [17]. It is a binary string representing the presence of specific molecular interactions between the receptor's residues and the ligand. SIFts have been used in studying protein-ligand complexes and were successfully applied in small molecule drug discovery: selectivity profiling, new target prediction, elucidating the binding mechanisms or developing new scoring functions [18–20]. SIFts enable filtering virtual chemical libraries to select only compounds with similar interaction interfaces to molecules with experimentally validated biological activities [17, 21, 22]. We recently showed that SIFts can also be successfully applied for the analysis of RNA–ligand complexes: comparing the binding mode in experimentally solved complexes, clustering of docking poses and selecting ligands forming a similar interaction network with the RNA receptor [23].

However, since most virtual screening methods focus on proteins, there is a scarcity of nucleic acids-oriented approaches. The goal of this work was to determine if SIFts combined with ML techniques can be applied to predict the binding activity of small molecules to RNA targets. As there is no widely available and up-to-date database of compounds with known binding activity toward RNA targets, for the purpose of testing virtual screening procedures, we composed a benchmark set based on the literature search, using the methodology previously described for protein and RNA targets [23, 24], freely available at https://github.com/filipsPL/fingernat-ml [25]. To calculate SIFts, we used our recently published open-source nucleic acid-oriented software fingeRNAt [23]. In our analysis, we compared the performance of the SIFt-based method with the state-of-the-art, RNA-oriented scoring functions. Finally, using the example of human immunodeficiency virus type 1 (HIV-1) trans-activation response element (TAR) RNA, we applied XAI methods to elucidate the structural factors determining the ligand binding.

Together, our work is the first described protocol for predicting ligand activity toward RNA receptors with molecular docking and explainable SIFt- and ML-based scoring techniques.

RESULTS AND DISCUSSION

To test and compare the predictive power of various ML methods for RNA—small molecule binding, six RNA targets were selected: (i) HIV-1 TAR, (ii) thiamine pyrophosphate (TPP) riboswitch [26, 27], (iii) flavin mononucleotide (FMN) riboswitch [27, 28], (iv) adenine riboswitch [29], (v) guanine riboswitch [30] and (vi) guanine riboswitch with the C74U mutation (guanine riboswitch with C74U mutation was distinguished as a separate target because of differences in ligand specificity resulting from this nucleotide change [31]). Information on chemical structures and binding properties of small molecule ligands were taken from the literature and experimentally solved structures were deposited in the RCSB Protein Data Bank (PDB). To avoid discrepancies resulting from different experimental techniques used to determine ligand binding, we used a binary class to describe their affinity toward RNA targets: binders, for molecules binding relatively strongly, and non-binders, for those with weak or non-detectable binding. For each RNA target, a structurally diversified set of ligands was prepared. As the information on non-binding ligands is sparse in the literature reports, putative non-binding ligands were generated using a well-established directory of useful decoys: enhanced (DUD-E) methodology ([24]; for the procedure details and several ligands in the datasets, see Materials and Methods).

For all of the investigated targets, multiple experimentally determined structures are deposited in the RCSB PDB. Therefore, for each RNA target, three distinct structures (but with an identical sequence of the binding pocket within 8 Å from the experimental ligand) were selected for further analysis (Table 1 from Materials and Methods). The effect of the number of RNA structures used to build SIFts was examined for the HIV-1 TAR RNA dataset (for up to seven different RNA structures) and three ML methods (namely: CatBoost, kNN and Random Forest). As expected, we observed that as the number of RNA structures increased, so did the performance (see: Supplemental Figure S1). For the Random Forest method, we observed a plateau in performance (AUROC values) after six structures forming the SIFts, whereas for two other methods (kNN and CatBoost) we observed a steady increase in performance. In this work, we used three structures to balance processing time and prediction accuracy.

Table 1

Molecular RNA targets used for benchmarking, PDB IDs of experimentally solved structures used for docking and SIFts calculation, the number of ligands collected from the literature and the final composition of ligand sets after adding decoys.

RNA targetPDB IDsNumber of collected ligandsFinal dataset (with DUD-E generated decoys)
BindersNon-binders
Adenine riboswitch1Y26, 4TZX, 4XNR187361
FMN riboswitch2YIE, 5KX9, 6BFB3015750
Guanine riboswitch3GAO, 3GER, 4FE579301510
Guanine riboswitch with C74U mutation2XNW, 2XNZ, 2XO12514714
HIV-1 TAR1UTS, 1UUI, 6MCE139301478
TPP riboswitch2GDI, 4NYA, 4NYD73301510
Total3641266323
RNA targetPDB IDsNumber of collected ligandsFinal dataset (with DUD-E generated decoys)
BindersNon-binders
Adenine riboswitch1Y26, 4TZX, 4XNR187361
FMN riboswitch2YIE, 5KX9, 6BFB3015750
Guanine riboswitch3GAO, 3GER, 4FE579301510
Guanine riboswitch with C74U mutation2XNW, 2XNZ, 2XO12514714
HIV-1 TAR1UTS, 1UUI, 6MCE139301478
TPP riboswitch2GDI, 4NYA, 4NYD73301510
Total3641266323

The last row provides an overview of the total values in each column.

Table 1

Molecular RNA targets used for benchmarking, PDB IDs of experimentally solved structures used for docking and SIFts calculation, the number of ligands collected from the literature and the final composition of ligand sets after adding decoys.

RNA targetPDB IDsNumber of collected ligandsFinal dataset (with DUD-E generated decoys)
BindersNon-binders
Adenine riboswitch1Y26, 4TZX, 4XNR187361
FMN riboswitch2YIE, 5KX9, 6BFB3015750
Guanine riboswitch3GAO, 3GER, 4FE579301510
Guanine riboswitch with C74U mutation2XNW, 2XNZ, 2XO12514714
HIV-1 TAR1UTS, 1UUI, 6MCE139301478
TPP riboswitch2GDI, 4NYA, 4NYD73301510
Total3641266323
RNA targetPDB IDsNumber of collected ligandsFinal dataset (with DUD-E generated decoys)
BindersNon-binders
Adenine riboswitch1Y26, 4TZX, 4XNR187361
FMN riboswitch2YIE, 5KX9, 6BFB3015750
Guanine riboswitch3GAO, 3GER, 4FE579301510
Guanine riboswitch with C74U mutation2XNW, 2XNZ, 2XO12514714
HIV-1 TAR1UTS, 1UUI, 6MCE139301478
TPP riboswitch2GDI, 4NYA, 4NYD73301510
Total3641266323

The last row provides an overview of the total values in each column.

The ligands were docked to their targets using the rDock docking software [32]. SIFts were calculated for three top-scored poses for each complex of ligand and its respective RNA target (see Materials and Methods and Figure 5).

CatBoost, Extra Trees and LightGBM are the most effective ML methods in predicting ligand binding

First, we checked the virtual screening performance of the ML algorithms, whose inputs were SIFts for each RNA–ligand complex with assigned ligand activity toward a given RNA target as a binary class (binder/non-binder). Fifteen ML classifiers were evaluated at their default setup, with the optional feature selection preprocessing step and/or stacking (see Materials and Methods).

Most of the tested ML methods performed better than the Baseline predictor (the negative control—which returns the most frequent label from the training data), with the average AUROC (area under the receiver operating characteristic curve, calculated from 3-fold cross-validation results) ranging from 0.50 (meaning models with no discriminative power, as Naïve Bayes) to 0.97 (LightGBM), and median AUROC values ranging from 0.50 (Naïve Bayes and Decision Table) to 0.99 (CatBoost; for the performance summary of all examined methods, Figure 1 and Supplemental Table S1). The performance varies from one RNA target to another, with the lowest average AUROC value for HIV-1 TAR (0.78) and the highest for adenine and guanine C74U riboswitches (0.91). This indicates that binding predictions for some molecular targets are more challenging with the presented method. It may result from an inability of the docking program to correctly predict the binding mode of the ligands, the fact that small molecules bind to the distinct binding pocket of the given RNA, or that the RNA undergoes a conformational change during the ligand binding. However, even for HIV-1 TAR RNA, for which we observed the worst average performance, ML methods with satisfactory AUROC values can still be selected (LightGBM with SelectedFeatures and RandomForest with SelectedFeatures and Stacked, with AUROC values equal to 0.92 and 0.92, respectively).

The performance of the ML methods. The performance is expressed as AUROC values (calculated from the 3-fold cross-validation results) and shown as a box-and-whisker plot. Individual AUROC values for each molecular target are overlaid as color- and shape-coded points.
Figure 1

The performance of the ML methods. The performance is expressed as AUROC values (calculated from the 3-fold cross-validation results) and shown as a box-and-whisker plot. Individual AUROC values for each molecular target are overlaid as color- and shape-coded points.

We observed that—on average—the additional pre- and post-processing steps slightly decreased the performance of the ML methods when applied individually (with the average AUROC values for all methods and targets equal to 0.94 for feature selection and stacking, compared to 0.95 for methods without pre- and post-processing) but the exact influence depends on the method. Feature selection improved two methods (CatBoost and Extra Trees) and worsened three methods (LightGBM, Random Forest and XGBoost), whereas stacking improved one method (Random Forest), worsened three methods (Extra Trees, LightGBM and XGBoost), and did not influence one method (CatBoost) (Figure 1). Nevertheless, all differences in observed AUROC values were not spectacular. Feature selection and stacking, when combined, also had a minor influence on the observed AUROC values. On average, it slightly increased the average performance (AUROC value 0.95). The accuracy was better for three methods (Extra Trees, Random Forest and XGBoost) and worsened for two methods (CatBoost and LightGBM), but the differences in AUROC values were still minor.

SIFts, in combination with ML methods, outperform classic scoring functions in ligand binding predictions

As shown above, SIFts combined with ML effectively predict ligand binding activity toward RNA targets. We compared these methods with the performance of classical general-purpose scoring functions. For that, we compared the screening metrics (AUROC, Boltzmann-enhanced discrimination of receiver operating characteristic—BEDROC, and Enrichment Factor for the top-scored 10% of ligands—EF10%; [33]) for the three best ML methods trained on SIFts (namely CatBoost, Extra Trees and LightGBM), three classic RNA–ligand scoring functions, AnnapuRNA [34], LigandRNA [35] and rDock [36] and one protein-ligand scoring function (RF-Score-VS, as a negative control) [37]; see Supplemental Section S5 for a description of the scoring function tested. The results are summarized in Figure 2 and Supplemental Figures S2S4.

The data show that the average performance of SIFts-based scoring functions is superior to the examined classic scoring functions, although the results are highly dependent on the molecular target. For example, for the adenine riboswitch, the SIFts methods reached AUROC values ranging from 0.99 to 1.00, whereas classic RNA–ligand scoring functions have only slightly worse performance (ranging from 0.96 to 0.99). This almost-ideal performance may be partially caused by the fact that the chemical space of binding ligands of adenine and guanine riboswitches is limited to heterocyclic compounds of similar size, which are also structurally similar to the experimentally determined ligands. This increases the probability of finding the native or near-native conformation of the complex by the docking program and results in satisfactory screening performance for both our method and classic RNA–ligand scoring functions. It should be noted that our datasets for guanine and adenine riboswitches also contain heterocyclic experimentally confirmed non-binding molecules; thus, the binder/non-binder recognition process goes beyond simple structural rules.

Similarly, the performance of the SIFts methods for FMN riboswitch is also very good (with AUROC ranging from 0.98 to 1.00), whereas for the scoring functions, it is much worse (AUROC ranging from 0.69 to 0.84), and for the protein dedicated RF-Score-VS AUROC reaches value 0.41 (which means worse than the random selection of the binders). For a single target—HIV-1 TAR RNA, we observed the decreased performance of all examined scoring functions (for SIFts methods, AUROC values range from 0.79 to 0.90, for classical functions from 0.29 to 0.81, for protein scoring function equals 0.54). This shows that the molecular recognition process for such ‘difficult’ cases may be beyond the relatively simple interaction model utilized by the current scoring functions (both described in this work and developed earlier) and cannot recapitulate the complicated binding phenomena observed in real life.

For a single target in the datasets, the absolute number of binders is below 10 (seven binders for adenine riboswitch), which is a consequence of the scant number of diverse compounds reported in the literature. However, the results obtained for targets with a lower number of binders (such as the adenine or guanine C74U riboswitch) are consistent with those obtained for targets with a higher number of binders (such as FMN and guanine riboswitch; Supplemental Table S1 and Supplemental Figures S2S4). Thus, we conclude that despite the variable number of binders in each dataset, the results presented are accurate.

As expected, for all the cases, the performance of the protein-ligand scoring function is much worse than for ML-based methods or classical scoring functions, with the AUROC value ranging from 0.36 for TPP riboswitch to 0.73 for adenine riboswitch. The suboptimal performance of a protein-specific scoring function applied to RNA-ligand complexes has also been observed in previous benchmarking experiments focused on predicting the native position of a ligand [34]. These results indicate that the statistical potentials trained on protein-ligand complexes do not provide accurate results when applied to RNA-ligand complexes, either for binding mode prediction or for virtual screening. The reason for this discrepancy may be due to the different binding preferences of these macromolecules, and thus the differences in contact statistics collected from experimentally resolved structures and used to develop the scoring function. This underscores the need to develop RNA-dedicated tools that are parameterized to the appropriate data.

The obtained benchmark results indicate that the SIFts-based ‘local’ scoring functions, which are trained on the binding activity data for a given molecular target, perform visibly better than the general-purpose scoring functions; however, this difference is not statistically significant (Supplemental Table S2). Also, our data pinpoint the need for further development and tests of the scoring functions dedicated to the RNA–ligands complexes, as the performance of the protein scoring functions may be very poor, frequently offering a random or worse-than-random selection of the ligands. The limitation is, however, the availability of the training data required to build the interaction model.

Comparison of AUROC values between methods. The performance of the three best ML methods (CatBoost, Extra Trees and LightGBM, with no feature selection nor stacking applied, calculated from the 3-fold cross-validation results) trained on the SIFts (upper part of the plots) is compared with the performance of the scoring functions (AnnapuRNA, LigandRNA, rDock and RF-Score-VS; the bottom part of the plots). The distribution of AUROC, BEDROC and EF10% values are represented on box-and-whisker plots, and the individual performance for six molecular targets is overlaid as color- and shape-coded points. The performance of scoring functions is displayed for the best performance among the three structures for each RNA.
Figure 2

Comparison of AUROC values between methods. The performance of the three best ML methods (CatBoost, Extra Trees and LightGBM, with no feature selection nor stacking applied, calculated from the 3-fold cross-validation results) trained on the SIFts (upper part of the plots) is compared with the performance of the scoring functions (AnnapuRNA, LigandRNA, rDock and RF-Score-VS; the bottom part of the plots). The distribution of AUROC, BEDROC and EF10% values are represented on box-and-whisker plots, and the individual performance for six molecular targets is overlaid as color- and shape-coded points. The performance of scoring functions is displayed for the best performance among the three structures for each RNA.

It should be noted, that due to the limited availability of experimentally determined structures and information on ligand binding to these RNAs, the set of RNA targets used in this study is biased toward riboswitches. On the other hand, the performance values obtained for a single non-riboswitch target (HIV-1 TAR) indicate that the results presented in this work can be generalized to other RNA targets.

Explainable ML models

Despite the high accuracy offered by many ML models in drug discovery and molecular biology, their application is limited by the lack of transparency on how and why the decisions are made [38]. To overcome this limited confidence in Artificial Intelligence systems, XAI became crucial, especially for application in biomedical, medical and healthcare studies. Also, in drug design applications, interpretable and explainable ML models offer support in understanding the binding phenomena, interpreting the landscape of structure–activity relationships in the inspected pool of compounds, or deciphering complex drug–drug interactions [39–41].

We investigated if the interpretation of binding prediction of the ML models built on SIFts can give insights into the factors important for the small molecule ligand binding to the RNA. Similar work has been recently presented for SIFts and a protein target—dopamine D4 receptor—and was proven useful in indicating important factors for ligand binding [42]. In our study, we used three XAI frameworks dedicated to explaining the predictions of ML models. The first method is a locally interpretable explanatory method termed SHapley Additive exPlanations (SHAP), derived from cooperative game theory and applicable to multiple ML problems in different fields [43]. For each prediction (each ligand) and each feature used in the predictive model, a SHAP value is assigned to quantify how this particular feature influences the prediction (in this work: the probability that the ligand belongs to the binders class). For this experiment, we built an ML model using one of the best-performing methods—CatBoost, for HIV-1 TAR class-balanced data. This RNA was chosen to demonstrate the usability of the proposed approach for more difficult but medically relevant molecular targets. For clarity of data presentation, we used SIFts with the basic interactions subset (hydrogen and halogen bonds, cation-anion, Pi-cation, Pi-anion, Pi-stacking and lipophilic interactions; see [23]), derived for the top-scored pose of each ligand docked to a single RNA structure (PDB ID: 1UTS, chosen arbitrarily). In addition to the SHAP method, we estimated the variable importances for the generated ML model using the CatBoost classifiers: CatBoost Prediction Values Change (indicating how much, on average, the prediction changes if the feature value changes), and the CatBoost Loss Function Change (indicating the difference between the loss value of the model with this feature and without it). The feature importance summary is presented in Figure 3.

Feature importance analysis for the HIV-1 TAR activity binding model trained with the CatBoost classifier on SIFts. Each row indicates a selected interaction type formed with a given residue (number in square brackets). (A) The 20 top-ranked interactions and their importance for the model performance tested with the SHAP method (the contribution of each feature to the prediction of the model—mean absolute SHAP value), the CatBoost Prediction Values Change (indicating how much, on average, the prediction changes if the feature value changes) and the CatBoost Loss Function Change (indicating the difference between the loss value of the model with this feature and without it). (B) The distribution of SHAP values for all ligands in the data set, with color-coded presence/absence of the interactions of the given type (red—the interaction was detected, blue—the interaction was not detected). SHAP values show how forming the given interaction increases or decreases the likelihood of a compound being classified as a binder.
Figure 3

Feature importance analysis for the HIV-1 TAR activity binding model trained with the CatBoost classifier on SIFts. Each row indicates a selected interaction type formed with a given residue (number in square brackets). (A) The 20 top-ranked interactions and their importance for the model performance tested with the SHAP method (the contribution of each feature to the prediction of the model—mean absolute SHAP value), the CatBoost Prediction Values Change (indicating how much, on average, the prediction changes if the feature value changes) and the CatBoost Loss Function Change (indicating the difference between the loss value of the model with this feature and without it). (B) The distribution of SHAP values for all ligands in the data set, with color-coded presence/absence of the interactions of the given type (red—the interaction was detected, blue—the interaction was not detected). SHAP values show how forming the given interaction increases or decreases the likelihood of a compound being classified as a binder.

According to all three XAI methods, the most important for the binding activity model are interactions with residue 40 (hydrogen bonds and lipophilic interactions) and residue 27 (Pi-stacking interaction) (Figure 3A). The forming of these interactions positively contributes to the model output and increases the predicted probability of binding activity for ligands (Figure 3B). This is in agreement with the experimental data deposited in the literature [44]. The subsequent, less important interactions with a positive contribution to the model output are non-covalent bonds with residues 23 (lipophilic and Pi-stacking; this residue is essential for binding of native substrate, Tat protein, to HIV-1 TAR RNA, and is highly conserved among different subtypes of HIV-1 and HIV-2, and simian immunodeficiency virus [45], thus utilizing it for ligand binding can be valuable for antiviral drug design), 26 (Pi-stacking [46] and hydrogen bonds [47]) and 39 (Pi-stacking) [44]. To our surprise, according to the SHAP analysis, the formation of hydrogen bonds with residue 22 has a slightly negative effect on the predicted binding activity probability. Earlier literature data indicated that hydrogen bonding with the phosphate oxygen atoms of this residue is important for arginine and argininamide ligand binding [48]. However, the latter molecular dynamics simulations and nuclear magnetic resonance experiments did not support this observation [44], which may explain calculated SHAP values for hydrogen bonds with this residue. The influence of other interactions, such as hydrogen bonding with residues 39 and 24 and lipophilic interactions with residue 27, are harder to interpret due to their mixed contribution to the overall binding activity probability.

We also performed SHAP analysis for models built using other well-performing ML methods (see Supplemental Section S4.1). Although the contributions of individual interactions are slightly different from the one described for CatBoost, the list of the most important ones affecting ligand binding remains the same.

SHAP values calculated for the individual ligands may give insights into the contribution of variables (here: interactions) on the probability of the compound being the binder, as estimated by the ML model. Selected examples of the binding molecules Mitoxantrone and Amiloride 4h (Figure 4A) show the contributions of the presence or absence of the given interactions to the final prediction of the probability of being a binder for these molecules. Interactions that are colored purple increase the probability of the ligand being the binder, whereas interactions that are colored dark cyan decrease this probability. For Mitoxantrone, the main factors increasing this probability are lipophilic interactions, hydrogen bonds and cation-anion interactions with residues 40 and 22. Interestingly, this probability is also increased by the fact that the ligand does not form hydrogen bonds and lipophilic interaction with residue 27 and hydrogen bonds with residue 24. The probability of the Mitoxantrone being the binder decreases if the ligand does not form the Pi-stacking contact with residue 27. This is in agreement with the literature reports, where the criteria for HIV-1 TAR RNA binders are defined as follows: having an aromatic or heteroaromatic moiety (here, the ligand has this structural feature, but stacking interaction is not observed, thus decreased probability of binding), a cationic residue (the ligand has this structural feature and it has a positive contribution to the overall probability by interactions with residues 22 and 40), and a linker in the structure (a positive contribution to the overall probability by the hydrogen bonding of the hydroxyl group with residue 40) [49, 50]. These observations also agree with the interactions detected in a three-dimensional structure of the Mitoxantrone—HIV-1 TAR RNA complex predicted with molecular docking (best-scored pose, Figure 4B). SHAP values calculated for each ligand in the pool and stacked horizontally confirm that, indeed, the probability of binding for the true-inactive ligands is visibly lower than for the real binding ligands (Figure 4C, left and right part of the plot, respectively), although different features are contributing to the final probability values.

SHAP analysis for the HIV-1 TAR ligand binding activity model trained with the CatBoost classifier on SIFts input. (A) Force plots for two selected HIV-1 TAR binding molecules: Mitoxantrone and Amiloride 4h, showing the main features contributing to pushing the model output from the base value (the average model output over the training dataset) to the predicted probability. Features pushing the probability higher are shown in purple, features pushing the probability lower are in dark cyan and the purple-dark cyan border indicates the probability value. (B) Prediction of the binding mode of the Mitoxantrone molecule inside the binding pocket of HIV-1 TAR with detected interactions. The residues forming the interactions with the highest contribution to the model output are indicated by arrows: purple (with the positive contribution) and dark cyan (with the negative contribution). (C) SHAP values were calculated for each ligand in the pool (with the non-binding ligands located at the first half of the plot and binding ligands at the second half of the plot) stacked horizontally, indicating the predicted probability of the ligand being the binder.
Figure 4

SHAP analysis for the HIV-1 TAR ligand binding activity model trained with the CatBoost classifier on SIFts input. (A) Force plots for two selected HIV-1 TAR binding molecules: Mitoxantrone and Amiloride 4h, showing the main features contributing to pushing the model output from the base value (the average model output over the training dataset) to the predicted probability. Features pushing the probability higher are shown in purple, features pushing the probability lower are in dark cyan and the purple-dark cyan border indicates the probability value. (B) Prediction of the binding mode of the Mitoxantrone molecule inside the binding pocket of HIV-1 TAR with detected interactions. The residues forming the interactions with the highest contribution to the model output are indicated by arrows: purple (with the positive contribution) and dark cyan (with the negative contribution). (C) SHAP values were calculated for each ligand in the pool (with the non-binding ligands located at the first half of the plot and binding ligands at the second half of the plot) stacked horizontally, indicating the predicted probability of the ligand being the binder.

Analysis of the variables influencing the decision made by the ML model for Amiloride 4h indicates that the set of interactions is similar to the one for Mitoxantrone, however with some differences (e.g. the importance of the formation of Pi-cation interaction with residue 39, not ranked high in the analysis for Mitoxantrone).

In addition to the SHAP method, we also used the local interpretable model-agnostic explanations (LIME) [51] approach to help interpret the predictions of ML model bindings built on SIFs. Instead of explaining the model globally (with a more exhaustive approach, testing all possible predictions for an observation using all possible combinations of inputs), this method focuses on training local models to explain individual predictions, explain how the model works in the vicinity of the data point. The LIME analysis of the CatBoost model for HIV-1 TAR RNA for Mitoxantrone indicates the same interactions contributing the most to the binding prediction, namely interactions formed with residue 40 (hydrogen bonds and lipophilic interactions) and residue 23 (lipophilic, see Supplemental Section S4.2). Also, interactions that are not formed reduce the probability of binding (e.g. missing Pi-stacking interaction with residue 27 and halogen bond with residue 26). Also, as indicated by the SHAP analysis, hydrogen bond formation with residue 22 has a slightly negative effect on the predicted probability of binding activity. Some other interactions, which are indicated as having a high contribution to the binding of mitoxantrone by the SHAP method, have a lower contribution to the binding according to the LIME method (e.g. cation-anion interactions with residues 22 and 40). The observed discrepancies are due to differences in the methodologies used by SHAP and LIME. A key step in the latter algorithm is a selection of the neighborhood of the observation of interest (i.e. the ligand; done by the kernel width parameter of LIME), which can affect the explanation results.

Other XAI frameworks can also be employed to explain binding models built with SIFts. For example, ELI5 (a method designed for global and local interpretation of models, https://github.com/TeamHG-Memex/eli5) applied to the LightGBM and SVM models indicated the importance of binding Mitoxantrone to the residue 40 (as did the SHAP and LIME methods), but lipophilic interaction with residue 23 was indicated as important only for the SVM model (see Supplemental Section S4.3). The visualization and explanation of decision tree models using the dtreeviz package (https://github.com/parrt/dtreeviz) allow one to follow the decision-making process for each compound. Analysis of the LightGBM models built for HIV-1 TAR using the aforementioned tool shows that two binding molecules: Mitoxantrone and Amiloride 4h are correctly classified, but the decision-making process is different (see Supplementary Section S4.4). For both compounds, the first two decisions are made based on hydrogen bond formation with residues 40 and 39, but the subsequent decision scheme is different.

The XAI methods, such as SHAP, LIME or ELI5, applied to explain ML models built using SIFts can provide valuable insights into the nature of molecular interactions’ importance for small molecule binding. SHAP values of non-covalent interactions combined with analysis of the three-dimensional structure of complexes enable visual interpretation of the ML model and can give instant hints on the nature of the binding phenomena. XAI analysis not only allowed for the selection of residues and interaction types important for the ligand binding activity but also made it possible to indicate if the given interaction has a positive or a negative influence.

Summary

Computational methods play a pivotal role in the early stages of drug discovery and are widely applied in virtual screening, structure optimization and compound activity profiling. With emerging discoveries of the role of RNA in disease, there is a growing interest in utilizing them as potential targets for novel therapies. In this work, we showed that SIFts, combined with ML algorithms, can be readily and effectively used in building predictive models for RNA targets. We showed that these ‘local scoring functions’ (i.e. ML models built with SIFts for a specific molecular target) have superior performance in virtual screening compared to general-purpose scoring functions. On the other hand, to use this approach, a separate ML model must be trained for each molecular target using a dataset of ligands with known activity. Although this requirement currently may be seen as limiting, we believe that with growing interest in studying interactions of small molecules with RNA, the experimental training data will become readily available for many other RNAs, as it is for proteins [52]. Recent applications of protein-focused SIFts-based drug discovery have played an important role in addressing the current needs of global healthcare, such as the COVID-19 pandemic [53, 54]. In this work, we also showed that the SIFts-based ML models can be interpreted using modern XAI techniques, such as SHAP, LIME, ELI5 or dtreeviz, which offer great help in understanding the molecular-recognition process. Merging numerical data with a three-dimensional complex structure can guide medicinal chemists in further directions of developing new potential small molecule drugs. We are convinced that the presented approach will be valuable for the scientific community and facilitate new RNA-based therapies discoveries.

MATERIALS AND METHODS

Database of RNA-targeting ligands

To construct a dataset containing information on binding and non-binding ligands to the six selected RNA targets, an extensive literature search was performed. The composition of the dataset is a direct consequence of the limited availability of experimental data reported in the literature and databases (i.e. experimentally solved RNA structures and reports of ligands that do or do not bind to them). Information about the chemical structures of molecules and their binding affinities was collected and tabulated together. As there is a lack of literature reports on non-binding molecules, to simulate the content of the real chemical library where the percentage of binding molecules is low, the additional putative non-binding ligands were generated using the DUD-E methodology (https://dude.docking.org/generate) [24]. The exact procedure was identical to the one described in our previous work [23]. Briefly, ligands whose binding was not described precisely, or was calculated only on the grounds of cell-based assays, were not included in the database. If the RNA sequence, for which the ligand’s binding activity is described, was different from the sequence of experimentally solved structures (Table 1) within the binding site or its proximity up to 8 Å, the ligand was rejected. Structures of the ligands were normalized (i.e. transformed into consistent chemical representation to avoid discrepancies and detect possible errors), and a set of filters was applied to include only drug-like molecules (molecular weight from 90 to 900 daltons, an octanol–water partition coefficient—SlogP from −7 to 9, number of hydrogen bond acceptors up to 18, number of hydrogen bond donors up to 18, number of rotatable bonds up to 18). Next, the PAINS (Pan-assay interference compounds) [55] filter was applied to exclude ligands for which observed activity was potentially a result of interference with assays and not the interactions with the RNA. Finally, ligands were assigned into one of two groups—binders or non-binders. The criterion was the binding affinity or biological activity (expressed as KD, IC50 or Ki), and ligands with an activity parameter below 300 μM were classified as binders. Ligands with an activity parameter higher than 300 μM or described by the authors as ‘inactive’ or ‘not binding’ were classified as non-binders. This procedure allowed us to neglect the methodological differences associated with various parameters expressing the activity of compounds and use a binary binding class to compare ligands indirectly. A similar procedure was successfully applied earlier to protein targets (e.g. [56–58]) as well as in our previous work [23]. Afterward, to remove structurally similar compounds, binders and non-binders were clustered using the k-medoids algorithm, and for each cluster, a representative ligand was selected (Figure 5A). Data processing and analysis were performed using the KNIME 4.0.1 analytics platform [59]. The composition of the final datasets is summarized in Table 1.

Dataset preparation pipeline. (A) For each RNA target, a dataset of ligands with experimentally determined binding to the target RNA was collected from the literature reports and the RCSB PDB database. The pool of ligands was diversified, and putative non-binding ligands were generated. Each molecule from this dataset (with the binary activity class assigned) was (B) independently docked to three structures of its corresponding RNA target. (C) Next, for each computationally predicted RNA–ligand complex, SIFts were calculated for the three top-scored poses (and then averaged); fingerprints from each complex were concatenated into a single long fingerprint. (D) Fingerprints were combined with a binding class of the initial molecule and used as inputs to ML. This procedure was repeated for each molecule in the dataset of a given RNA target, and the resulting fingerprints were collected in a single table, which served as input to ML.
Figure 5

Dataset preparation pipeline. (A) For each RNA target, a dataset of ligands with experimentally determined binding to the target RNA was collected from the literature reports and the RCSB PDB database. The pool of ligands was diversified, and putative non-binding ligands were generated. Each molecule from this dataset (with the binary activity class assigned) was (B) independently docked to three structures of its corresponding RNA target. (C) Next, for each computationally predicted RNA–ligand complex, SIFts were calculated for the three top-scored poses (and then averaged); fingerprints from each complex were concatenated into a single long fingerprint. (D) Fingerprints were combined with a binding class of the initial molecule and used as inputs to ML. This procedure was repeated for each molecule in the dataset of a given RNA target, and the resulting fingerprints were collected in a single table, which served as input to ML.

Datasets for machine learning

The datasets preparation pipeline is summarized in Figure 5.

Molecular docking

Experimentally solved RNA structures were used for docking small molecule ligands to the six selected viral and bacterial RNA targets (Table 1, Figure 5B). All macromolecules were preprocessed with the Chimera dockprep pipeline [60]. Docking of small molecule ligands to the aforementioned RNA targets was performed with the rDock docking program (version 2013.1) with dock_solv desolvation potential and a docking radius set to 10 Å [36].

Rescoring of the docked poses was performed using external scoring functions: AnnapuRNA [34], LigandRNA [35] and RF-Score-VS [37].

SIFts’ calculations

Fingerprints type FULL with default settings were calculated for the three top-scored poses of small molecule ligands docked to their RNA target (Figure 5C) using the fingeRNAt software (version 3.0) [23] (fingerprint version FULL; all interactions included) and then averaged. Next, fingerprints calculated for different RNA structures of the same target were horizontally concatenated (this allowed to preserve all information carried by these complexes), and the binding class of the compound was assigned (Figure 5D).

Machine learning

ML algorithms have been selected from modern methods frequently used in bioinformatics [61–63], as well as those traditionally used to solve bioinformatics problems [64]. For building ML models, the mljar-supervised python package was used (https://github.com/mljar/mljar-supervised, version 0.11.1; for algorithms Random Forest, XGBoost, DecisionTree, LightGBM, Extra Trees and CatBoost), Weka (version 3.8.6, for algorithms: Naïve Bayes, Logistic Regression, IBk, K*, Decision Table, J48 and Random Tree) [65], H2O KNIME Nodes (KNIME version 4.7.0, H2O AutoML node, for deep learning—multi-layer feed-forward neural network, with maximum training time set to 600 s on 16 CPUs [59, 66]) and scikit-learn optimized with the Optuna framework (versions 1.2.1 and 3.1.0, respectively, for SVM + Optuna; hyperparameters optimized involved C value and kernel type, with 1000 trials [67, 68]). For methods provided with mljar-supervised package, models were also optimized using feature selection preprocessing (i.e. for the training set, first selecting a subset of important features and next training new models on selected features) and stacking (stacked algorithms are built with predictions from unstacked models), and hyper-parameters were selected using the Optuna optimization framework (with a time budget set to 32,400 s on 28 CPUs) [68]. Additionally, the Baseline predictor, which returns the most frequent label from the training data, was used as a negative control. For all methods, a 3-fold cross-validation strategy with stratified sampling was used during the optimization, with AUROC as the evaluation metrics.

Feature importance calculation for the CatBoost model built for HIV-1 TAR ligands was performed with CatBoost internal feature importance functions (PredictionValuesChange and LossFunctionChange; CatBoost python module version 1.0.6 [69]). SHAP analysis was performed with the SHAP python module (version 0.41.0) [43, 70]; for clarity of data presentation, it was calculated for the top-scored pose with the binding class balanced. LIME analysis was performed with the lime python module (version 0.2.0.1), with an entropy discretizer. ELI5 analysis was performed with the eli5 python module (version 0.13.0). Since CatBoost is not supported for explaining individual predictions, the analysis was performed for LightGBM and SVM (with a linear kernel) models. ELI5 plots show a <BIAS> feature, which is the expected average score output by the model, based on the distribution of the training set (intercept, baseline). dtreeviz analysis was performed with the dtreeviz python module (version 0.13.0) for LightGBM, visualized for the first tree.

Data analysis

AUROC values were calculated with the mljar-supervised package (for ML methods) and the scikit-learn python module (version 1.0.1; [67]). The 95% confidence intervals were calculated using a bootstrap algorithm with 1000 iterations using the Seaborn statistical data visualization library (version 0.11.2; [71]). Statistical tests were performed in the SciPy python module (version 1.8.1) using the Wilcoxon signed-rank test, and the two-tailed P-values were reported [72].

Key Points
  • Structural interaction fingerprints (SIFts), combined with machine learning, were successfully used to develop activity models for ligands binding to ribonucleic acids (RNA).

  • SIFt-based models outperformed the classic, general-purpose scoring functions in virtual screening.

  • Explainable artificial intelligence allowed us to understand the decision-making process and decipher molecular recognition processes in the analysis of RNA–ligand binding activity models.

  • We provide a benchmark dataset based on ligands with known or putative binding activity toward six RNA targets. It can be readily used by the scientific community to test new algorithms of virtual screening on RNA–ligand complexes.

ACKNOWLEDGEMENTS

We thank Dr Eugene Baulin for critical reading and feedback on the manuscript. This research was carried out in part with the support of the Interdisciplinary Centre for Mathematical and Computational Modelling (ICM) University of Warsaw under computational allocation no G88-1177 to F.S. This research was funded in part by the National Science Centre, grant number: 2020/39/B/NZ2/03127. For the purpose of Open Access, the author has applied a CC-BY public copyright license to any Author Accepted Manuscript (AAM) version arising from this submission.

FUNDING

National Science Centre in Poland (grant number 2020/39/B/NZ2/03127 to F.S.); TEAM programme of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund (grant number POIR.04.04.00-00-3CF0/16 to J.M.B.).

DATA AVAILABILITY

The training dataset and the code required for performing all the described analyses can be found at https://github.com/filipsPL/fingernat-ml [25].

Author Biographies

Natalia A. Szulc is a Ph.D. student at the International Institute of Molecular and Cell Biology in Warsaw. She holds M.Sc. degrees in Molecular Biotechnology and Computational Engineering. She is interested in RNA–ligand interactions in the pharmacological context.

Zuzanna Mackiewicz is a Ph.D. student at the International Institute of Molecular and Cell Biology in Warsaw. Her work is focused on studying cytoplasmic polyadenylation using both bioinformatics and molecular biology approaches.

Janusz M. Bujnicki is a head of the Laboratory of Bioinformatics and Protein Engineering at the International Institute of Molecular and Cell Biology in Warsaw. He is interested in understanding how the sequence of biopolymers defines their structure and interactions with other molecules. His team is currently focusing on the structural biology of RNA.

Filip Stefaniak is a senior researcher at the International Institute of Molecular and Cell Biology in Warsaw. His research interests include data-driven modeling of interactions of RNA with small molecule ligands.

REFERENCES

1.

Santos
 
R
,
Ursu
 
O
,
Gaulton
 
A
, et al.  
A comprehensive map of molecular drug targets
.
Nat Rev Drug Discov
 
2017
;
16
:
19
34
.

2.

Gregory
 
TR
.
Synergy between sequence and size in large-scale genomics
.
Nat Rev Genet
 
2005
;
6
:
699
708
.

3.

Esteller
 
M
.
Non-coding RNAs in human disease
.
Nat Rev Genet
 
2011
;
12
:
861
74
.

4.

Ning
 
L
,
Cui
 
T
,
Zheng
 
B
, et al.  
MNDR v3.0: mammal ncRNA–disease repository with increased coverage and annotation
.
Nucleic Acids Res
 
2021
;
49
:
D160
4
.

5.

Guo
 
JK
,
Guttman
 
M
.
Regulatory non-coding RNAs: everything is possible, but what is important?
 
Nat Methods
 
2022
;
19
:
1156
9
.

6.

Warner
 
KD
,
Hajdin
 
CE
,
Weeks
 
KM
.
Principles for targeting RNA with drug-like small molecules
.
Nat Rev Drug Discov
 
2018
;
17
:
547
58
.

7.

Falese
 
JP
,
Donlic
 
A
,
Hargrovea
 
AE
.
Targeting RNA with small molecules: from fundamental principles towards the clinic
.
Chem Soc Rev
 
2021
;
50
:
2224
43
.

8.

Pinzi
 
L
,
Rastelli
 
G
.
Molecular docking: shifting paradigms in drug discovery
.
Int J Mol Sci
 
2019
;
20
:
4331
.

9.

Fan
 
J
,
Fu
 
A
,
Zhang
 
L
.
Progress in molecular docking
.
Quant Biol
 
2019
;
7
:
83
9
.

10.

Paul
 
D
,
Sanap
 
G
,
Shenoy
 
S
, et al.  
Artificial intelligence in drug discovery and development
.
Drug Discov Today
 
2021
;
26
:
80
93
.

11.

Shen
 
C
,
Weng
 
G
,
Zhang
 
X
, et al.  
Accuracy or novelty: What can we gain from target-specific machine-learning-based scoring functions in virtual screening?
 
Brief Bioinform
 
2021
;
22
:
bbaa410
.

12.

Jiménez-Luna
 
J
,
Grisoni
 
F
,
Schneider
 
G
.
Drug discovery with explainable artificial intelligence
.
Nat Mach Intell
 
2020
;
2
:
573
84
.

13.

Jiménez-Luna
 
J
,
Grisoni
 
F
,
Weskamp
 
N
, et al.  
Artificial intelligence in drug discovery: recent advances and future perspectives
.
Expert Opin Drug Discovery
 
2021
;
16
:
949
59
.

14.

Yang
 
J
,
Cai
 
Y
,
Zhao
 
K
, et al.  
Concepts and applications of chemical fingerprint for hit and lead screening
.
Drug Discov Today
 
2022
;
27
:
103356
.

15.

Du
 
B-X
,
Qin
 
Y
,
Jiang
 
Y-F
, et al.  
Compound–protein interaction prediction by deep learning: databases, descriptors and models
.
Drug Discov Today
 
2022
;
27
:
1350
66
.

16.

Wigh
 
DS
,
Goodman
 
JM
,
Lapkin
 
AA
.
A review of molecular representation in the age of machine learning
.
WIREs Comput Mol Sci
 
2022
;
12
:
e1603
.

17.

Deng
 
Z
,
Chuaqui
 
C
,
Singh
 
J
.
Structural interaction fingerprint (SIFt): a novel method for analyzing three-dimensional protein-ligand binding interactions
.
J Med Chem
 
2004
;
47
:
337
44
.

18.

Seifert
 
MHJ
.
Targeted scoring functions for virtual screening
.
Drug Discov Today
 
2009
;
14
:
562
9
.

19.

Zhao
 
Z
,
Bourne
 
PE
.
Harnessing systematic protein–ligand interaction fingerprints for drug discovery
.
Drug Discov Today
 
2022
;
27
:
103319
.

20.

Xiong
 
G
,
Shen
 
C
,
Yang
 
Z
, et al.  
Featurization strategies for protein–ligand interactions and their applications in scoring function development
.
WIREs Comput Mol Sci
 
2022
;
12
:
e1567
.

21.

Bajusz
 
D
,
Rácz
 
A
,
Héberger
 
K
.
Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?
 
J Chem
 
2015
;
7
:
20
.

22.

Rácz
 
A
,
Bajusz
 
D
,
Héberger
 
K
.
Life beyond the Tanimoto coefficient: similarity measures for interaction fingerprints
.
J Chem
 
2018
;
10
:
48
.

23.

Szulc
 
NA
,
Mackiewicz
 
Z
,
Bujnicki
 
JM
, et al.  
fingeRNAt—A novel tool for high-throughput analysis of nucleic acid-ligand interactions
.
PLoS Comput Biol
 
2022
;
18
:
e1009783
.

24.

Mysinger
 
MM
,
Carchia
 
M
,
Irwin
 
JJ
, et al.  
Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking
.
J Med Chem
 
2012
;
55
:
6582
94
.

25.

Szulc
 
NA
,
Mackiewicz
 
Z
,
Bujnicki
 
JM
, et al.  
Structural interaction fingerprints and machine learning for predicting and explaining binding of small molecule ligands to RNA, a benchmark data set
. https://doi.org/10.5281/zenodo.7486183.

26.

Winkler
 
W
,
Nahvi
 
A
,
Breaker
 
RR
.
Thiamine derivatives bind messenger RNAs directly to regulate bacterial gene expression
.
Nature
 
2002
;
419
:
952
6
.

27.

Mironov
 
AS
,
Gusarov
 
I
,
Rafikov
 
R
, et al.  
Sensing small molecules by nascent RNA: a mechanism to control transcription in bacteria
.
Cell
 
2002
;
111
:
747
56
.

28.

Vitreschak
 
AG
,
Rodionov
 
DA
,
Mironov
 
AA
, et al.  
Regulation of riboflavin biosynthesis and transport genes in bacteria by transcriptional and translational attenuation
.
Nucleic Acids Res
 
2002
;
30
:
3141
51
.

29.

Mandal
 
M
,
Breaker
 
RR
.
Adenine riboswitches and gene activation by disruption of a transcription terminator
.
Nat Struct Mol Biol
 
2004
;
11
:
29
35
.

30.

Mandal
 
M
,
Boese
 
B
,
Barrick
 
JE
, et al.  
Riboswitches control fundamental biochemical pathways in Bacillus subtilis and other bacteria
.
Cell
 
2003
;
113
:
577
86
.

31.

Gilbert
 
SD
,
Stoddard
 
CD
,
Wise
 
SJ
, et al.  
Thermodynamic and kinetic characterization of ligand binding to the purine riboswitch Aptamer domain
.
J Mol Biol
 
2006
;
359
:
754
68
.

32.

Morley
 
SD
,
David Morley
 
S
,
Afshar
 
M
.
Validation of an empirical RNA-ligand scoring function for fast flexible docking using RiboDock®
.
J Comput Aided Mol Des
 
2004
;
18
:
189
208
.

33.

Truchon
 
J-F
,
Bayly
 
CI
.
Evaluating virtual screening methods: good and bad metrics for the “early recognition” problem
.
J Chem Inf Model
 
2007
;
47
:
488
508
.

34.

Stefaniak
 
F
,
Bujnicki
 
JM
.
AnnapuRNA: a scoring function for predicting RNA-small molecule binding poses
.
PLoS Comput Biol
 
2021
;
17
:
e1008309
.

35.

Philips
 
A
,
Milanowska
 
K
,
Lach
 
G
, et al.  
LigandRNA: computational predictor of RNA-ligand interactions
.
RNA
 
2013
;
19
:
1605
16
.

36.

Ruiz-Carmona
 
S
,
Alvarez-Garcia
 
D
,
Foloppe
 
N
, et al.  
rDock: a fast, versatile and open source program for docking ligands to proteins and nucleic acids
.
PLoS Comput Biol
 
2014
;
10
:
e1003571
.

37.

Wójcikowski
 
M
,
Ballester
 
PJ
,
Siedlecki
 
P
.
Performance of machine-learning scoring functions in structure-based virtual screening
.
Sci Rep
 
2017
;
7
:
46710
.

38.

Yang
 
G
,
Ye
 
Q
,
Xia
 
J
.
Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: a mini-review, two showcases and beyond
.
Inf Fusion
 
2022
;
77
:
29
52
.

39.

Rodríguez-Pérez
 
R
,
Bajorath
 
J
.
Interpretation of compound activity predictions from complex machine learning models using local approximations and Shapley values
.
J Med Chem
 
2020
;
63
:
8761
77
.

40.

Harren
 
T
,
Matter
 
H
,
Hessler
 
G
, et al.  
Interpretation of structure–activity relationships in real-world drug design data sets using explainable artificial intelligence
.
J Chem Inf Model
 
2022
;
62
:
447
62
.

41.

Vo
 
TH
,
Nguyen
 
NTK
,
Kha
 
QH
, et al.  
On the road to explainable AI in drug-drug interactions prediction: a systematic review
.
Comput Struct Biotechnol J
 
2022
;
20
:
2112
23
.

42.

Fassio
 
AV
,
Shub
 
L
,
Ponzoni
 
L
, et al.  
Prioritizing virtual screening with interpretable interaction fingerprints
.
J Chem Inf Model
 
2022
;
62
:4300–18.

43.

Lundberg
 
SM
,
Lee
 
SI
.
A unified approach to interpreting model predictions
.
Adv Neural Inf Process Syst
 
2017
;
30
.

44.

Nifosì
 
R
,
Reyes
 
CM
,
Kollman
 
PA
.
Molecular dynamics studies of the HIV-1 TAR and its complex with argininamide
.
Nucleic Acids Res
 
2000
;
28
:
4944
55
.

45.

Roy
 
S
,
Delling
 
U
,
Chen
 
CH
, et al.  
A bulge structure in HIV-1 TAR RNA is required for Tat binding and Tat-mediated trans-activation
.
Genes Dev
 
1990
;
4
:
1365
73
.

46.

Stelzer
 
AC
,
Frank
 
AT
,
Kratz
 
JD
, et al.  
Discovery of selective bioactive small molecules by targeting an RNA dynamic ensemble
.
Nat Chem Biol
 
2011
;
7
:
553
9
.

47.

Ronsard
 
L
,
Rai
 
T
,
Rai
 
D
, et al.  
In silico analyses of subtype specific HIV-1 Tat-TAR RNA interaction reveals the structural determinants for viral activity
.
Front Microbiol
 
2017
;
8
.

48.

Puglisi
 
JD
,
Tan
 
R
,
Calnan
 
BJ
, et al.  
Conformation of the TAR RNA-arginine complex by NMR spectroscopy
.
Science
 
1992
;
257
:
76
80
.

49.

Qi
 
L
,
Zhang
 
J
,
He
 
T
, et al.  
Probing interaction of a fluorescent ligand with HIV TAR RNA
.
Spectrochim Acta A Mol Biomol Spectrosc
 
2017
;
173
:
93
8
.

50.

Li
 
J
,
Wang
 
M
,
Fan
 
Y-Y
, et al.  
Ligand-RNA interaction assay based on size-selective fluorescence core-shell nanocomposite
.
Anal Bioanal Chem
 
2020
;
412
:
7349
56
.

51.

Ribeiro
 
MT
,
Singh
 
S
,
Guestrin
 
C
.
‘Why should I trust you?’: Explaining the predictions of any classifier
.
2016
.

52.

Gaulton
 
A
,
Hersey
 
A
,
Nowotka
 
M
, et al.  
The ChEMBL database in 2017
.
Nucleic Acids Res
 
2017
;
45
:
D945
54
.

53.

Zhao
 
Z
,
Bourne
 
PE
.
Structural insights into the binding modes of viral RNA-dependent RNA polymerases using a function-site interaction fingerprint method for RNA virus drug discovery
.
J Proteome Res
 
2020
;
19
:
4698
705
.

54.

Mittal
 
L
,
Kumari
 
A
,
Srivastava
 
M
, et al.  
Identification of potential molecules against COVID-19 main protease through structure-guided virtual screening approach
.
J Biomol Struct Dyn
 
2021
;
39
:
3662
80
.

55.

Baell
 
JB
,
Holloway
 
GA
.
New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays
.
J Med Chem
 
2010
;
53
:
2719
40
.

56.

Réau
 
M
,
Langenfeld
 
F
,
Zagury
 
J-F
, et al.  
Decoys selection in benchmarking datasets: overview and perspectives
.
Front Pharmacol
 
2018
;
9
:11.

57.

Evenseth
 
LM
,
Warszycki
 
D
,
Bojarski
 
AJ
, et al.  
In silico methods for the discovery of orthosteric GABAB receptor compounds
.
Molecules
 
2019
;
24
:
935
.

58.

Zhou
 
D
,
Liu
 
F
,
Zheng
 
Y
, et al.  
Deffini: a family-specific deep neural network model for structure-based virtual screening
.
Comput Biol Med
 
2022
;
151
:
106323
.

59.

Berthold
 
MR
,
Cebron
 
N
,
Dill
 
F
, et al.  
KNIME: the Konstanz information miner
.
Data Anal Mach Learn Appl
 
2008
;
319
26
.

60.

Pettersen
 
EF
,
Goddard
 
TD
,
Huang
 
CC
, et al.  
UCSF chimera–a visualization system for exploratory research and analysis
.
J Comput Chem
 
2004
;
25
:
1605
12
.

61.

Auslander
 
N
,
Gussow
 
AB
,
Koonin
 
EV
.
Incorporating machine learning into established bioinformatics frameworks
.
Int J Mol Sci
 
2021
;
22
:
2903
.

62.

Shwartz-Ziv
 
R
,
Armon
 
A
.
Tabular data: deep learning is not all you need
.
Inf Fusion
 
2022
;
81
:
84
90
.

63.

Li
 
H
,
Tian
 
S
,
Li
 
Y
, et al.  
Modern deep learning in bioinformatics
.
J Mol Cell Biol
 
2020
;
12
:
823
7
.

64.

Larrañaga
 
P
,
Calvo
 
B
,
Santana
 
R
, et al.  
Machine learning in bioinformatics
.
Brief Bioinform
 
2006
;
7
:
86
112
.

65.

Eibe
 
F
,
Hall
 
MA
,
Witten
 
IH
.
The WEKA Workbench. Online Appendix for Data Mining: Practical Machine Learning Tools and Techniques
.
Morgan Kaufmann, MA, USA
,
2016
.

66.

LeDell
 
E
,
Poirier
 
S
. H2O AutoML: Scalable automatic machine learning. In:
7th ICML Workshop on Automated Machine Learning, AutoML
 
2020
.

67.

Pedregosa
 
F
,
Varoquaux
 
G
,
Gramfort
 
A
.
Scikit-learn: machine learning in python
.
J Mach Learn Res
 
2011
;
12
:
2825
30
.

68.

Akiba
 
T
,
Sano
 
S
,
Yanase
 
T
, et al.  Optuna: a Next-generation hyperparameter optimization framework. In:
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
, Association for Computing Machinery, New York, NY, USA,
2019
. p.
2623
31

69.

Dorogush
 
AV
,
Ershov
 
V
,
Gulin
 
A
.
CatBoost: gradient boosting with categorical features support
.
ArXiv Prepr. ArXiv181011363
.
2018
.

70.

Lundberg
 
SM
,
Erion
 
G
,
Chen
 
H
, et al.  
From local explanations to global understanding with explainable AI for trees
.
Nat Mach Intell
 
2020
;
2
:
56
67
.

71.

Waskom
 
ML
.
seaborn: statistical data visualization
.
J Open Source Softw
 
2021
;
6
:
3021
.

72.

Virtanen
 
P
,
Gommers
 
R
,
Oliphant
 
TE
, et al.  
SciPy 1.0: fundamental algorithms for scientific computing in python
.
Nat Methods
 
2020
;
17
:
261
72
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights)

Supplementary data