Advancing microRNA target site prediction with transformer and base-pairing patterns

Introduction

MicroRNAs (miRNAs) are short non-coding RNAs, approximately 23 nucleotides long, critical in regulating gene expression at the post-transcriptional level (1,2). As part of the miRNA-induced silencing complex (miRISC), miRNAs associate with Argonaute (AGO) proteins, which facilitates their binding to target mRNAs and leads to destabilization or translational repression (3). The specificity of miRNA targeting is primarily attributed to consecutive Watson–Crick (WC) base pairing between the miRNA seed region (positions 2–8) and complementary sites within the 3′ untranslated regions (3′UTRs) of target mRNAs (4,5). Seed pairing defines the well-known ‘canonical sites’, including 8mer, 7mer-m8, 7mer-A1 and 6mer, characterized by perfect WC pairings (6). However, recent advances, particularly in high-throughput sequencing techniques like CLIP-seq, suggest that perfect seed matches are neither necessary nor sufficient for functional miRNA targeting (7–9). It has been observed that imperfect base pairing also leads to a decrease in mRNA levels (10,11). These instances are typically referred to as ‘non-canonical sites’, as depicted in Figure 1A. For instance, functional interactions can tolerate a limited number of wobble pairs, bulges and mismatches within the seed region (12,13). ‘Centered sites’ are a class of sites with 11–12 contiguous base pairs that start at the third or fourth nucleotide and extend to the central region of the miRNA (14). In addition, a wobble or mismatch in the seed can be compensated by pairing with the 3′ region (positions 13–16) (15,16). Perfect seed pairing may also be enhanced by complementary interactions within the 3′ region (17). Despite the discovery of these base-pairing patterns, fully unraveling the functional mechanism of miRNAs remains a challenge.
One of the critical issues is to decipher the complexity of the miRNA regulatory network because one miRNA can target multiple different mRNAs and one mRNA can be regulated by many different miRNAs (18). Therefore, determining which miRNAs interact with which mRNAs, i.e. identifying miRNA targets, continues to be an important focus of current research.

Figure 1.

(A) Figure depicts several examples of both canonical and non-canonical sites in miRNA targeting. (B) Figure presents the framework of Mimosa, a dual-input model that specifically integrates base-pairing patterns into the model inputs. (C) Figure outlines the gene-level prediction process for miRNA–mRNA pairs, highlighting that our model is trained on site-level samples (miRNA–MBS pairs). (D) Figure illustrates the traceback process in our alignment, as well as the detailed base-pairing pattern detected.

Over the past two decades, a number of computational approaches have been developed for miRNA target identification, driven by their cost-effectiveness and easy accessibility (19). These approaches predominantly rely on manual feature engineering for decision-making, focusing on gene-level predictions and incorporating site-level features such as seed matching (20–22). Typically, they support a limited array of seed match types, including 8mer, 7mer-m8, 7mer-A1, 6mer and offset 6mer, all considered canonical sites. In contrast, deep learning approaches have shown a remarkable ability to automatically discern intricate data patterns compared to those reliant on feature engineering (23–28). For instance, Lee et al. introduced deepTarget, a model utilizing auto-encoders and stacked recurrent neural networks to analyze sequence interactions (29). Similarly, Pla et al. developed miRAW, a deep learning model with eight hidden layers designed for learning features and assessing functionality, with a particular focus on non-canonical sites (30). This model uses a pre-selection step for candidate target sites (CTSs) to widen the range of potential miRNA targets. Subsequently, TargetNet expanded the selection criteria for CTS and leveraged the ResNet architecture to enhance the predictive performance (31). Despite the challenges in characterizing non-canonical sites due to their variable base-pairing and conservation, they are recognized for their moderate role in gene suppression, potentially due to a reduced number of AGO–miRNA complexes or weaker binding affinity (32–34). The interplay between canonical and non-canonical sites may offer a refined control over gene suppression mechanisms (35). Therefore, the emphasis on the identification of non-canonical sites, as seen with miRAW and TargetNet, is crucial for advancing predictive models. 
As an additional selection step prior to model prediction, the CTS selection captures all canonical sites while potentially ignoring non-canonical sites that lack predefined criteria but possess significant interaction potential. This underscores the need to enhance the predictive capability to identify more non-canonical sites without predefined criteria.

In this study, we proposed Mimosa, a deep learning framework for the identification of miRNA targets that shifts away from traditional CTS selection methods. The core innovation of Mimosa lies in its capacity to autonomously identify non-canonical binding sites by incorporating base-pairing patterns directly into the model’s training. This is achieved using a dynamic programming algorithm with a scoring strategy based on pairing stability, enabling the identification of optimal local alignments and the creation of base-pairing embeddings for input into the model. Such integration enables Mimosa to autonomously recognize and classify various types of non-canonical sites without manual pre-specification of pairing configurations. The foundation of Mimosa is the Transformer architecture, a robust deep learning framework renowned for its efficacy in addressing long-range dependencies in natural language processing (NLP) tasks (36). The Transformer’s ability to assimilate global information through its self-attention mechanism has not only led to remarkable successes in NLP but has also inspired a plethora of bioinformatics tools, thanks to the conceptual parallels between biological sequences and linguistic constructs (37). Mimosa harnesses this architecture to extract and integrate sequence context, positional information and base-pairing interactions, thus functioning as a standalone and independent tool that does not rely on any third-party software. This design ensures streamlined, end-to-end predictions and optimizes the user journey. To further this goal, we have developed a user-friendly web server for Mimosa, designed to provide easier access to advanced miRNA target identification.

Materials and methods

The benchmark dataset

This study employed a benchmark dataset compiled from experimentally validated miRNA–mRNA functional interactions (30), obtained from Diana TarBase (6) and MirTarBase (38). The dataset initially consists of 303912 positive and 1096 negative gene-level interaction entries for Homo sapiens. Site-specific binding data from PAR-Clip (39) and CLASH (40) experiments were integrated to enhance the dataset with precise interaction locations. In addition, thermodynamic properties, including minimum free energy (MFE) and species conservation, were incorporated to refine the dataset. Consequently, the benchmark is structured into a site-level training set comprising 58793 pairs of a miRNA and a miRNA binding site (MBS). Each MBS consists of a 30-nt core pairing region flanked by an additional five nucleotides on either side, giving 40 nt in total. Moreover, ten gene-level test sets were established, each containing 548 positive and 548 negative miRNA–mRNA pairs. It is worth noting that ‘mRNA’ specifically refers to 3′UTRs, the primary region for miRNA binding. Although 5′UTRs and open reading frame (ORF) regions may facilitate some interactions, they are not the focus of the model’s development. Notably, this study adopted TargetNet’s methodology to divide the training set into a training subset, comprising 26 995 positives and 27 469 negatives, and a validation subset for parameter optimization, containing 2193 positives and 2136 negatives. This approach ensures consistent and comparative analysis of the model’s effectiveness.

The overall framework of Mimosa

To provide a clear understanding of Mimosa’s framework, we focus on a single miRNA–MBS pair, as depicted in Figure 1B. Mimosa processes the two members of such a pair separately, requiring distinct inputs for the miRNA and the MBS. The generation of these inputs begins with the tokenization of sequences, assigning tokens for A (adenine), C (cytosine), G (guanine), U (uracil) and an additional padding token X. Padding standardizes the lengths of the miRNA and MBS inputs to 30 and 40 tokens, respectively. Subsequently, the inputs are refined through the integration of three types of embeddings: context embedding (|${E}_{{\rm sequence}}$|), which captures contextual relevance through a high-dimensional embedding transformation; positional embedding (|${E}_{{\rm position}}$|), initialized from a normal distribution to represent the positions of nucleotides; and base-pairing embedding (|${E}_{{\rm pairing}}$|), calculated from a sequence alignment procedure to encode the base-pairing pattern. The inputs for miRNA and MBS are then formed by summing these embeddings, as follows:

$$\begin{eqnarray*}{I}^{mi} = E_{sequence}^{mi} + E_{position}^{mi} + E_{pairing}^{mi}\end{eqnarray*}$$

$$\begin{eqnarray*}{I}^{MBS} = E_{sequence}^{MBS} + E_{position}^{MBS} + E_{pairing}^{MBS}\end{eqnarray*}$$
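
The input construction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Mimosa implementation: the token indices in `VOCAB`, right-side padding and the conversion of T to U are assumptions, and `tokenize` and `sum_embeddings` are hypothetical helper names. Only the token alphabet (A, C, G, U, X), the padded lengths (30 and 40) and the element-wise sum of the three embeddings come from the text.

```python
# Hypothetical vocabulary: the index assigned to each token is an assumption.
VOCAB = {"X": 0, "A": 1, "C": 2, "G": 3, "U": 4}
MIRNA_LEN = 30  # padded miRNA length stated in the text
MBS_LEN = 40    # padded MBS length stated in the text


def tokenize(seq, length):
    """Map an RNA sequence to token ids, right-padding with X up to `length`."""
    seq = seq.upper().replace("T", "U")  # assumption: DNA-style input is tolerated
    if len(seq) > length:
        raise ValueError(f"sequence is longer than {length} nt")
    padded = seq + "X" * (length - len(seq))
    return [VOCAB[base] for base in padded]


def sum_embeddings(e_sequence, e_position, e_pairing):
    """Element-wise sum of three (length x dim) embeddings: I = E_seq + E_pos + E_pair."""
    return [
        [s + p + b for s, p, b in zip(row_s, row_p, row_b)]
        for row_s, row_p, row_b in zip(e_sequence, e_position, e_pairing)
    ]


if __name__ == "__main__":
    ids = tokenize("UGAGGUAGUAGGUUGUAUAGUU", MIRNA_LEN)  # a 22-nt example miRNA
    print(len(ids), ids[:4], ids[-3:])
```

Because the three embeddings are summed rather than concatenated, they must share one dimension, which is consistent with the single embedding size reported for all three in the training settings.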

Mimosa’s model employs two separate Transformer encoder modules to process inputs and extract critical attributes from miRNA–MBS pairs. Each encoder comprises multiple blocks, incorporating a self-attention mechanism and a feed-forward network for in-depth sequence analysis. Subsequently, a cross-attention layer integrates miRNA features to identify essential elements within the MBS. The results from this layer are then passed through linear layers, culminating in the final predictions for each miRNA–MBS pair. During testing, Mimosa accommodates miRNA–mRNA pairs with mRNA lengths ranging from 6-nt to 32870-nt. As illustrated in Figure 1C, the model addresses this variability by employing a sliding window technique that splits longer mRNAs into 40-nt MBS segments. For mRNAs shorter than 40-nt, padding is applied. This systematic use of the sliding window and padding ensures consistent input lengths, which are crucial for transitioning from site-level to gene-level predictions. If at least one MBS segment is predicted to form a functional pair with the miRNA, the entire mRNA is considered capable of functional interaction.
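
The transition from site-level to gene-level prediction can be illustrated with the following sketch. It is an assumption-laden illustration rather than the released code: the function names are hypothetical, the treatment of the final partial window (anchoring one extra window at the 3′ end) is an assumption, and the default step of 5 and the 0.5 probability cut-off are taken from the Results section.

```python
def sliding_windows(mrna, window=40, step=5, pad="X"):
    """Split an mRNA (3'UTR) sequence into fixed-length 40-nt segments.

    Sequences shorter than `window` are padded. For longer sequences, an
    extra window anchored at the 3' end is added when the regular grid
    would leave trailing nucleotides uncovered (an assumption).
    """
    if len(mrna) <= window:
        return [mrna + pad * (window - len(mrna))]
    last_start = len(mrna) - window
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [mrna[s:s + window] for s in starts]


def gene_level_call(site_probs, threshold=0.5):
    """An mRNA is called functional if at least one segment is predicted functional."""
    return any(p > threshold for p in site_probs)


if __name__ == "__main__":
    segments = sliding_windows("ACGU" * 25)  # a 100-nt toy sequence
    print(len(segments), len(segments[0]))
    print(gene_level_call([0.12, 0.08, 0.71]))
```

A smaller step produces more overlapping segments and therefore more model evaluations per mRNA, which is the computational trade-off behind the choice of a default step size.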

Sequence alignment

To tackle the intricate non-canonical base-pairing patterns, our model incorporates these patterns directly into its training process. This enables the model to discern and derive rules from a diverse array of base-pairing patterns. Our approach employs the Smith-Waterman algorithm, a dynamic programming method renowned for optimal local alignment, to perform sequence alignment and extract base-pairing patterns for miRNA–MBS pairs. Widely employed in biological sequence similarity detection (41,42), this algorithm assigns similarity scores between nucleotides of two sequences. In our study, however, we have adapted the algorithm to assign scores based on the stability of base-pairing. We classify base pairings into three categories: WC pairings (A-U and C-G), which are the most thermodynamically stable, are assigned a score of 1; wobble pairings (G-U), though atypical, hold functional significance and receive a score of 0; non-binding states such as gaps and mismatches are scored as -1 to indicate their instability. This alignment process involves initializing a scoring matrix and populating it based on our predefined scoring scheme (Supplementary Algorithm S1). A systematic traceback process then allows for the determination of the optimal pairing pattern, as depicted in Figure 1D. Our alignment specifically targets the extended seed region, encompassing miRNA positions 1–10 in the 5′-to-3′ direction and MBS positions 1–10 in the 3′-to-5′ direction, a critical region highlighted in previous studies (30,31). This customized sequence alignment rapidly and precisely identifies base-pairing patterns, enriching the sequence characteristics and enhancing the model’s learning capability.
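
The adapted alignment can be sketched as a Smith-Waterman recursion in which the substitution score reflects pairing stability instead of sequence identity. The sketch below is not Supplementary Algorithm S1: the function names, the tie-breaking order in the traceback and the output format are assumptions. The scores (WC = 1, wobble = 0, mismatch or gap = -1) follow the text, and the target is assumed to be supplied already written in the 3′-to-5′ direction.

```python
WC_PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
WOBBLE_PAIRS = {("G", "U"), ("U", "G")}
GAP = -1  # gaps are scored like mismatches


def pair_score(a, b):
    """Stability-based score: WC pair 1, G-U wobble 0, anything else -1."""
    if (a, b) in WC_PAIRS:
        return 1
    if (a, b) in WOBBLE_PAIRS:
        return 0
    return -1


def local_pairing_alignment(mirna, target_3to5):
    """Return (score, miRNA row, pairing row, target row) of the best local alignment.

    In the pairing row, '|' marks a WC pair, ':' a wobble pair and ' ' a
    mismatch or gap.
    """
    n, m = len(mirna), len(target_3to5)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = H[i - 1][j - 1] + pair_score(mirna[i - 1], target_3to5[j - 1])
            H[i][j] = max(0, diag, H[i - 1][j] + GAP, H[i][j - 1] + GAP)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Traceback from the highest-scoring cell until the score drops to zero.
    i, j = best_pos
    top, mid, bottom = [], [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = pair_score(mirna[i - 1], target_3to5[j - 1])
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(mirna[i - 1])
            bottom.append(target_3to5[j - 1])
            mid.append("|" if s == 1 else ":" if s == 0 else " ")
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + GAP:
            top.append(mirna[i - 1])
            bottom.append("-")
            mid.append(" ")
            i -= 1
        else:
            top.append("-")
            bottom.append(target_3to5[j - 1])
            mid.append(" ")
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(mid)), "".join(reversed(bottom))


if __name__ == "__main__":
    for row in local_pairing_alignment("UGAGGUAGUA", "ACUCCAUCAU")[1:]:
        print(row)
```

Because a wobble pair contributes 0 rather than a penalty, it can be carried inside an alignment without lowering the score, whereas mismatches and gaps must be offset by neighbouring WC pairs.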

The network architecture

Transformer encoder

To effectively process sequence inputs, Mimosa integrates the Transformer encoder module into its architecture. The encoder consists of multiple identical blocks, each featuring a multi-head self-attention sub-layer as its core component, crucial for highlighting inter-token relationships. This sub-layer operates by first deriving three key matrices from the input |$X$|⁠: |$Q$| (Query), |$K$| (Key) and |$V$| (Value), as denoted by |$Q = X{W}^Q$|⁠, |$K = X{W}^K$| and |$V = X{W}^V$|⁠, where |${W}^Q$|⁠, |${W}^K$| and |${W}^V$| are learnable parameters. The attention scores for the tokens in |$X$| are calculated using the formula |$Attention( {Q,K,V} ) = softmax( {\frac{{Q{K}^T}}{{\sqrt {{d}_k} }}} )V$|⁠, where |${d}_k$| denotes the dimension of |$K$|⁠. The term ‘multi-head’ refers to the execution of multiple such attention operations in parallel, with the fusion of these results presented in the following equation:

$$\begin{eqnarray*}&MultiHeadAttention\left( {Q,K,V} \right) = Concat\left( {{head}_1,\ldots ,{head}_h} \right){W}^o\nonumber\\ &where\ {head}_i = Attention\left( {{Q}_i,{K}_i,{V}_i} \right)\end{eqnarray*}$$

In addition to the multi-head self-attention mechanism, each block in the encoder incorporates a feed-forward network. This network consists of two linear transformations with rectified linear unit (ReLU) activation, defined as:

$$\begin{eqnarray*}FeedForward\left( X \right) = \max \left( {0,X{W}_1 + {b}_1} \right){W}_2 + {b}_2\end{eqnarray*}$$

where |${W}_1$|⁠, |${W}_2$|⁠, |${b}_1$| and |${b}_2$| are learnable parameters. Following both the multi-head self-attention and the feed-forward network, a combination of residual connections and layer normalization is applied, alleviating the vanishing gradient problem and accelerating convergence. This implementation can be illustrated as follows:

$$\begin{eqnarray*}LayerNorm\left( {X + MultiHeadAttention\left( X \right)} \right)\end{eqnarray*}$$

$$\begin{eqnarray*}LayerNorm\left( {X + FeedForward\left( X \right)} \right)\end{eqnarray*}$$
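
For a single token vector, the feed-forward sub-layer and the residual connection followed by layer normalization can be written out as below. This is a dependency-free sketch for illustration only: the helper names are hypothetical, the layer normalization omits the learnable gain and bias that practical implementations usually include, and the toy weights are arbitrary.

```python
import math


def layer_norm(row, eps=1e-5):
    """Normalize one token vector to zero mean and unit variance (no learned gain or bias)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def affine(x, W, b):
    """Compute xW + b for a row vector x and a matrix W given as a list of rows."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def feed_forward(x, W1, b1, W2, b2):
    """FeedForward(X) = max(0, X W1 + b1) W2 + b2."""
    hidden = [max(0.0, h) for h in affine(x, W1, b1)]
    return affine(hidden, W2, b2)


def add_and_norm(x, sublayer_out):
    """LayerNorm(X + Sublayer(X)) for one token vector."""
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [0.0, 0.0]
    x = [1.0, -2.0]
    print(add_and_norm(x, feed_forward(x, identity, zeros, identity, zeros)))
```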

Cross-attention layer

The cross-attention in our model has been utilized to integrate outputs from two separate encoders. Fundamentally different from the self-attention described above, which calculates attention scores by using one data set for both the query and the key, cross-attention uniquely assigns scores by using one data set as the query and referencing a different data set as the key. This method effectively draws connections between different data sources, thereby enhancing the overall performance of the model. The specifics of this operation are defined as follows:

$$\begin{eqnarray*}Q = X \times {W}^Q\end{eqnarray*}$$

$$\begin{eqnarray*}K^{\prime} = X^{\prime} \times {W}^K\end{eqnarray*}$$

$$\begin{eqnarray*}V^{\prime} = X^{\prime} \times {W}^V\end{eqnarray*}$$

$$\begin{eqnarray*}CrossAttention\left( {Q,K^{\prime},V^{\prime}} \right) = softmax\left( {\frac{{Q{{K^{\prime}}}^T}}{{\sqrt {{d}_{k^{\prime}}} }}} \right)V^{\prime}\end{eqnarray*}$$

Here, |$X$| denotes the query input and |$X^{\prime}$| represents reference input, with the cross-attention output dimensionally consistent with |$X$|⁠.
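
The cross-attention equations can be traced with a small single-head example in plain Python. The sketch makes no claim about which encoder output serves as the query in Mimosa; `X` and `X_ref` simply mirror the symbols above, the helper names are hypothetical and the toy matrices are arbitrary. Passing the same matrix as `X` and `X_ref` reduces the computation to single-head self-attention.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(X, X_ref, W_q, W_k, W_v):
    """softmax(Q K'^T / sqrt(d_k')) V' with Q from X and K', V' from X_ref."""
    Q = matmul(X, W_q)
    K = matmul(X_ref, W_k)
    V = matmul(X_ref, W_v)
    d_k = len(K[0])
    K_transposed = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_transposed)  # one row per query token
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V)  # same number of rows as X


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    X = [[1.0, 0.0], [0.0, 1.0]]                  # two query tokens
    X_ref = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three reference tokens
    out = cross_attention(X, X_ref, identity, identity, identity)
    print(len(out), len(out[0]))
```

The output has one row per query token regardless of how many reference tokens are attended to, which is the sense in which it is dimensionally consistent with the query input.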

Training and evaluation strategies

In this study, we manually fine-tuned Mimosa’s hyperparameters to enhance its performance. The embedding dimensions for |${E}_{{\rm sequence}}$|, |${E}_{{\rm position}}$| and |${E}_{{\rm pairing}}$| were set to 64. The encoder was structured with 16 blocks, each featuring 8-head self-attention. A similar design was adopted for the cross-attention layer, which consists of 16 blocks with 8 heads each. The training parameters include a learning rate of 1e-4, a batch size of 256, and the Adam optimization algorithm with 1e-5 weight decay. To avoid overfitting, a dropout rate of 0.1 was integrated into each encoder module. For loss calculation, we applied binary cross-entropy, represented by the formula:

$$\begin{eqnarray*}Loss = - \left( {{y}_F \cdot log\left( {{p}_F} \right) + {y}_{NF} \cdot log\left( {{p}_{NF}} \right)} \right)\end{eqnarray*}$$

where |${y}_F$| and |${y}_{NF}$| represent the truth labels for functional and non-functional, respectively, and |${p}_F$| and |${p}_{NF}$| are their prediction probabilities. The model underwent 40 epochs of training, with the version exhibiting the minimum loss selected as the final model.
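
For a single sample, the loss can be computed as in the sketch below. It assumes the usual binary setting in which the non-functional label and probability are the complements of the functional ones (y_NF = 1 - y_F and p_NF = 1 - p_F); this relation is not stated explicitly above. The clipping constant `eps` and the function name are likewise illustrative.

```python
import math


def binary_cross_entropy(y_f, p_f, eps=1e-12):
    """Loss = -(y_F * log(p_F) + y_NF * log(p_NF)) for one sample.

    Assumes y_NF = 1 - y_F and p_NF = 1 - p_F. Probabilities are clipped
    to avoid log(0).
    """
    p_f = min(max(p_f, eps), 1.0 - eps)
    y_nf, p_nf = 1 - y_f, 1.0 - p_f
    return -(y_f * math.log(p_f) + y_nf * math.log(p_nf))


if __name__ == "__main__":
    print(binary_cross_entropy(1, 0.9))  # confident and correct: small loss
    print(binary_cross_entropy(1, 0.1))  # confident and wrong: large loss
```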

We evaluated the model using six commonly employed metrics, derived from the counts of true positives (TP), true negatives (TN), false negatives (FN) and false positives (FP). These metrics include accuracy, F1 score, recall, specificity, positive predictive value (PPV) and negative predictive value (NPV). The definitions of these metrics are as follows: accuracy is calculated as |$( {TP + TN} )/( {TP + FP + TN + FN} )$|, F1 score as |$2TP/( {2TP + FP + FN} )$|, recall as |$TP/( {TP + FN} )$|, specificity as |$TN/( {TN + FP} )$|, PPV as |$TP/( {TP + FP} )$| and NPV as |$TN/( {FN + TN} )$|.
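
The six definitions translate directly into code. The function below is a plain transcription of the formulas; the confusion-matrix counts in the usage example are hypothetical and chosen only to match the size of one test set (548 positives and 548 negatives). No guard against empty denominators is included.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the six evaluation metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=450, tn=400, fp=148, fn=98).items():
        print(f"{name}: {value:.4f}")
```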

Results and discussion

In-depth analysis of model architecture

This section delves into the architecture of Mimosa, highlighting its pivotal components and strategies across four domains: the integration of base-pairing patterns, the formulation of a base-pairing detection strategy, the selection of input processing feature extractors and the amalgamation of these extracted features. Initially, our approach of solely utilizing conventional contextual and positional encoding for input data did not facilitate model convergence, a challenge we attributed to the small dataset size and the brevity of the sequences involved. The incorporation of base-pairing patterns notably enhanced model performance, leading to a 16.22% increase in accuracy and a 9.65% improvement in F1 score (refer to Figure 2A). We explored three distinct scoring strategies for detecting base-pairings: (i) assigning a score of 1 to both WC and wobble pairings, while treating gaps and mismatches with a score of -1; (ii) scoring WC and wobble pairing as 1, with gaps and mismatches scored as 0; (iii) attributing a score of 1 to WC pairings, 0 to wobble pairings, and -1 to gaps and mismatches. The scoring system that most accurately reflected interaction stability, strategy (iii), delivered substantial performance enhancements, improving accuracy by a minimum of 5.89% and the F1 score by 1.8% (shown in Figure 2B). In our comparison of input processing mechanisms, we scrutinized the complete Transformer model (Transformer-complete), its encoder component (Transformer-encoder), and BERT, an advanced NLP model predicated on the Transformer encoder (43). The Transformer-encoder emerged as the optimal choice due to its superior accuracy, F1 score and NPV, as depicted in Figure 2C. During the feature integration phase, employing cross-attention markedly outperformed simple tensor concatenation, specifically demonstrating a 7% accuracy increment. This finding positioned cross-attention as a cornerstone of Mimosa’s architectural framework (illustrated in Figure 2D).

Figure 2.

Analysis of Mimosa’s model architecture components. (A) Figure illustrates the impact of different input encodings, underscoring the critical role of integrating base-pairing patterns for enhanced model performance. (B) Figure analyzes three distinct scoring strategies for base-pairing patterns, categorizing them into Watson–Crick (WC) pairings and wobble (Wo) pairings, to determine the most effective approach. (C) Figure evaluates three different structures for processing inputs, with the Transformer-encoder emerging as the superior choice for accuracy and performance. (D) Figure explores various techniques for merging outputs from the miRNA and MBS Transformer-encoders, demonstrating the effectiveness of cross-attention mechanisms. (E) Figure describes an ablation study aimed at evaluating the comparative advantages of dual-input versus single-input models in terms of predictive accuracy and generalization.

To ascertain whether the integration of dual inputs is imperative for our model, and to evaluate the feasibility of transitioning to a single-input system, we conducted an ablation study. This study entailed retraining our model separately with only miRNA inputs and then only MBS inputs, to compare the individual performance against the dual-input setup. Results from this investigation indicated that the model trained solely with MBS inputs yielded superior performance relative to the dual-input structure during the training phase, as evidenced by the results in Supplementary Table S1. However, this superior performance was not sustained into the testing phase, a discrepancy highlighted in Figure 2E. This discrepancy is attributed to potential overfitting during the training phase, which seemed to compromise the model’s generalization ability to unseen data. Conversely, the dual-input model displayed steady and uniform predictive performance across both the validation and test datasets. This consistency underscores the model’s robust generalization capability and affirms the architectural superiority of the dual-input model.

Performance evaluation of Mimosa

Given that 99.38% of mRNA sequences in our test sets exceeded 40 nucleotides, the selection of an appropriate step size for the sliding window was critical. The determination of Mimosa’s default step size was based on a thorough analysis of its impact on performance. Our experiments involved exploring step sizes ranging from 1 to 40 across all test sets. As illustrated in Figure 3A, accuracy fluctuated between 0.7354 and 0.7865, and the F1 score ranged from 0.7633 to 0.8109, indicating a marginal difference of approximately 5% for both metrics. We observed a slight positive trend in both accuracy and F1 score with step sizes smaller than 6, transitioning to a more irregular pattern at larger step sizes. Although the precise reasons for this variability remain unclear, it is evident that increasing step size results in less stable model performance. To facilitate comparison and optimize computation efficiency, all comparative experiments in this study utilized a step size of 5, unless otherwise specified.

Figure 3.

Evaluation of Mimosa’s predictive performance. (A) Analysis of step-size variation: this min-to-max boxplot illustrates the fluctuations in accuracy and F1 score across ten test sets, highlighting the impact of step size on the sliding window technique. (B) Model comparison: comparison of Mimosa’s metrics with other deep learning models. (C) CTS selection analysis: evaluation of the effect of different CTS criteria on Mimosa’s accuracy and F1 score.

To assess the effectiveness of Mimosa, we conducted a comparative analysis against state-of-the-art deep learning approaches, including deepTarget, miRAW and TargetNet. Using the data collected from a previous study (31), we examined the average performance of each model on ten test sets. As depicted in Figure 3B, deepTarget displayed significant variability in its metrics, exhibiting the highest specificity but the lowest recall. Conversely, miRAW demonstrated the least variation in metrics, albeit with lower accuracy and F1 score compared to TargetNet. Mimosa surpassed TargetNet in most metrics, although with a slightly weaker recall. Notably, Mimosa achieved the highest NPV, F1 score and accuracy among the four models, highlighting its superior predictive capabilities. To ensure equitable comparison with TargetNet, Mimosa’s performance was also assessed using a step size of 1, consistent with TargetNet’s methodology (Supplementary Table S2). Despite a minor decrease in F1 score and accuracy, Mimosa consistently outperformed TargetNet across all assessed metrics.

It is reasonable to deduce that Mimosa’s intricate base-pairing analysis obviates the necessity for pre-selecting CTS prior to prediction. To verify this assertion, we investigated the impact of employing CTS selection on performance. Specifically, we applied three CTS selection criteria outlined in (24) and compared them against the scenario without CTS selection. These criteria included (i) miRAW-6–1:10, necessitating a minimum of 6 base pairs within positions 1–10; (ii) miRAW-7–1:10, requiring at least 7 base pairs within positions 1–10; and (iii) miRAW-7–2:10, mandating a minimum of 7 base pairs within positions 2–10, encompassing both WC and wobble pairings. Comparative results presented in Figure 3C illustrate that the more lenient selection criteria affect accuracy and F1 score only marginally, whereas stricter criteria diminish these metrics more noticeably relative to the scenario without pre-selection. This decline may stem from inadvertently filtering out viable miRNA targets during the pre-selection process. In summary, Mimosa does not depend on CTS selection, and applying such criteria offers no performance benefit.

Effect of binding free energy

The binding free energy (|${\rm{\Delta }}G$|) between a miRNA and its target site is a crucial thermodynamic parameter used to assess the stability of double-stranded molecular structures, where lower |${\rm{\Delta }}G$| values indicate more stable binding. |${\rm{\Delta }}G$| has been widely employed as a key filtering criterion in various established approaches, typically with a threshold between -17 and -12 kcal/mol (19). To explore the influence of |${\rm{\Delta }}G$| filtering on the Mimosa model, we embarked on a thorough re-assessment. This re-assessment extended our functional determination criteria to require not only a predictive probability exceeding 0.5 but also a |${\rm{\Delta }}G$| value below a specified threshold. This threshold was tested across a range from -20 to -1 kcal/mol. |${\rm{\Delta }}G$| calculations were performed using the RNAduplex software (44). Figure 4A illustrates how accuracy and F1 score vary with the |${\rm{\Delta }}G$| threshold, noting that a threshold of 0 corresponds to Mimosa’s baseline prediction without |${\rm{\Delta }}G$| filtering. We observed a significant decline in both accuracy and F1 score as the |${\rm{\Delta }}G$| threshold became more negative, suggesting that a more stringent criterion reduces the identification of functional targets. Additionally, we found that |${\rm{\Delta }}G$| thresholds ranging between -10 and 0 had minimal impact on our test results, possibly due to our evaluation strategy relying on identifying at least one site-level target to define the function of the entire sequence. To further investigate the potential impact of such changes in site-level targets, we randomly selected six miRNA–mRNA pairs and analyzed the variation in the number of site-level targets with |${\rm{\Delta }}G$|, as shown in Figure 4B.
It became evident that |${\rm{\Delta }}G$| filtering reduced the number of predicted site-level targets, thereby decreasing the potentially high false-positive rates associated with sequence-based predictions. We also found that |${\rm{\Delta }}G$| filtering had no effect on Mimosa’s negative predictions. For instance, for ENSG00000168172, which is 12041 nucleotides long, Mimosa did not identify any target sites for hsa-miR-941. Since no sites were identified initially, applying |${\rm{\Delta }}G$| filtering changed nothing, as the number of target sites remained zero. Overall, although our Mimosa model excludes |${\rm{\Delta }}G$| filtering, combining predictions from multiple perspectives still holds promise.
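
The combined decision rule used in this re-assessment can be sketched as follows. The sketch assumes that each 40-nt segment already carries a predicted probability and a pre-computed binding free energy (obtained with RNAduplex in this study; no call to that software is shown). The function name and all numbers in the example are hypothetical.

```python
def filter_sites_by_dg(sites, dg_threshold, prob_threshold=0.5):
    """Keep sites with probability above 0.5 and binding free energy below the threshold.

    `sites` is an iterable of (probability, delta_g_kcal_per_mol) tuples,
    one per segment. Returns (gene_level_call, number_of_retained_sites).
    """
    kept = [(p, dg) for p, dg in sites if p > prob_threshold and dg < dg_threshold]
    return len(kept) > 0, len(kept)


if __name__ == "__main__":
    # Hypothetical site-level outputs for one miRNA-mRNA pair.
    sites = [(0.91, -15.2), (0.83, -6.0), (0.30, -20.0)]
    for threshold in (0, -10, -18):
        print(threshold, filter_sites_by_dg(sites, threshold))
```

In this toy case the gene-level call is unchanged between thresholds of 0 and -10 kcal/mol even though one site is lost, which illustrates why moderate thresholds can leave gene-level metrics nearly unaffected while still reducing the number of site-level targets.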

Figure 4.

(A) Figure illustrates the impact of binding free energy (⁠|${\rm{\Delta }}G$|⁠) on accuracy and F1 score (average results of ten test sets), highlighting |${\rm{\Delta }}G$|’s influence on gene-level predictions. (B) Figure reveals the variation in the count of predicted target sites in three cases, showing |${\rm{\Delta }}G$|’s influence on site-level predictions. (C) Figure showcases Mimosa’s overall accuracy for both canonical and non-canonical sites in predictions for non-human species. (D) Figure depicts three example non-canonical sites from Mus musculus, Drosophila melanogaster and Rattus norvegicus that could be accurately identified by Mimosa.

Mimosa’s prediction across different species

Considering the evolutionary conservation of miRNA regulatory mechanisms across species, we evaluated the efficacy of Mimosa in predicting miRNA targets in non-human organisms. We curated a dataset comprising 1156 experimentally validated miRNA-target pairs from 12 cellular organisms sourced from the MirTarBase database (45). These pairs were selected based solely on the species of target gene, ensuring they were non-human. This resulted in 1131 pairs where the miRNA species and target gene species were consistent, and 25 pairs where the miRNA and target gene species were inconsistent. Our analysis revealed that Mimosa demonstrated commendable performance by accurately predicting 955 out of the 1156 pairs, resulting in an accuracy rate of 82.61%. We further categorized these targets into canonical and non-canonical categories for a more nuanced analysis. Canonical sites encompassed patterns such as 8mer, 8mer-A1, 7mer-m8, 7mer-A1, 6mer (p1-p6), 6mer and 6mer (p3-p8) (detailed in Supplementary Table S3), with pairs not conforming to these patterns classified as non-canonical. The statistical outcomes of Mimosa’s predictions for each species are presented in Table 1. Of the total 880 canonical pairs, Mimosa successfully predicted 750, achieving an accuracy of 85.23%. Despite the relative scarcity of non-canonical pairs, totaling 276, Mimosa maintained a high accuracy of 74.28%, identifying 205 of these pairs, as illustrated in Figure 4C. We infer that the slightly lower performance at non-canonical sites may be attributed to the limited representation of non-canonical patterns in the training dataset, thereby constraining the model's ability to learn a broader range of patterns. Nevertheless, Mimosa can be explored as a valuable tool for identifying non-canonical sites, as shown in Figure 4D, which highlights its applicability and reliability across multiple species.

Table 1.

Statistical summary of Mimosa's prediction results across different non-human species

Species	Total	Predicted total	Canonical total	Predicted canonical total	Non-canonical total	Predicted non-canonical total
Mus musculus	747	618	556	478	191	140
Rattus norvegicus	210	174	169	143	41	31
Drosophila melanogaster	64	45	53	38	11	7
Danio rerio	51	41	40	35	11	6
Caenorhabditis elegans	28	25	18	15	10	10
Gallus gallus	18	16	14	12	4	4
Sus scrofa	16	15	15	14	1	1
Bos taurus	12	11	10	10	2	1
Arabidopsis thaliana	3	3	2	2	1	1
Bombyx mori	3	3	NA	NA	3	3
Macaca mulatta	2	2	2	2	NA	NA
Taeniopygia guttata	2	2	1	1	1	1
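
The overall figures quoted in the text can be re-derived from the rows of Table 1. The short script below transcribes the table (with NA entered as 0, which is an assumption about its meaning) and recomputes the column totals and the three accuracy values.

```python
# Columns: total, predicted total, canonical total, predicted canonical,
# non-canonical total, predicted non-canonical. NA is entered as 0.
TABLE_1 = {
    "Mus musculus": (747, 618, 556, 478, 191, 140),
    "Rattus norvegicus": (210, 174, 169, 143, 41, 31),
    "Drosophila melanogaster": (64, 45, 53, 38, 11, 7),
    "Danio rerio": (51, 41, 40, 35, 11, 6),
    "Caenorhabditis elegans": (28, 25, 18, 15, 10, 10),
    "Gallus gallus": (18, 16, 14, 12, 4, 4),
    "Sus scrofa": (16, 15, 15, 14, 1, 1),
    "Bos taurus": (12, 11, 10, 10, 2, 1),
    "Arabidopsis thaliana": (3, 3, 2, 2, 1, 1),
    "Bombyx mori": (3, 3, 0, 0, 3, 3),
    "Macaca mulatta": (2, 2, 2, 2, 0, 0),
    "Taeniopygia guttata": (2, 2, 1, 1, 1, 1),
}


def column_totals(table):
    """Sum each of the six columns over all species."""
    return tuple(sum(col) for col in zip(*table.values()))


def percent(correct, total):
    """Accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)


if __name__ == "__main__":
    total, pred, can, pred_can, non, pred_non = column_totals(TABLE_1)
    print("overall:", pred, "/", total, "=", percent(pred, total))
    print("canonical:", pred_can, "/", can, "=", percent(pred_can, can))
    print("non-canonical:", pred_non, "/", non, "=", percent(pred_non, non))
```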

Operating Mimosa

To make Mimosa accessible to the scientific community, we provide a user-friendly webserver for binary interaction prediction of miRNA–mRNA pairs. The webserver processes up to ten pairs at a time and accepts sequences of up to 8000 nt. Its interface provides two separate textboxes for submitting miRNA and mRNA sequences in FASTA format, so that each pair is matched correctly. The platform can also restrict prediction to the 3′UTR of a complete mRNA sequence: when users select ‘3′UTR’ as the predictive region from a dropdown menu, the system identifies the longest 3′UTR segment and uses it for prediction. The sliding window has a default step size of five, which users can change. The results page reports a binary label for each query pair, indicating whether the interaction is functional or non-functional. For detailed target sites and specific base-pairing patterns, users can consult our code repository.
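
The effect of the step size can be sketched with a generic sliding-window generator. This is only an illustration of the idea: the window length of 40 nt, the handling of sequences shorter than one window, and the function name are assumptions, not details taken from the Mimosa webserver or repository.

```python
from typing import Iterator, Tuple


def sliding_windows(seq: str, window: int = 40, step: int = 5) -> Iterator[Tuple[int, str]]:
    """Yield (start, segment) pairs for fixed-length windows along `seq`.

    `window` = 40 is a hypothetical segment length chosen for illustration;
    `step` = 5 mirrors the default step size mentioned in the text.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(seq) <= window:
        # Assumption: a sequence no longer than one window is scored as a whole.
        yield 0, seq
        return
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Example: a 100-nt toy sequence scanned with two different step sizes.
toy_utr = "ACGU" * 25
n_default = sum(1 for _ in sliding_windows(toy_utr))           # step = 5
n_dense = sum(1 for _ in sliding_windows(toy_utr, step=1))     # step = 1
```

A smaller step produces more, heavily overlapping segments and therefore a denser scan at higher computational cost; a step of 1 examines every possible window position.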

Exploring miR-BART6-5p targeting Dicer: a case study

Epstein-Barr virus (EBV) is a member of the γ-herpesvirus family associated with various lymphoid and epithelial malignancies (46). It infects over 90% of the global adult population and persists as a lifelong infection. Previous studies have shown that EBV-infected cells express abundant EBV miRNAs, with more than 40 mature EBV miRNAs identified to date (47). EBV-encoded miRNAs can hijack intracellular pathways involved in the synthesis and maturation of both viral and host miRNAs. For example, ebv-miR-BART6-5p (5′-UAAGGUUGGUCCAAUCCAUAGG-3′) targets the 3′UTR of human Dicer mRNA, a key player in miRNA production, thereby affecting miRNA maturation and causing global suppression of miRNAs (48). Interestingly, these target sites are not conserved in mouse or chimpanzee Dicer mRNAs, highlighting EBV’s host specificity, which is critical for viral persistence and maintenance of latent infection. Here, we used Mimosa to analyze the targeting of the Dicer mRNA 3′UTR by ebv-miR-BART6-5p. With a step size of 1, Mimosa identified 191 functional segments within the Dicer 3′UTR. These segments were evenly distributed across the 3′UTR; most were non-canonical and only twelve were canonical. As depicted in Figure 5, our analysis revealed a shared non-canonical site within a target-dense interval, featuring a wobble pair and a bulge, suggesting potential targeting by ebv-miR-BART6-5p. Among the twelve canonical functional segments, we identified seven non-overlapping target sites (see Supplementary Table S4 for details). Notably, four of the seven canonical sites identified by Mimosa have been validated as unique to the human Dicer 3′UTR by luciferase reporter assays (48); these exhibit strong interaction characteristics, particularly of the 7mer-A1 type. The remaining three sites show weaker interactions, characterized by 6mer and offset 6mer types.
Overall, Mimosa demonstrates comprehensive predictive capability for both canonical and non-canonical target sites of ebv-miR-BART6-5p on the Dicer 3′UTR. This analysis provides valuable insights to support research into EBV miRNA functions.
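
With a step size of 1, neighbouring windows overlap almost entirely, so several positive segments can cover the same underlying site. One plausible way to collapse overlapping segments into non-overlapping regions is a standard interval merge, sketched below. The article does not state how the seven non-overlapping sites were derived from the twelve canonical segments, so this is an assumption for illustration; the coordinates are hypothetical, not positions in the Dicer 3′UTR.

```python
from typing import Iterable, List, Tuple


def merge_segments(segments: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping half-open intervals [start, end) into disjoint regions."""
    merged: List[List[int]] = []
    for start, end in sorted(segments):
        if merged and start < merged[-1][1]:
            # Overlaps the previous region: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


# Hypothetical positive segments (start, end) reported by a step-1 scan.
hits = [(0, 40), (1, 41), (100, 140), (2, 42), (139, 179)]
regions = merge_segments(hits)  # [(0, 42), (100, 179)]
```

Segments that merely touch (one ends where the next begins) are kept separate under the half-open convention used here; a different convention would change the region count.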

Figure 5. Target sites of ebv-miR-BART6-5p (an EBV miRNA) predicted by Mimosa within the Dicer mRNA 3′UTR, highlighting a non-canonical site alongside all seven canonical sites.

Conclusion

miRNAs serve as pivotal regulatory factors in numerous cellular processes, exerting significant influence on cell differentiation, development and homeostasis. Identifying miRNA targets is fundamental for unraveling the functions and mechanisms of these regulatory molecules. To expedite this process, we introduced Mimosa, a Transformer-based method designed for predicting miRNA targets. A notable strength of Mimosa lies in its consideration of base-pairing patterns, which not only broadens its recognition scope to encompass both canonical and non-canonical sites but also reduces reliance on pre-selecting candidate targets during prediction. By doing so, Mimosa enables the detection of non-canonical sites that might otherwise be overlooked due to stringent pre-selection criteria. Our evaluations demonstrate that Mimosa outperforms existing deep learning models and effectively identifies miRNA targets across diverse species. Meanwhile, the user-friendly interface of Mimosa’s webserver provides a streamlined and efficient means of pinpointing target sites accurately, thereby enhancing the analysis process. We anticipate that Mimosa will emerge as a vital tool for researchers investigating miRNA-specific targets and exploring the intricate small RNA regulatory network.

It is important to acknowledge that Mimosa’s predictions for non-canonical sites may be subject to a high rate of ‘false positives’, as seen in our case study. miRNA regulation is governed by a complex interplay of multiple factors, including miRNA expression levels, the assembly and activity of the miRISC, and modulation by competing endogenous RNAs (ceRNAs). This intricate network enables miRNAs to fine-tune gene expression. The target sites identified by Mimosa reflect potential regulatory mechanisms encoded in the sequence: non-canonical sites may lie poised within genes, awaiting activation under appropriate conditions such as heightened miRNA expression. However, the activation of these sites depends not only on the presence of the miRNA but also on other factors, including specific cellular states and external signals. Furthermore, the inherent instability of non-canonical sites complicates their detection and validation. Therefore, while Mimosa offers a predictive framework, accounting for all relevant factors and their dynamic interplay remains a challenge in understanding the full complexity of miRNA interactions.

Data availability

The web server of Mimosa is publicly available at http://monash.bioweb.cloud.edu.au/Mimosa/. The source code and the data underlying this article are publicly available on GitHub (https://github.com/biyueeee/Mimosa) and Zenodo (https://doi.org/10.5281/zenodo.12702864).

Supplementary data

Supplementary Data are available at NAR Online.

Acknowledgements

We thank the members of the Song Lab at Monash University for valuable discussions and support.

Funding

National Natural Science Foundation of China [62202388]; National Key Research and Development Program of China [2022YFF1000100]; Qin Chuangyuan Innovation and Entrepreneurship Talent Project [QCYRCXM-2022-230]; Chinese Universities Scientific Fund [2452024407].

Conflict of interest statement. None declared.

References

1. Bartel D.P. MicroRNAs: target recognition and regulatory functions. Cell. 2009; 136:215–233.
2. Lal, Navarro, Maher C.A., Maliszewski L.E., Yan, O’Day, Chowdhury, Dykxhoorn D.M., Tsai, Hofmann et al. miR-24 inhibits cell proliferation by targeting E2F2, MYC, and other cell-cycle genes via binding to “seedless” 3'UTR microRNA recognition elements. Mol. Cell. 2009; 610–625.
3. Bartel D.P. Metazoan microRNAs. Cell. 2018; 173.
4. Lewis B.P., Burge C.B., Bartel D.P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005; 120.
5. Agarwal, Bell G.W., Nam J.W., Bartel D.P. Predicting effective microRNA target sites in mammalian mRNAs. eLife. 2015; e05005.
6. Vlachos I.S., Paraskevopoulou M.D., Karagkouni, Georgakilas, Vergoulis, Kanellos, Anastasopoulos I.L., Maniou, Karathanou, Kalfakakou et al. DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucleic Acids Res. 2015; D153–D159.
7. Hafner, Katsantoni, Köster, Marks, Mukherjee, Staiger, Ule, Zavolan. CLIP and complementary methods. Nat. Rev. Methods Primers. 2021.
8. Hammell, Long, Zhang, Lee, Carmack C.S., Han, Ding, Ambros. mirWIP: microRNA target prediction based on microRNA-containing ribonucleoprotein-enriched transcripts. Nat. Methods. 2008; 813–819.
9. Großhans, Johnson, Reinert K.L., Gerstein, Slack F.J. The temporal patterning microRNA let-7 regulates several transcription factors at the larval to adult transition in C. elegans. Dev. Cell. 2005; 321–330.
10. Lim L.P., Lau N.C., Garrett-Engele, Grimson, Schelter J.M., Castle, Bartel D.P., Linsley P.S., Johnson J.M. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature. 2005; 433:769–773.
11. Farh K.K.-H., Grimson, Jan, Lewis B.P., Johnston W.K., Lim L.P., Burge C.B., Bartel D.P. The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science. 2005; 310:1817–1821.
12. Kim, Sung Y.M., Park, Kim, Park, Bae J.Y., Kim, Baek. General rules for functional microRNA targeting. Nat. Genet. 2016; 1517–1526.
13. Broughton J.P., Lovci M.T., Huang J.L., Yeo G.W., Pasquinelli A.E. Pairing beyond the seed supports MicroRNA targeting specificity. Mol. Cell. 2016; 320–333.
14. Shin, Nam J.W., Farh K.K., Chiang H.R., Shkumatava, Bartel D.P. Expanding the microRNA targeting code: functional sites with centered pairing. Mol. Cell. 2010; 789–802.
15. Grimson, Farh K.K., Johnston W.K., Garrett-Engele, Lim L.P., Bartel D.P. MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell. 2007; –105.
16. Brennecke, Stark, Russell R.B., Cohen S.M. Principles of microRNA-target recognition. PLoS Biol. 2005; e85.
17. McGeary S.E., Bisaria, Pham T.M., Wang P.Y., Bartel D.P. MicroRNA 3'-compensatory pairing occurs through two binding modes, with affinity shaped by nucleotide identity and position. eLife. 2022; e69803.
18. Gebert L.F.R., MacRae I.J. Regulation of microRNA function in animals. Nat. Rev. Mol. Cell Biol. 2019.
19. Kern, Backes, Hirsch, Fehlmann, Hart, Meese, Keller. What's the target: understanding two decades of in silico microRNA-target prediction. Brief Bioinform. 2020; 1999–2010.
20. Khorshid, Hausser, Zavolan, van Nimwegen. A biophysical miRNA-mRNA interaction model infers canonical and noncanonical targets. Nat. Methods. 2013; 253–255.
21. Chiu H.-S., Llobet-Navas, Yang, Chung W.-J., Ambesi-Impiombato, Iyer, Kim H.R., Seviour E.G., Luo, Sehgal et al. Cupid: simultaneous reconstruction of microRNA-target and ceRNA networks. Genome Res. 2015; 257–267.
22. Ghoshal, Shankar, Bagchi, Grama, Chaterji. MicroRNA target prediction using thermodynamic and sequence curves. BMC Genomics. 2015; 999.
23. Peng, Wang, Guo, Gao, Song. RBP-TSTL is a two-stage transfer learning framework for genome-scale prediction of RNA-binding proteins. Brief Bioinform. 2022; bbac215.
24. Guo, Wang, Pan, Guo, Webb G.I., Yao, Jia, Song. Clarion is a multi-label problem transformation method for identifying mRNA subcellular localizations. Brief Bioinform. 2022; bbac467.
25. Pan, Wang, Gasser R.B., Purcell A.W., Akutsu, Webb G.I., Imoto, Song et al. PFresGO: an attention mechanism-based deep-learning approach for protein annotation by integrating gene ontology inter-relationships. Bioinformatics. 2023; btad094.
26. Guo, Jin, Chen, Xiang, Song, Coin L.J.M. Porpoise: a new approach for accurate prediction of RNA pseudouridine sites. Brief Bioinform. 2021; bbab245.
27. Wang, Jia, Pan, Coin L.J., Song. PLANNER: a multi-scale deep language model for the origins of replication site prediction. IEEE J Biomed Health Inform. 2024; 2445–2454.
28. Wang, Guo, Akutsu, Webb G.I., Coin L.J.M., Kurgan, Song. ProsperousPlus: a one-stop and comprehensive platform for accurate protease-specific substrate cleavage prediction and machine-learning model construction. Brief Bioinform. 2023; bbad372.
29. Lee, Baek, Park, Yoon. deepTarget: end-to-end learning framework for microRNA target prediction using deep recurrent neural networks. Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. 2016; Seattle; 434–442.
30. Pla, Zhong, Rayner. miRAW: a deep learning-based approach to predict microRNA targets by analyzing whole microRNA transcripts. PLoS Comput. Biol. 2018; e1006185.
31. Min, Lee, Yoon. TargetNet: functional microRNA target prediction with deep neural networks. Bioinformatics. 2022; 671–677.
32. Moore M.J., Scheel T.K., Luna J.M., Park C.Y., Fak J.J., Nishiuchi, Rice C.M., Darnell R.B. miRNA–target chimeras reveal miRNA 3′-end pairing as a major determinant of Argonaute target specificity. Nat. Commun. 2015; 8864.
33. Loeb G.B., Khan A.A., Canner, Hiatt J.B., Shendure, Darnell R.B., Leslie C.S., Rudensky A.Y. Transcriptome-wide miR-155 binding map reveals widespread noncanonical microRNA targeting. Mol. Cell. 2012; 760–770.
34. Chi S.W., Hannon G.J., Darnell R.B. An alternative mode of microRNA target recognition. Nat. Struct. Mol. Biol. 2012; 321–327.
35. Seok, Ham, Jang E.S., Chi S.W. MicroRNA target recognition: insights from transcriptome-wide non-canonical interactions. Mol. Cells. 2016; 375–381.
36. Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez A.N., Kaiser Ł., Polosukhin. Attention is all you need. Adv. Neural Inf. Process Syst. 2017; 5998–6008.