Prediction of RNA secondary structure including pseudoknots for long sequences

Datasets used in our experiments. Each element of the table shows the number of sequences

	Pseudoknot-free			Pseudoknotted
	Short	Medium	Long	Short	Medium	Long
Length (nt)	(12–150)	(151–500)	(501–4381)	(12–150)	(151–500)	(501–4381)
(Single)
bpRNA-1m	1971	514	420	125	162	245
Rfam 14.5	6299	723	9	1692	477	151
(Multiple)
Rfam 14.5	5118	554	4	1692	477	151

	Pseudoknot-free			Pseudoknotted
	Short	Medium	Long	Short	Medium	Long
Length (nt)	(12–150)	(151–500)	(501–4381)	(12–150)	(151–500)	(501–4381)
(Single)
bpRNA-1m	1971	514	420	125	162	245
Rfam 14.5	6299	723	9	1692	477	151
(Multiple)
Rfam 14.5	5118	554	4	1692	477	151

Table 1

Open in new tab Download slide

Datasets used in our experiments. Each element of the table shows the number of sequences

	Pseudoknot-free			Pseudoknotted
	Short	Medium	Long	Short	Medium	Long
Length (nt)	(12–150)	(151–500)	(501–4381)	(12–150)	(151–500)	(501–4381)
(Single)
bpRNA-1m	1971	514	420	125	162	245
Rfam 14.5	6299	723	9	1692	477	151
(Multiple)
Rfam 14.5	5118	554	4	1692	477	151

	Pseudoknot-free			Pseudoknotted
	Short	Medium	Long	Short	Medium	Long
Length (nt)	(12–150)	(151–500)	(501–4381)	(12–150)	(151–500)	(501–4381)
(Single)
bpRNA-1m	1971	514	420	125	162	245
Rfam 14.5	6299	723	9	1692	477	151
(Multiple)
Rfam 14.5	5118	554	4	1692	477	151

Results

Effectiveness of pseudo-expected accuracy

First, to show the effectiveness of the automatic selection from among thresholds |$\tau ^{(1)}, \ldots , \tau ^{(m)}$| based on the pseudo-expected accuracy, Figure 2 and Table S1 in Supplementary Information show the prediction accuracy on the dataset of short sequences (⁠|$\leq 150$| nt) using automatic selection and manual selection of the threshold |$\tau $| values. For IPknot, we fixed the number of decomposed sets of secondary substructures |$m=2$|⁠, and varied threshold parameters |$\tau $| values for base pairing probability in such a way that |$\{(\tau ^{(1)}, \tau ^{(2)}) \mid \tau ^{(p)}=2^{-t}, p=1,2, \, t=1,2,3,4, \, \tau ^{(1)} \geq \tau ^{(2)}\}$|⁠. In IPknot with pseudo-expected accuracy, the best secondary structure in the sense of pseudo-expected |$F$| is selected from the same range of |$(\tau ^{(1)}, \tau ^{(2)})$| for each sequence. For these variants of IPknot, the LinearPartition model with CONTRAfold parameters (LinearPartition-C) was used to calculate base pairing probabilities. In addition, we compared the prediction accuracy of IPknot with that of ThreshKnot [35], which also calculates base pairing probabilities using LinearPartition-C. We used |$\{2^{-t} \mid t=1,2,3,4\} \cup \{0.3\}$| as the threshold parameter |$\theta $| for ThreshKnot because the default threshold parameter of ThreshKnot is |$\theta =0.3$|⁠. IPknot with threshold parameters of |$\tau ^{(1)}=0.125$| and |$\tau ^{(2)}=0.125$| had the highest prediction accuracy of |$F=0.659$|⁠. IPknot with pseudo-expected accuracy has a prediction accuracy of |$F=0.658$|⁠, which is comparable to the highest accuracy obtained. ThreshKnot with a threshold of 0.25 has an accuracy of |$F=0.656$|⁠, which is also comparable to the best accuracy obtained.

$PPV–SEN plot of IPknot and ThreshKnot for short RNA sequences ($\leq 150$ nt).$

Figure 2

PPV–SEN plot of IPknot and ThreshKnot for short RNA sequences (⁠|$\leq 150$| nt).

The pseudo-expected |$F$|-value and “true” |$F$|-value are relatively highly correlated (Spearman correlation coefficient |$\rho =0.639$|⁠), indicating that the selection of predicted secondary structure using pseudo-expected accuracy works well.

While the accuracy of the prediction of the entire secondary structure has already been considered, as shown in Figure 2, for the prediction of secondary structures with pseudoknots, it is necessary to evaluate the prediction accuracy focused on the crossing base pairs. In terms of prediction accuracy limited to only crossing base pairs, IPknot with pseudo-expected accuracy yielded |$F_{cb}=0.258$|⁠, while the highest accuracy achieved by IPknot with the threshold parameters and ThreshKnot was considerably lower at |$F_{cb}=0.161$| and |$0.057$|⁠, respectively (See Table S1 in Supplementary Information). We can observe the similar tendency to the above in Figures S1 and S2, and Tables S2 and S3 in Supplementary Information for medium (151–500 nt) and long (⁠|$>500$| nt) sequences. These results suggest that prediction of crossing base pairs is improved by selecting the predicted secondary structure while considering both the pseudo-expected accuracy of the entire secondary structure and the pseudo-expected accuracy of the crossing base pairs.

Comparison with previous methods for single RNA sequences

Using our dataset, we compared our algorithm with several previous methods that can predict pseudoknots, including ThreshKnot utilizing LinearPartition (committed on 17 March 2021) [22], Knotty (committed on Mar 28, 2018) [22] and SPOT-RNA (committed on 1 April 2021) [6], and those that can predict only pseudoknot-free structures, including CONTRAfold (version 2.02) [4] and RNAfold in the ViennaRNA package (version 2.4.17) [22]. IPknot has several options for the calculation model for base pairing probabilities, namely the LinearPartition model with CONTRAfold parameters (LinearPartition-C), the LinearPartition model with ViennaRNA parameters (LinearPartition-V), the CONTRAfold model and the ViennaRNA model. In addition, ThreshKnot has two possible LinearPartition models for calculating base pairing probabilities. The other existing methods were tested using the default settings.

We evaluated the prediction accuracy according to the |$F$|-value as defined by Equation (14) for pseudoknot-free sequences (PKF in Table 2), pseudoknotted sequences (PK in Table 2) and only crossing base pairs (CB in Table 2) by stratifying sequences by length: short (12–150 nt), medium (151–500 nt) and long (500–4381 nt).

Table 2

A comparison of prediction accuracies (⁠|$F$|-values) by sequence length for each method

Length	Short (12–150 nt)			Medium (151–500 nt)			Long (501–4381 nt)
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.681	0.552	0.258	0.492	0.482	0.128	0.433	0.428	0.061
(LinearPartition-V)	0.669	0.499	0.143	0.478	0.461	0.091	0.380	0.370	0.038
(CONTRAfold)	0.678	0.550	0.259	0.495	0.505	0.154	0.426	0.413	0.066
(ViennaRNA)	0.669	0.500	0.144	0.480	0.461	0.091	0.212	0.317	0.041
ThreshKnot
(LinearPartition-C)	0.681	0.501	0.027	0.493	0.475	0.019	0.439	0.431	0.008
(LinearPartition-V)	0.669	0.484	0.033	0.481	0.456	0.026	0.383	0.372	0.014
Knotty	0.641	0.550	0.315	—	—	—	—	—	—
SPOT-RNA	0.658	0.621	0.322	0.462	0.479	0.127	—	—	—
CONTRAfold	0.682	0.519	0.000	0.500	0.497	0.000	0.425	0.415	0.000
RNAfold	0.668	0.472	0.000	0.474	0.442	0.000	0.361	0.347	0.000

Length	Short (12–150 nt)			Medium (151–500 nt)			Long (501–4381 nt)
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.681	0.552	0.258	0.492	0.482	0.128	0.433	0.428	0.061
(LinearPartition-V)	0.669	0.499	0.143	0.478	0.461	0.091	0.380	0.370	0.038
(CONTRAfold)	0.678	0.550	0.259	0.495	0.505	0.154	0.426	0.413	0.066
(ViennaRNA)	0.669	0.500	0.144	0.480	0.461	0.091	0.212	0.317	0.041
ThreshKnot
(LinearPartition-C)	0.681	0.501	0.027	0.493	0.475	0.019	0.439	0.431	0.008
(LinearPartition-V)	0.669	0.484	0.033	0.481	0.456	0.026	0.383	0.372	0.014
Knotty	0.641	0.550	0.315	—	—	—	—	—	—
SPOT-RNA	0.658	0.621	0.322	0.462	0.479	0.127	—	—	—
CONTRAfold	0.682	0.519	0.000	0.500	0.497	0.000	0.425	0.415	0.000
RNAfold	0.668	0.472	0.000	0.474	0.442	0.000	0.361	0.347	0.000

PKF, |$F$|-value for pseudoknot-free sequences; PK, |$F$|-value for pseudoknotted sequences; CB, |$F$|-value of crossing base pairs.

Table 2

Open in new tab Download slide

A comparison of prediction accuracies (⁠|$F$|-values) by sequence length for each method

Length	Short (12–150 nt)			Medium (151–500 nt)			Long (501–4381 nt)
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.681	0.552	0.258	0.492	0.482	0.128	0.433	0.428	0.061
(LinearPartition-V)	0.669	0.499	0.143	0.478	0.461	0.091	0.380	0.370	0.038
(CONTRAfold)	0.678	0.550	0.259	0.495	0.505	0.154	0.426	0.413	0.066
(ViennaRNA)	0.669	0.500	0.144	0.480	0.461	0.091	0.212	0.317	0.041
ThreshKnot
(LinearPartition-C)	0.681	0.501	0.027	0.493	0.475	0.019	0.439	0.431	0.008
(LinearPartition-V)	0.669	0.484	0.033	0.481	0.456	0.026	0.383	0.372	0.014
Knotty	0.641	0.550	0.315	—	—	—	—	—	—
SPOT-RNA	0.658	0.621	0.322	0.462	0.479	0.127	—	—	—
CONTRAfold	0.682	0.519	0.000	0.500	0.497	0.000	0.425	0.415	0.000
RNAfold	0.668	0.472	0.000	0.474	0.442	0.000	0.361	0.347	0.000

Length	Short (12–150 nt)			Medium (151–500 nt)			Long (501–4381 nt)
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.681	0.552	0.258	0.492	0.482	0.128	0.433	0.428	0.061
(LinearPartition-V)	0.669	0.499	0.143	0.478	0.461	0.091	0.380	0.370	0.038
(CONTRAfold)	0.678	0.550	0.259	0.495	0.505	0.154	0.426	0.413	0.066
(ViennaRNA)	0.669	0.500	0.144	0.480	0.461	0.091	0.212	0.317	0.041
ThreshKnot
(LinearPartition-C)	0.681	0.501	0.027	0.493	0.475	0.019	0.439	0.431	0.008
(LinearPartition-V)	0.669	0.484	0.033	0.481	0.456	0.026	0.383	0.372	0.014
Knotty	0.641	0.550	0.315	—	—	—	—	—	—
SPOT-RNA	0.658	0.621	0.322	0.462	0.479	0.127	—	—	—
CONTRAfold	0.682	0.519	0.000	0.500	0.497	0.000	0.425	0.415	0.000
RNAfold	0.668	0.472	0.000	0.474	0.442	0.000	0.361	0.347	0.000

PKF, |$F$|-value for pseudoknot-free sequences; PK, |$F$|-value for pseudoknotted sequences; CB, |$F$|-value of crossing base pairs.

For short sequences, SPOT-RNA archived high accuracy, especially for pseudoknotted sequences. However, a large difference in accuracy between the bpRNA-1m-derived and Rfam 14.5-derived sequences can be observed for SPOT-RNA compared with the other methods (See Tables S4–S9 in Supplementary Information). Notably, bpRNA-1m contains many sequences in the same family as the SPOT-RNA training data, and although we performed filtering based on sequence identity, there is still a concern of overfitting. Knotty can predict structures including pseudoknots with an accuracy comparable to that of SPOT-RNA, but as shown in Figure 3, it can perform secondary structure prediction for only short sequences, owing to its huge computational complexity. Comparing IPknot using the LinearPartition-C and -V models with its counterparts, the original CONTRAfold model and ViennaRNA model achieved comparable accuracy. However, because the computational complexity of the original models is cubic with respect to sequence length, the computational time of the original models increases rapidly as the sequence length exceeds 1500 bases. On the other hand, the computational complexity of the LinearPartition models is linear with respect to sequence length, so the base pairing probabilities can be quickly calculated even when the sequence length exceeds 4000 bases. In addition to calculating the base pairing probabilities, IP calculations are required, but because the number of variables and constraints to be considered can be greatly reduced using the threshold cut technique, the overall execution time is not significantly affected if the sequence length is several thousand bases. Because ThreshKnot, like IPknot, uses the LinearPartition model, it is able to perform fast secondary structure prediction even for long sequences. However, for the prediction accuracy of crossing base pairs, ThreshKnot is even less accurate.

Figure 3

Computational time of each method as a function of sequence length. For SPOT-RNA with GPGPU, we used a Linux workstation with Intel Xeon Gold 6136 and NVIDIA Tesla V100. All other computations were performed on Linux workstations with AMD EPYC 7702. For IPknot, we employed IBM CPLEX Optimizer as the IP solver.

Pseudoknots are found not only in cellular RNAs but also in viral RNAs, performing a variety of functions [8]. Tables S10–S11 in Supplementary Information show the results of the secondary structure prediction by separating the datasets into cellular RNAs and viral RNAs, indicating that there is no significant difference in the prediction accuracy between cellular RNAs and viral RNAs.

Prediction of common secondary structures with pseudoknots

Few methods exist that can perform prediction of common secondary structures including pseudoknots for sequence alignments longer than 1000 bases. Table 3 and Tables S12–S20 in Supplementary Information compare the accuracy of IPknot that employs the LinearPartition model, and RNAalifold in the ViennaRNA package. We performed common secondary structure prediction for the Rfam reference alignment and the alignment calculated by MAFFT, as well as secondary structure prediction of single sequences only for the seed sequence included in the alignment, and evaluated the prediction accuracy for the seed sequence. In most cases, the prediction accuracy improved as the quality of the alignment increased (Single < MAFFT < Reference). IPknot predicts crossing base pairs based on pseudo-expected accuracy, whereas RNAalifold is unable to predict pseudoknots.

Table 3

A comparison of prediction accuracies (⁠|$F$|-values) of common secondary structure prediction by sequence alignments for each method

	Reference			MAFFT			Single
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.765	0.616	0.220	0.732	0.585	0.218	0.718	0.548	0.227
(LinearPartition-V)	0.761	0.565	0.177	0.729	0.529	0.165	0.714	0.494	0.124
RNAalifold	0.804	0.611	0.000	0.745	0.540	0.000	0.716	0.474	0.000

	Reference			MAFFT			Single
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.765	0.616	0.220	0.732	0.585	0.218	0.718	0.548	0.227
(LinearPartition-V)	0.761	0.565	0.177	0.729	0.529	0.165	0.714	0.494	0.124
RNAalifold	0.804	0.611	0.000	0.745	0.540	0.000	0.716	0.474	0.000

PKF, |$F$|-value for pseudoknot-free sequences; PK, |$F$|-value for pseudoknotted sequences; CB, |$F$|-value of crossing base pairs.

Table 3

A comparison of prediction accuracies (⁠|$F$|-values) of common secondary structure prediction by sequence alignments for each method

	Reference			MAFFT			Single
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.765	0.616	0.220	0.732	0.585	0.218	0.718	0.548	0.227
(LinearPartition-V)	0.761	0.565	0.177	0.729	0.529	0.165	0.714	0.494	0.124
RNAalifold	0.804	0.611	0.000	0.745	0.540	0.000	0.716	0.474	0.000

	Reference			MAFFT			Single
	PKF	PK	CB	PKF	PK	CB	PKF	PK	CB
IPknot
(LinearPartition-C)	0.765	0.616	0.220	0.732	0.585	0.218	0.718	0.548	0.227
(LinearPartition-V)	0.761	0.565	0.177	0.729	0.529	0.165	0.714	0.494	0.124
RNAalifold	0.804	0.611	0.000	0.745	0.540	0.000	0.716	0.474	0.000

PKF, |$F$|-value for pseudoknot-free sequences; PK, |$F$|-value for pseudoknotted sequences; CB, |$F$|-value of crossing base pairs.

Discussion

Both IPknot and ThreshKnot use the LinearPartition model to calculate base pairing probabilities, and then perform secondary structure prediction using different strategies. ThreshKnot predicts the base pairs |$x_i$| and |$x_j$| that are higher than a predetermined threshold |$\theta $| and have the largest |$p_{ij}$| in terms of both |$i$| and |$j$|⁠. IPknot predicts the pseudoknot structure with multiple thresholds |$\tau ^{(1)}, \ldots , \tau ^{(m)}$| in a hierarchical manner based on IP (5)–(11), and then carefully selects from among these thresholds based on pseudo-expected accuracy. Because both the pseudo-expected accuracy of the entire secondary structure as well as the pseudo-expected accuracy of the crossing base pairs are taken into account, the prediction accuracy of the pseudoknot structure is inferred to be enhanced in IPknot.

Because the LinearPartition model uses the same parameters as the CONTRAfold and ViennaRNA packages, there is no significant difference in accuracy between using LinearPartition-C and -V and their counterparts, the CONTRAfold and ViennaRNA models. It has been shown that LinearPartition has no significant effect on accuracy even though it ignores structures whose probability is extremely low owing to its use of beam search, which makes the calculation linear with respect to the sequence length [22]. The LinearPartition model enables IPknot to perform secondary structure prediction including pseudoknots of very long sequences, such as mRNA, lncRNA, and viral RNA.

SPOT-RNA [6], which uses deep learning, showed notable prediction accuracy in our experiments, especially in short sequences containing pseudoknots, with |$F$|-value of 0.621, which is superior to other methods. However, SPOT-RNA requires considerable computing resources such as GPGPU and long computational time. Furthermore, SPOT-RNA showed a large difference in prediction accuracy between sequences that are close to the training data and those that are not compared with the other methods. Therefore, the situations in which SPOT-RNA can be used are considered to be limited. In contrast, IPknot uses CONTRAfold parameters, which is also based on machine learning, but we did not observe as much overfitting with IPknot as with SPOT-RNA.

Approaches that provide an exact solution for limited-complexity pseudoknot structures, such as PKNOTS [14], pknotsRG [15], and Knotty [16], can predict pseudoknot structures with high accuracy but demand a huge amount of computation |$O(n^4)$|–|$O(n^6)$| for sequence length |$n$|⁠, limiting secondary structure prediction to sequences only up to about 150 bases. On the other hand, IPknot predicts the pseudoknot structure using a fast computational heuristic-based method with the linear time computation, which does not allow us to find an exact solution. Instead, IPknot improves the prediction accuracy of the pseudoknot structure by choosing the best solution from among several solutions based on the pseudo-expected accuracy.

IPknot uses pseudoknot-free algorithms, such as CONTRAfold and ViennaRNA, to calculate base pairing probabilities, and its prediction accuracy of the resulting secondary structure strongly depends on the algorithm used to calculate base pairing probabilities. Therefore, we can expect to improve the prediction accuracy of IPknot by calculating the base pairing probabilities based on state-of-the-art pseudoknot-free secondary structure prediction methods such as MXfold2 [7].

It is well known that common secondary structure prediction from sequence alignments improves the accuracy of secondary structure prediction. However, among the algorithms for predicting common secondary structure including pseudoknots, only IPknot can deal with sequence alignments longer than several thousand bases. In the RNA virus SARS-CoV-2, programmed -1 ribosomal frameshift (-1 PRF), in which a pseudoknot structure plays an important role, has been identified and is attracting attention as a drug target [10]. Because many closely related strains of SARS-CoV-2 have been sequenced, it is expected that structural motifs including pseudoknots, such as -1 PRF, can be found by predicting the common secondary structure from the alignment.

Conclusions

We have developed an improvement to IPknot that enables calculation in linear time by employing the LinearPartition model and automatically selects the optimal threshold parameters based on the pseudo-expected accuracy. LinearPartition can calculate the base pairing probability with linear computational complexity with respect to the sequence length. By employing LinearPartition, IPknot is able to predict the secondary structure considering pseudoknots for long sequences such as mRNA, lncRNA, and viral RNA. By choosing the thresholds for each sequence based on the pseudo-expected accuracy, we can select a nearly optimal secondary structure prediction.

The LinearPartition model realized the predictiction of secondary structures considering pseudoknots for long sequences. However, the prediction accuracy is still not sufficiently high, especially for crossing base pairs. We expect that by learning parameters from long sequences [36], we can achieve high accuracy even for long sequences.

Key Points

We reduced the computational time required by IPknot from cubic to linear with respect to the sequence length by employing the LinearPartition model and enabled the secondary structure prediction including pseudoknots for long RNA sequences such as mRNA, lncRNA, and viral RNA.
We improved the accuracy of secondary structure prediction including pseudoknots by introducing pseudo-expected accuracy not only for the entire base pairs but also for crossing base pairs.
To the best of our knowledge, IPknot is the only method that can perform RNA secondary structure prediction including pseudoknot not only for very long single sequence, but also for very long sequence alignments.

Availability

The IPknot source code is freely available at https://github.com/satoken/ipknot. IPknot is also available for use from a web server at http://rtips.dna.bio.keio.ac.jp/ipknot++/. The datasets used in our experiments are available at https://doi.org/10.5281/zenodo.4923158.

Author contributions statement

K.S. conceived the study, implemented the algorithm, collected the datasets, conducted experiments, and drafted the manuscript. K.S. and Y.K. discussed the algorithm and designed the experiments. All authors read, contributed to the discussion of and approved the final manuscript.

Funding

This work was partially supported by a Grant-in-Aid for Scientific Research (B) (No. 19H04210) and Challenging Exploratory Research (No. 19K22897) from the Japan Society for the Promotion of Science (JSPS) to K.S. and a Grant-in-Aid for Scientific Research (C) (Nos. 18K11526 and 21K12109) from JSPS to Y.K.

Acknowledgments

The supercomputer system used for this research was made available by the National Institute of Genetics, Research Organization of Information and Systems.

Kengo Sato is an assistant professor at the Department of Biosciences and Informatics at Keio University, Japan. He received his PhD in Computer Science from Keio University, Japan, in 2003. His research interests include bioinformatics, computational linguistics and machine learning.

Yuki Kato is an assistant professor at Department of RNA Biology and Neuroscience, Graduate School of Medicine, and at Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Japan. His research interests include biological sequence analysis and single-cell genomics.

References

Zuker

Mfold web server for nucleic acid folding and hybridization prediction

Nucleic Acids Res

2003

;

(

3406

–

Lorenz

Bernhart

Höner Zu Siederdissen

, et al.

ViennaRNA package 2.0

Algorithms Mol Biol

2011

;

Reuter

Mathews

RNAstructure: software for RNA secondary structure prediction and analysis

BMC Bioinformatics

2010

;

129

Woods

Batzoglou

CONTRAfold: RNA secondary structure prediction without physics-based models

Bioinformatics

2006

;

(

e90

–

Zakov

Goldberg

Elhadad

, et al.

Rich parameterization improves RNA structure prediction

J Comput Biol

2011

;

(

1525

–

Singh

Hanson

Paliwal

, et al.

RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning

Nat Commun

2019

;

(

5407

Sato

Akiyama

Sakakibara

RNA secondary structure prediction using deep learning with thermodynamic integration

Nat Commun

2021

;

(

941

Brierley

Pennell

Gilbert

RJC

Viral RNA pseudoknots: versatile motifs in gene expression and replication

Nat Rev Microbiol

2007

;

(

598

–

610

Staple

Butcher

Pseudoknots: RNA structures with diverse functions

PLoS Biol

2005

;

(

):e213.

10.

Kelly

Olson

Neupane

, et al.

Structural and functional conservation of the programmed -1 ribosomal frameshift signal of SARS coronavirus 2 (SARS-CoV-2)

J Biol Chem

2020

;

295

(

10741

–

11.

Trifonov

Gabdank

Barash

, et al.

Primordia vita. deconvolution from modern sequences

Orig Life Evol Biosph

December 2006

;

(

5–6

559

–

12.

Akutsu

Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots

Discrete Appl Math

2000

;

104

(

–

13.

Lyngsø

Pedersen

RNA pseudoknot prediction in energy-based models

J Comput Biol

2000

;

(

3–4

409

–

PubMed

14.

Rivas

Eddy

A dynamic programming algorithm for RNA structure prediction including pseudoknots

J Mol Biol

1999

;

285

(

2053

–

15.

Reeder

Giegerich

Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics

BMC Bioinformatics

2004

;

104

16.

Jabbari

Wark

Montemagno

, et al.

Knotty: efficient and accurate prediction of complex RNA pseudoknot structures

Bioinformatics

2018

;

(

3849

–

17.

Ruan

Stormo

Zhang

An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots

Bioinformatics

2004

;

(

–

18.

Ren

Rastegari

Condon

, et al.

HotKnots: heuristic prediction of RNA secondary structures including pseudoknots

RNA

2005

;

(

1494

–

504

19.

Chen

S-M

, et al.

FlexStem: improving predictions of RNA secondary structures with pseudoknots by reducing the search space

Bioinformatics

2008

;

(

1994

–

2001

20.

Bellaousov

Mathews

H. ProbKnot: fast prediction of RNA secondary structure including pseudoknots

RNA

2010

;

(

1870

–

21.

Sato

Kato

Hamada

, et al.

IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming

Bioinformatics

2011

;

(

i85

–

22.

Zhang

Mathews

, et al.

LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities

Bioinformatics

2020

;

(

Supplement_1

i258

–

23.

Hamada

Sato

Asai

Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

BMC Bioinformatics

2010

;

586

24.

Hamada

Kiryu

Sato

, et al.

Prediction of RNA secondary structure using generalized centroid estimators

Bioinformatics

2009

;

(

465

–

25.

McCaskill

The equilibrium partition function and base pair binding probabilities for RNA secondary structure

Biopolymers

1990

;

(

6–7

1105

–

PubMed

26.

Kiryu

Kin

Asai

Robust prediction of consensus secondary structures using averaged base pairing probability matrices

Bioinformatics

2007

;

(

434

–

27.

Hamada

Sato

Asai

Improving the accuracy of predicting secondary structure for aligned RNA sequences

Nucleic Acids Res

2011

;

(

393

–

402

28.

Danaee

Rouches

Wiley

, et al.

bpRNA: large-scale automated annotation and analysis of RNA secondary structure

Nucleic Acids Res

2018

;

(

5381

–

29.

Nawrocki

Burge

Bateman

, et al.

Rfam 12.0: updates to the RNA families database

Nucleic Acids Res

2015

;

(

Database issue

D130

–

PubMed