CyloFold: secondary structure prediction including pseudoknots Open Access

Table 1.

Prediction results corresponding to 26 RNA structures that are available in the Protein Data Bank

PDB	Description	L	PKF	CF MCC	CF SNS	CF PPV	PK MCC	PK SNS	PK PPV	HK MCC	HK SNS	HK PPV	UF MCC	UF SNS	UF PPV
1A60	TYMV tRNA-like structure	44	13.6	0.74	0.77	0.71	0.96	1.00	0.93	0.83	0.77	0.91	0.83	0.77	0.91
1CX0	HDV ribozyme	72	22.2	–0.01	0.00	0.00	–0.01	0.00	0.00	–0.01	0.00	0.00	–0.01	0.00	0.00
1E95	SRV-1 pseudoknot	36	33.3	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.70	0.50	1.00
1HVU	HIV RT bind. pseudoknot	30	26.6	0.95	1.00	0.91	0.95	1.00	0.91	0.56	0.40	0.80	0.56	0.40	0.80
1KAJ	MMTV RNA pseudoknot	32	25.0	0.85	1.00	0.73	0.85	1.00	0.73	0.85	1.00	0.73	0.53	0.50	0.57
1KH6	HCV IRES domain	42	0.0	0.74	0.77	0.71	0.55	0.54	0.58	0.53	0.54	0.54	0.92	0.93	0.93
1KPY	PEMV-1 P1P2 pseudoknot	27	22.2	0.89	1.00	0.80	0.94	1.00	0.89	0.79	0.62	1.00	0.79	0.63	1.00
1KXK	GroupII self-splic. intron	70	0.0	0.91	0.87	0.95	0.81	0.83	0.79	0.81	0.83	0.79	0.96	0.96	0.96
1L2X	Viral RNA pseudoknot	27	22.2	0.94	1.00	0.89	0.94	1.00	0.89	0.79	0.63	1.00	0.79	0.63	1.00
1Q9A	23S rRNA sarcin/ricin	27	0.0	0.91	0.83	1.00	0.77	0.83	0.71	0.86	1.00	0.75	0.77	0.83	0.71
1U8D	Guanine riboswitch	67	11.9	0.87	0.87	0.87	0.88	0.78	1.00	0.88	0.78	1.00	0.88	0.78	1.00
2A43	Luteoviral pseudoknot	26	23.0	0.93	1.00	0.88	0.93	1.00	0.88	0.75	0.57	1.00	0.75	0.57	1.00
2G1W	tmRNA pseudoknot	22	18.1	0.81	1.00	0.67	0.86	1.00	0.75	0.81	0.67	1.00	0.81	0.67	1.00
2GIS	SAM- riboswitch	94	8.5	0.80	0.76	0.85	0.80	0.76	0.85	0.86	0.86	0.86	0.55	0.55	0.55
2HOO	thi-box riboswitch	83	0.0	0.70	0.67	0.74	0.58	0.62	0.54	0.58	0.62	0.54	0.58	0.62	0.54
2K95	P2B-P3 telomerase RNA	48	37.5	0.89	0.80	1.00	0.89	0.80	1.00	0.75	0.80	0.71	0.54	0.40	0.75
2OIU	L1 Ribozyme Ligase adduct	71	0.0	0.86	0.78	0.95	0.98	0.96	1.00	0.98	0.96	1.00	0.98	1.00	1.00
2QUS	Hammerhead Ribozyme	68	2.9	0.95	0.91	1.00	0.95	0.91	1.00	0.95	0.91	1.00	0.95	1.00	1.00
2QWY	SAM-II riboswitch	52	26.9	0.48	0.46	0.50	0.48	0.46	0.50	0.34	0.31	0.40	0.35	0.31	0.40
2RP0	PEMV1 mRNA pseudoknot	26	15.3	0.88	1.00	0.78	0.88	1.00	0.78	0.84	0.71	1.00	0.84	0.71	1.00
2TPK	T2 gene 32 mRNA p.k.	36	27.7	1.00	1.00	1.00	1.00	1.00	1.00	0.71	0.58	0.88	0.63	0.58	0.70
361D	Domain E of 5S rRNA	19	0.0	0.86	1.00	0.75	0.86	1.00	0.75	0.83	0.83	0.83	0.83	0.83	0.83
3DIG	Lysine Riboswitch	173	30.1	0.89	0.85	0.93	0.74	0.72	0.76	0.74	0.72	0.76	0.74	0.72	0.76
3FU2	class-I preQ1 riboswitch	32	18.8	0.79	0.63	1.00	0.79	0.63	1.00	0.79	0.63	1.00	0.79	0.63	1.00
3PHP	TYMV p.k. hairpin	23	0.0	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
437D	rib. frame-shifting p.k.	27	22.2	0.94	1.00	0.89	0.94	1.00	0.89	0.79	0.63	1.00	0.79	0.63	1.00
Mean	All			0.83	0.85	0.83	0.82	0.84	0.81	0.75	0.71	0.83	0.73	0.65	0.83
Mean	No pseudoknots		<5.0	0.87	0.85	0.89	0.81	0.84	0.80	0.82	0.84	0.81	0.87	0.90	0.87
Mean	Pseudoknots		>5.0	0.81	0.84	0.80	0.82	0.84	0.82	0.73	0.65	0.84	0.66	0.55	0.80

L, sequence length; PKF, fraction of pseudoknot interactions (in percent). For each of the four prediction methods (CF, CyloFold; PK, pknotsRG; HK, HotKnots 2.0; UF, UNAFold) we report three measures of prediction quality (MCC, Matthews correlation coefficient; SNS, sensitivity; PPV, positive predictive value).

To quantify the time complexity of the folding method, we fitted a function of the form a*N^b (with N being the number of residues in the input sequence) to the execution times measured for the set of 241 sequences. We found that the execution time (measured in seconds) of the structure prediction is well described by the function 2.74*10⁻⁸*N^4.47. The timing evaluation was performed on a computer with 4 GB of RAM and an Intel 64-bit Xeon processor (3.0 GHz).
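As an illustration of how a function of the form a*N^b can be fitted, the following minimal sketch uses ordinary least squares on log-transformed data. This is one common approach and an assumption here; the text does not state which fitting procedure was used. The timings are synthetic values generated from the reported function, not measured data, and the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(lengths, times):
    """Fit t = a * N**b by linear least squares on log(t) versus log(N)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(t) for t in times]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space is log(a)
    return a, b


# Synthetic timings produced from the reported fit itself (NOT measurements).
lengths = [20, 40, 60, 80, 100, 120]
times = [2.74e-8 * n ** 4.47 for n in lengths]

a, b = fit_power_law(lengths, times)
print(f"a = {a:.3g}, b = {b:.2f}")

# Extrapolated run time in seconds for a 100-residue sequence.
predicted_100 = a * 100 ** b
print(f"predicted time for N = 100: {predicted_100:.1f} s")
```

With the reported parameters, a sequence of 100 residues corresponds to roughly 24 s, and doubling the sequence length multiplies the run time by about 2^4.47, or about 22.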

We report in Tables 1 and 2 prediction results for these two data sets together with the corresponding results obtained by running the RNA secondary structure prediction programs HotKnots 2.0 (8), pknotsRG (7) and UNAFold (23).

Table 2.

Prediction results for a set of 241 RNA sequences that are part of PseudoBase for the programs CyloFold, pknotsRG (7), HotKnots 2.0 (8) and UNAFold (23)

Program	MCC	SNS	PPV
CyloFold	0.752	0.763	0.747
pknotsRG	0.748	0.753	0.756
HotKnots 2.0	0.611	0.565	0.684
UNAFold	0.597	0.532	0.692

MCC, Matthews correlation coefficient; SNS, sensitivity of predicted base pairs; PPV, positive predictive value.

For data set 1, the average Matthews correlation coefficient (MCC), obtained by comparing the base pairing pattern of each predicted secondary structure with its reference secondary structure, is 0.83 for CyloFold, compared with 0.82 for pknotsRG, 0.75 for HotKnots 2.0 and 0.73 for UNAFold (see the row named ‘All’ in Table 1).

We divided this data set into two subsets according to the fraction of pseudoknot base pairs in the respective structures. The results can be seen in the last two rows of Table 1. The eight PDB structures with <5% pseudoknotted base pairs correspond to an average MCC of 0.87 for CyloFold compared to 0.81 for pknotsRG, 0.82 for HotKnots 2.0 and 0.87 for UNAFold. The 18 structures listed in Table 1 with >5% pseudoknotted base pairs correspond to an average MCC of 0.81 for CyloFold, 0.82 for pknotsRG, 0.73 for HotKnots 2.0 and 0.66 for UNAFold.
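The subset averages can be reproduced directly from Table 1. The sketch below copies the PKF and CyloFold MCC columns from the table and splits the structures at the 5% threshold; the variable and function names are chosen for illustration only.

```python
# (PDB id, PKF in percent, CyloFold MCC), copied from Table 1.
TABLE1_CF = [
    ("1A60", 13.6, 0.74), ("1CX0", 22.2, -0.01), ("1E95", 33.3, 1.00),
    ("1HVU", 26.6, 0.95), ("1KAJ", 25.0, 0.85), ("1KH6", 0.0, 0.74),
    ("1KPY", 22.2, 0.89), ("1KXK", 0.0, 0.91), ("1L2X", 22.2, 0.94),
    ("1Q9A", 0.0, 0.91), ("1U8D", 11.9, 0.87), ("2A43", 23.0, 0.93),
    ("2G1W", 18.1, 0.81), ("2GIS", 8.5, 0.80), ("2HOO", 0.0, 0.70),
    ("2K95", 37.5, 0.89), ("2OIU", 0.0, 0.86), ("2QUS", 2.9, 0.95),
    ("2QWY", 26.9, 0.48), ("2RP0", 15.3, 0.88), ("2TPK", 27.7, 1.00),
    ("361D", 0.0, 0.86), ("3DIG", 30.1, 0.89), ("3FU2", 18.8, 0.79),
    ("3PHP", 0.0, 1.00), ("437D", 22.2, 0.94),
]

PKF_THRESHOLD = 5.0  # percent of pseudoknotted base pairs


def mean_mcc(rows):
    """Average the MCC column of a list of (pdb, pkf, mcc) rows."""
    return sum(mcc for _, _, mcc in rows) / len(rows)


no_pseudoknots = [row for row in TABLE1_CF if row[1] < PKF_THRESHOLD]
pseudoknots = [row for row in TABLE1_CF if row[1] > PKF_THRESHOLD]

print(len(no_pseudoknots), f"{mean_mcc(no_pseudoknots):.2f}")  # 8 0.87
print(len(pseudoknots), f"{mean_mcc(pseudoknots):.2f}")        # 18 0.81
print(len(TABLE1_CF), f"{mean_mcc(TABLE1_CF):.2f}")            # 26 0.83
```

The same split applied to the other MCC columns would give the corresponding averages for pknotsRG, HotKnots 2.0 and UNAFold.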

Using the larger data set 2, one obtains an average MCC of 0.752 for CyloFold and 0.748 for pknotsRG (Table 2). Table 2 shows that the secondary structure predictions obtained by CyloFold have the highest average MCC of the four programs (CyloFold, pknotsRG, HotKnots 2.0 and UNAFold). CyloFold also has the highest average base pair prediction sensitivity (0.763). For another measure, the positive predictive value (how often predicted base pairs are part of the reference secondary structure), all programs obtain averages between 0.68 and 0.76 for data set 2, with pknotsRG leading with a value of 0.756. It should be noted that the MCC is often used as an overall measure of prediction quality, while sensitivity, specificity and positive predictive value each capture a particular aspect of prediction quality.
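To make the three measures concrete, the following sketch computes sensitivity, positive predictive value and MCC from a reference and a predicted set of base pairs. It assumes that true negatives are counted as all possible position pairs that are neither predicted nor in the reference; this convention is an assumption and may differ from the one used for the tables. The function name `pair_metrics` and the example base pairs are hypothetical.

```python
import math


def pair_metrics(reference, predicted, length):
    """Return (MCC, SNS, PPV) for predicted versus reference base pairs."""
    ref = {tuple(sorted(pair)) for pair in reference}
    pred = {tuple(sorted(pair)) for pair in predicted}
    tp = len(ref & pred)   # correctly predicted pairs
    fp = len(pred - ref)   # predicted pairs absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed
    # Assumption: every other possible position pair is a true negative.
    tn = length * (length - 1) // 2 - tp - fp - fn
    sns = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, sns, ppv


# Hypothetical 20-residue example: three of four reference pairs are found.
reference = [(1, 10), (2, 9), (3, 8), (12, 18)]
predicted = [(1, 10), (2, 9), (3, 8), (4, 7)]
mcc, sns, ppv = pair_metrics(reference, predicted, length=20)
print(f"MCC = {mcc:.3f}, SNS = {sns:.2f}, PPV = {ppv:.2f}")
# MCC = 0.745, SNS = 0.75, PPV = 0.75
```

Under this convention, a prediction that shares no base pair with the reference yields an MCC slightly below zero rather than exactly zero, which would be consistent with entries such as –0.01 in Table 1.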

These results indicate that the prediction accuracy of CyloFold is similar to that of pknotsRG. The key advantage of CyloFold is that it places no restriction on the classes of pseudoknots that are considered. It should also be noted that the employed model of simulated RNA folding, which places helices with a probability according to their free energy contribution, is in essence very simple (24). In that sense it is surprising how well the method performs, and it should be an encouragement to continue to develop RNA folding algorithms that are substantially different from established approaches.
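The idea of choosing helices with a probability that depends on their free energy contribution can be sketched as follows. The Boltzmann weighting exp(-ΔG/RT), the temperature, the function names and the candidate energies are all assumptions made for illustration; the text above does not specify the exact functional form, and this is not the CyloFold implementation.

```python
import math
import random

# Gas constant (kcal/(mol*K)) times 310.15 K (37 degrees C); an assumed temperature.
RT_KCAL_PER_MOL = 0.0019872 * 310.15


def helix_probabilities(free_energies, rt=RT_KCAL_PER_MOL):
    """Map helix free energies (kcal/mol) to selection probabilities.

    Assumes Boltzmann weights exp(-dG / RT); energies are shifted by the
    minimum so the exponentials stay numerically well behaved.
    """
    lowest = min(free_energies)
    weights = [math.exp(-(dg - lowest) / rt) for dg in free_energies]
    total = sum(weights)
    return [w / total for w in weights]


def pick_helix(free_energies, rng):
    """Draw the index of one candidate helix according to its probability."""
    probs = helix_probabilities(free_energies)
    return rng.choices(range(len(free_energies)), weights=probs, k=1)[0]


# Hypothetical free energies of three candidate helices (kcal/mol).
candidate_energies = [-8.2, -5.1, -2.4]
probs = helix_probabilities(candidate_energies)
rng = random.Random(0)  # seeded so that the draw is reproducible
chosen = pick_helix(candidate_energies, rng)
print([f"{p:.4f}" for p in probs], chosen)
```

In a sketch of this kind, more stable helices are placed more often, while less stable ones still have a nonzero chance of being selected.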

CONCLUSION

CyloFold is a new method for RNA secondary structure prediction. We show using two different data sets that the prediction accuracy (MCC) is comparable to that of the RNA secondary structure prediction program pknotsRG. The search algorithm has no restriction in terms of pseudoknot complexity. Another novel aspect is that at each step of the simulated folding process, the steric feasibility of the predicted structures is checked using a highly coarse-grained 3D representation. The method is made available in the form of a user-friendly web server.

FUNDING

This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under contract HHSN26120080001E. This research was supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research. Funding for open access charge: National Cancer Institute.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We wish to thank the Advanced Biomedical Computing Center (ABCC) at the NCI for their computing support. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

REFERENCES

1. Nussinov, Pieczenik, Griggs, Kleitman. Algorithms for loop matchings, SIAM J. Appl. Math., 1978
2. Hofacker, Fontana, Stadler, Bonhoeffer, Tacker, Schuster. Fast folding and comparison of RNA secondary structures, Monatshefte f. Chemie, 1994, vol. 125 (pg. 167–188)
3. Mathews, Sabina, Zuker, Turner. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure, J. Mol. Biol., 1999, vol. 288 (pg. 911–940)
4. Mathews, Disney, Childs, Schroeder, Zuker, Turner. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure, Proc. Natl Acad. Sci. USA, 2004, vol. 101 (pg. 7287–7292)
5. Zuker, Stiegler. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information, Nucleic Acids Res., 1981 (pg. 133–148)
6. Rivas, Eddy. A dynamic programming algorithm for RNA structure prediction including pseudoknots, J. Mol. Biol., 1999, vol. 285 (pg. 2053–2068)
7. Reeder, Steffen, Giegerich. pknotsRG: RNA pseudoknot folding including near-optimal structures and sliding windows, Nucleic Acids Res., 2007 (pg. W320–W324)
8. Andronescu, Pop, Condon. Improved free energy parameters for RNA pseudoknotted secondary structure prediction, RNA, 2010
9. Ruan, Stormo, Zhang. An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots, Bioinformatics, 2004
10. Shapiro. Predicting RNA H-type pseudoknots with the massively parallel genetic algorithm, Comput. Appl. Biosci., 1997 (pg. 459–471)
11. Shapiro, Bengali, Kasprzak. RNA folding pathway functional intermediates: their prediction and analysis, J. Mol. Biol., 2001, vol. 312
12. Benedetti, Morosetti. A genetic algorithm to search for optimal and suboptimal RNA secondary structures, Biophys. Chem., 1995 (pg. 253–259)
13. Gultyaev, van Batenburg, Pleij. The computer simulation of RNA folding pathways using a genetic algorithm, J. Mol. Biol., 1995, vol. 250
14. Shapiro, Navetta. A massively parallel genetic algorithm for RNA secondary structure prediction, J. Supercomput., 1994 (pg. 195–207)
15. Shapiro, Kasprzak, Grunewald, Aman. Graphical exploratory data analysis of RNA secondary structure dynamics predicted by the massively parallel genetic algorithm, J. Mol. Graph. Model., 2006 (pg. 514–531)
16. Shapiro, Bengali, Potts. The massively parallel genetic algorithm for RNA folding: MIMD implementation and population variation, Bioinformatics, 2001 (pg. 137–148)
17. Ren, Rastegari, Condon, Hoos. HotKnots: heuristic prediction of RNA secondary structures including pseudoknots, RNA, 2005 (pg. 1494–1504)
18. Rocher, Brown. The Definitive Guide to Grails, 2009, Apress, New York
19. Darty, Denise, Ponty. VARNA: interactive drawing and editing of the RNA secondary structure, Bioinformatics, 2009 (pg. 1974–1975)
20. Yang, Jossinet, Leontis, Chen, Westbrook, Berman, Westhof. Tools for the automatic identification and classification of RNA base pairs, Nucleic Acids Res., 2003 (pg. 3450–3460)
21. van Batenburg, Gultyaev, Pleij. PseudoBase: structural information on RNA pseudoknots, Nucleic Acids Res., 2001 (pg. 194–195)
22. van Batenburg, Gultyaev, Pleij, Oliehoek. PseudoBase: a database with RNA pseudoknots, Nucleic Acids Res., 2000 (pg. 201–204)
23. Markham, Zuker. UNAFold: software for nucleic acid folding and hybridization, Methods Mol. Biol., 2008, vol. 453