NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure

ABSTRACT

The Nearest Neighbor Database (NNDB, http://rna.urmc.rochester.edu/NNDB) is a web-based resource for disseminating parameter sets for predicting nucleic acid secondary structure stabilities. For each set of parameters, the database includes the set of rules with descriptive text, sequence-dependent parameters in plain text and html, literature references to experiments and usage tutorials. The initial release covers parameters for predicting RNA folding free energy and enthalpy changes.

INTRODUCTION

Nearest neighbor approaches were developed to predict the folding stabilities of nucleic acid secondary structures (1). These parameter sets use empirical rules, generally derived from optical melting experiments, as the basis of the predictions. For RNA, rules exist for predicting both the free energy and enthalpy changes of Watson–Crick helices, GU pairs and loops (2–5). Parameters for DNA have also been assembled for predicting the free energy and enthalpy changes of Watson–Crick pairs and the free energy changes of loops (6,7). These parameter sets are the basis of computer programs that predict low free energy secondary structures. Such programs include Mfold/UNAFold (8,9), the Vienna RNA package (10), RNAstructure (2), RNAsoft (11) and Sfold (12). Additional approaches that use statistical learning of parameters for RNA folding have also used the rules from the nearest neighbor methods and derived new parameter values (13,14).

Nearest neighbor parameter sets include both a set of rules, called either equations or features, for predicting stability and a set of parameter values used by the equations (14). For RNA, separate rules exist for predicting stabilities of helices, hairpin loops, small internal loops, large internal loops, bulge loops, multibranch loops, exterior loops and pseudoknots. Given the number of rules and the constraints on the length of journal publications, it is difficult to assemble all the parameters in one publication and provide meaningful tutorials for using them. This is a barrier to the development of software for novel algorithms that could take advantage of the parameters. For example, many software packages that use RNA parameters still implement the set of parameters assembled in 1999 (4), even though the RNA parameters were updated in 2004 (2) based on experimental results.

The Nearest Neighbor Database (NNDB) is a web-based tool for assembling and archiving complete nearest neighbor sets, including rules and values. It is available online at http://rna.urmc.rochester.edu/NNDB. It provides documentation of parameter sets and tutorials on how to apply the parameters. Currently, the 1999 and 2004 sets of RNA folding parameters are provided (2–5).

WEBSITE ORGANIZATION

The NNDB is built from a set of static HTML pages, specifically XHTML 1.0 Transitional, with the page hierarchy shown in Figure 1. Text is encoded in Unicode (UTF-8) to facilitate display of equations across diverse browsers and operating systems. The top-level page provides access to a help page, the available parameter sets and a page of references to RNA optical melting experiments. Additionally, links allow the whole database to be downloaded in either zip or gzipped tar format. The help page introduces the purpose of the database and defines basic terms, including the set of structural features defined by secondary structures. For example, Figure 2, from the help page, shows an RNA secondary structure that illustrates the loop features covered by nearest neighbor parameter sets. The help page also provides the basic equations for using the parameters to extrapolate folding free energy changes to temperatures other than 37°C and to predict melting temperatures.
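As an illustration of the kind of calculation these equations support, the following is a minimal sketch that assumes the standard thermodynamic relations rather than reproducing the help page itself: the enthalpy and entropy changes are treated as temperature independent, melting is treated as a two-state transition, and the two strands of a non-self-complementary duplex are assumed to be at equal concentration. The function names and arguments are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T37 = 310.15        # 37 degrees C expressed in kelvin


def delta_g_at(dg37, dh, temp_c):
    """Extrapolate a folding free energy change (kcal/mol) to temp_c (deg C).

    Assumes dH and dS do not depend on temperature, so that
    dS = (dH - dG37) / 310.15 and dG(T) = dH - T * dS.
    """
    ds = (dh - dg37) / T37
    return dh - (temp_c + 273.15) * ds


def melting_temp_c(dg37, dh, total_strand_conc, self_complementary):
    """Two-state duplex melting temperature in degrees C.

    Tm = dH / (dS + R * ln(Ct / a)), with a = 1 for self-complementary
    duplexes and a = 4 for non-self-complementary duplexes.
    total_strand_conc is the total strand concentration in mol/L.
    """
    ds = (dh - dg37) / T37
    a = 1.0 if self_complementary else 4.0
    tm_kelvin = dh / (ds + R_KCAL * math.log(total_strand_conc / a))
    return tm_kelvin - 273.15
```

Both functions take the free energy change at 37°C and the enthalpy change in kcal/mol, which are the two quantities a parameter set with enthalpy values can supply for a given structure.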

Figure 1.

The webpage hierarchy of the NNDB. This figure illustrates the page hierarchy by following the linked pages down through the 1999 parameters and down to the hairpin loop pages. Note that there are five example calculations for hairpin loops to illustrate the separate sequence-dependent rules that are used depending on the specific loop.

Figure 2.

An RNA secondary structure illustrating the types of features included in nearest neighbor parameter sets. This figure appears on the help page of the website. Loops are composed of nucleotides not in canonical pairs. Hairpin loops have one exiting helix. Internal and bulge loops have two exiting helices. Internal loops have nucleotides not in canonical pairs on each of two strands, but bulge loops have nucleotides not in canonical pairs on only one strand. Multibranch loops, also called helical junctions, have three or more exiting helices. Exterior loops contain the ends of sequences and one or more exiting helices. Pseudoknots are canonical pairs connecting loop regions closed by other helices. Formally, a pseudoknot occurs when there are at least two pairs, with indices i paired to j and i′ paired to j′, that satisfy the condition i < i′ < j < j′. The pseudoknot helix is often considered to be composed of the fewest pairs that need to be removed to relieve the pseudoknot (19). In this structure, the tan nucleotides are in pairs that could be removed to relieve the pseudoknot.
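The formal pseudoknot condition in this caption can be checked directly on a list of base pairs. The sketch below is only an illustration of that condition; the function name and the representation of a structure as a list of index pairs are assumptions, not part of the database.

```python
def has_pseudoknot(pairs):
    """Return True if two pairs (i, j) and (k, l) satisfy i < k < j < l.

    `pairs` is an iterable of 2-tuples of nucleotide indices; each
    nucleotide is assumed to appear in at most one pair.
    """
    # Order every pair as (5' index, 3' index) and sort by 5' index.
    ordered = sorted((min(p), max(p)) for p in pairs)
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if k > j:
                # Pairs are sorted by 5' index, so no later pair can
                # open inside the span (i, j).
                break
            if i < k < j < l:
                return True
    return False
```

Nested pairs such as (1, 10) and (2, 9) do not satisfy the condition, whereas (1, 10) and (5, 15) do.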

For each set of parameters, a first page introduces the available parameters, which vary from set to set. For example, the 1999 RNA rules predict only folding free energy changes (4), but the 2004 rules can be used to predict both folding free energy and enthalpy changes (2,5). For each structural feature, a page defines the basic equations and provides links to parameter values (in plain text and HTML), references and tutorial pages (e.g. Figure 3). The number of tutorials varies from feature to feature; the set of tutorials is designed to cover each type of rule that can be encountered in practice. For example, the Watson–Crick helix parameters are covered with two tutorials, one for self-complementary and one for non-self-complementary strands. Terminal AU base pairs receive a free energy and enthalpy change penalty (3), and the two tutorials also demonstrate how this changes the calculation: the self-complementary duplex example has two terminal AU pairs, whereas the non-self-complementary example has none.
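The structure of such a helix calculation can be sketched as follows. This is a hedged illustration, not a tutorial from the database: it assumes a model with an initiation term, one stacking term per pair of adjacent base pairs, a penalty per terminal AU pair and a symmetry term for self-complementary duplexes. All numbers are placeholders chosen for easy arithmetic, not values from the NNDB tables, and the function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_dg37(strand, stack, initiation, terminal_au, symmetry):
    """Sum nearest neighbor terms for `strand` paired to its exact complement.

    `strand` is read 5' to 3' and must contain at least two nucleotides.
    """
    total = initiation
    for x in range(len(strand) - 1):
        step = strand[x:x + 2]
        # A stack read along the top strand is the same stack read along
        # the bottom strand, so both spellings share one table entry.
        total += stack[min(step, reverse_complement(step))]
    # One penalty for each helix end closed by an AU pair.
    total += terminal_au * sum(strand[end] in "AU" for end in (0, -1))
    if strand == reverse_complement(strand):
        total += symmetry  # applies to self-complementary duplexes only
    return total


# Placeholder numbers for illustration only; NOT values from the NNDB tables.
PLACEHOLDER_STACK = dict.fromkeys(
    ["AA", "AU", "UA", "AG", "CA", "AC", "GA", "CG", "CC", "GC"], -2.0)
PLACEHOLDER_TERMS = dict(initiation=4.0, terminal_au=0.5, symmetry=0.4)

# Self-complementary strand with two terminal AU pairs.
dg_self = duplex_dg37("AGCGCU", PLACEHOLDER_STACK, **PLACEHOLDER_TERMS)
# Non-self-complementary strand with no terminal AU pairs.
dg_non_self = duplex_dg37("GCACG", PLACEHOLDER_STACK, **PLACEHOLDER_TERMS)
```

To obtain real predictions, the placeholder table and terms would be replaced by the values of a single parameter set, taken from the plain text tables for that set.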

Figure 3.

An example tutorial from the database. This tutorial demonstrates the prediction of folding free energy change for a hairpin loop of six unpaired nucleotides using the 2004 parameters (2,3).

The individual pages are designed for ease of navigation and clarity. Individual pages above the level of value tables have a top banner, a left navigation bar that allows the user to navigate back up the hierarchy to any level above, and a bottom bar with the date of last editing. For pages edited after the database went online, previous versions of the page are available through this bottom bar. To facilitate indexing by search engines, every page has a descriptive title that includes the set of parameters to which the page belongs (if applicable).

WEBSITE CONTENT

The first release of the NNDB contains the RNA folding rules assembled in 1999 and 2004 (2–5). These rules represent the most recent set of parameters and a prior set that is widely used in software packages. Because folding rules are derived to work as a set, the two versions of rules and values should not be mixed, and the website hierarchy reinforces this separation.

The website is designed to be expandable to additional sets of parameters. It is anticipated, for example, that additional pages will be written to include nearest neighbor parameters for DNA folding (6,7) and for predicting RNA pseudoknot stabilities (15–18). The values derived by re-estimating the 1999 parameter set using the set of known RNA secondary structures will also be included (14).

DISCUSSION

The NNDB is designed to provide a convenient location for assembling parameter sets for predicting the stability of nucleic acid secondary structures. It is modular in design, which facilitates its future expansion to contain additional parameter sets. Furthermore, the web format makes it feasible to provide extensive tutorials for utilizing the parameters, which is generally not possible in print.

FUNDING

The creation of the NNDB was supported by United States National Institutes of Health grants GM076485 to D.H.M. and GM22939 to D.H.T. Funding for open access charge: United States National Institutes of Health.

REFERENCES

1. Tinoco, Borer, Dengler, Levin, Uhlenbeck, Crothers, Gralla. Improved estimation of secondary structure in ribonucleic acids. Nat. New Biol. (1973) 246.

2. Mathews, Disney, Childs, Schroeder, Zuker, Turner. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc. Natl Acad. Sci. USA (2004) 101, 7287–7292.

3. Xia, SantaLucia, Burkard, Kierzek, Schroeder, Jiao, Cox, Turner. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson–Crick pairs. Biochemistry (1998) 14719–14735.

4. Mathews, Sabina, Zuker, Turner. Expanded sequence dependence of thermodynamic parameters provides improved prediction of RNA secondary structure. J. Mol. Biol. (1999) 288, 911–940.

5. Turner, Mathews. A set of nearest neighbor parameters for predicting the enthalpy change of RNA secondary structure formation. Nucleic Acids Res. (2006) 4912–4924.

6. SantaLucia. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc. Natl Acad. Sci. USA (1998) 1460–1465.

7. SantaLucia, Hicks. The thermodynamics of DNA structural motifs. Annu. Rev. Biophys. Biomol. Struct. (2004) 415–440.

8. Zuker. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. (2003) 3406–3415.

9. Zuker, Mathews, Turner. In: RNA Biochemistry and Biotechnology (Barciszewski, Clark BFC, eds). (1999) Boston: Kluwer Academic Publishers.

10. Hofacker, Fontana, Stadler, Bonhoeffer, Tacker, Schuster. Fast folding and comparison of RNA secondary structures. Monatsh. Chem. (1994) 125, 167–168.

11. Andronescu, Aguirre-Hernandez, Condon, Hoos. RNAsoft: a suite of RNA secondary structure prediction and design software tools. Nucleic Acids Res. (2003) 3416–3422.

12. Ding, Chan, Lawrence. Sfold web server for statistical folding and rational design of nucleic acids. Nucleic Acids Res. (2004) W135–W141.

13. Woods, Batzoglou. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics (2006) e90–e98.

14. Andronescu, Condon, Hoos, Mathews, Murphy. Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics (2007) i19–i28.

15. Dirks, Pierce. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. (2003) 1664–1677.

16. Gultyaev, van Batenburg FHD, Pleij CWA. An approximation of loop free energy values of RNA H-pseudoknots. RNA (1999) 609–617.

17. Cao, Chen. Predicting RNA pseudoknot folding thermodynamics. Nucleic Acids Res. (2006) 2634–2652.

18. Cao, Chen. Predicting structures and stabilities for H-type pseudoknots with interhelix loops. RNA (2009) 696–706.

19. Smit, Rother, Heringa, Knight. From knotted to nested RNA structures: a variety of computational methods for pseudoknot removal. RNA (2008) 410–416.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
