DOMIRE: a web server for identifying structural domains and their neighbors in proteins

Abstract

Summary: The DOMIRE web server implements a novel, automatic procedure for assigning protein structural domains, based on 3D substructures of the query protein that are also found in structures of a non-redundant protein database. These common 3D substructures are transformed into a co-occurrence matrix that offers a global view of the protein domain organization. Three different algorithms are employed to define structural domain boundaries from this co-occurrence matrix. For each query, a list of structural neighbors is provided together with their alignments. By displaying the structural domain organization of a protein, DOMIRE can be a useful tool for defining protein common cores and for unravelling the evolutionary relationships between different proteins.

Availability: http://genome.jouy.inra.fr/domire

Contact: [email protected]

1 INTRODUCTION

The modular nature of proteins is well established. Protein modules were first described at the level of structure, based on their compactness and folding properties (Wetlaufer, 1973); their evolutionary role was later substantiated by comparing amino acid sequences, following the pioneering work of Doolittle (1985). Databases such as ProDom (http://prodom.prabi.fr/prodom/current/html/home.php) or Pfam (http://pfam.sanger.ac.uk/) provide domain definitions at the level of amino acid sequences. Databases such as CATH (http://www.cathdb.info/) and SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/index.html) define domains at the level of 3D structures. Domains are exchangeable segments of amino acids that retain their 3D structure and molecular function. Domain identification is thus an important tool for many studies of protein 3D structure or evolution.

Recently, we applied the protein structure comparison program VAST (Gibrat et al., 1996; Madej et al., 1995) and found that the recurrence of small common 3D substructures (typically four secondary structures) between the query protein and proteins of a non-redundant dataset was sufficient to define the boundaries of the domains (Tai et al., 2011). The methodology described in the latter paper is now available as an online server named DOMIRE, for DOMain Identification from REcurrence. The server also provides, for each query, a list of structural neighbors with their alignments.

2 METHODS

Domain definitions: VAST was used with very liberal cut-offs (Pcli ≥−10 and rmsd ≤5 Å) to collect a maximum number of common 3D substructures between the query protein and target proteins of the non-redundant dataset. Unaligned regions of <40 residues between two aligned secondary structures were also included, giving rise to padded Locally Similar Structural Pieces (pLSSPs). These pLSSPs were mapped onto the query protein, resulting in an alignment matrix which was then transformed into a co-occurrence N matrix (Fig. 1a) from which the domains were parsed by three different methods: PCM, SMF and SVD (see Tai et al., 2011 for details).
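The mapping step described above can be illustrated with a small sketch. It pads short gaps between aligned secondary-structure segments and then counts, for every residue pair, the number of pLSSPs that contain both residues. The segment representation, the function names `pad_segments` and `cooccurrence_matrix`, and the toy input are assumptions made for illustration only. This is not the DOMIRE implementation, and the PCM, SMF and SVD parsing steps are not shown.

```python
from itertools import combinations

MAX_GAP = 40  # gaps shorter than this between aligned segments are filled (value from the text)


def pad_segments(segments, max_gap=MAX_GAP):
    """Fill short gaps between aligned segments given as (start, end), 1-based inclusive.

    Returns the set of query residue numbers covered by the padded piece.
    """
    residues = set()
    ordered = sorted(segments)
    for idx, (start, end) in enumerate(ordered):
        residues.update(range(start, end + 1))
        if idx + 1 < len(ordered):
            next_start = ordered[idx + 1][0]
            gap = next_start - end - 1  # unaligned residues between the two segments
            if 0 < gap < max_gap:
                residues.update(range(end + 1, next_start))
    return residues


def cooccurrence_matrix(plssps, length):
    """N[i][j] = number of pLSSPs containing both residue i+1 and residue j+1."""
    n = [[0] * length for _ in range(length)]
    for residues in plssps:
        for r in residues:
            n[r - 1][r - 1] += 1  # diagonal: pLSSPs covering residue r
        for a, b in combinations(sorted(residues), 2):
            n[a - 1][b - 1] += 1
            n[b - 1][a - 1] += 1  # keep the matrix symmetric
    return n


# Toy query of 12 residues with two hypothetical pLSSPs.
plssps = [
    pad_segments([(1, 3), (5, 6)]),              # 1-residue gap is filled: residues 1..6
    pad_segments([(5, 6), (9, 12)], max_gap=2),  # 2-residue gap is not < 2: left open
]
N = cooccurrence_matrix(plssps, 12)
```

In this toy case residues 5 and 6 occur together in both pieces, so `N[4][5]` is 2, whereas residues 1 and 9 never co-occur and `N[0][8]` stays 0. Blocks of high counts along the diagonal are what a domain-parsing method would then look for.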

Fig. 1.

(a) N matrix of chain B of phenylalanyl-tRNA synthetase (1jjc), shown as a contour map, where N_ij is the number of pLSSPs in which residues i and j of the query protein are found together. This chain consists of six domains. For instance, the SMF algorithm finds the following domains: D1: (1–41, 151–196), D2: (42–150), D3: (197–398), D4: (399–484), D5: (485–676), D6: (677–785). Note that D1 is a segmented domain as defined in CATH and SCOP. This matrix, built from several tens of thousands of pLSSPs, gives a global view of the recurrence and of the domain organization. (b) An example of aligned structural neighbors for 1jjc chain B, sorted by decreasing percentage of aligned residues; shown here is a snapshot of 20 out of a few hundred, with their corresponding PDB names.
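The SMF assignment quoted in the caption can be written down as a list of segments per domain, which makes it easy to check domain sizes and to confirm that the segments tile the chain. The sketch below is illustrative only: the dictionary layout and the helper names `domain_sizes` and `is_partition` are assumptions, and the chain range 1–785 is inferred from the listed boundaries rather than stated in the text.

```python
# SMF domain boundaries for 1jjc chain B, copied from the caption of Fig. 1a.
smf_domains = {
    "D1": [(1, 41), (151, 196)],
    "D2": [(42, 150)],
    "D3": [(197, 398)],
    "D4": [(399, 484)],
    "D5": [(485, 676)],
    "D6": [(677, 785)],
}


def domain_sizes(domains):
    """Number of residues in each domain (segments are 1-based, inclusive)."""
    return {
        name: sum(end - start + 1 for start, end in segments)
        for name, segments in domains.items()
    }


def is_partition(domains, first, last):
    """True if the segments cover first..last exactly once, with no gaps or overlaps."""
    ordered = sorted(seg for segments in domains.values() for seg in segments)
    expected = first
    for start, end in ordered:
        if start != expected or end < start:
            return False
        expected = end + 1
    return expected == last + 1


sizes = domain_sizes(smf_domains)
segmented = [name for name, segments in smf_domains.items() if len(segments) > 1]
covers_chain = is_partition(smf_domains, 1, 785)  # 1..785 is an inferred range
```

With these boundaries D1 has 87 residues spread over two segments, the six domains add up to 785 residues, and `segmented` contains only D1, in line with the remark that D1 is a segmented domain.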


Structural neighbors: from the list of pLSSPs used to build the N matrix, some targets are highlighted when they fulfil the following two criteria: (i) the length of the target pLSSP amounts to ≥80% of the target length and (ii) >40% of the target length is aligned by VAST to the query in the corresponding common 3D substructure. In other words, this region of the query protein can be extensively aligned with most of the target 3D structure in the PDB. These targets define the structural neighbors. In addition to the list of these targets, a graphic representation of their alignments along the query is provided (Fig. 1b).
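The two selection criteria can be expressed as a simple filter. The sketch below is a minimal illustration under stated assumptions: the `Hit` record, its field names and the toy values are hypothetical, and ranking by the fraction of aligned target residues is one possible reading of the ordering used in Fig. 1b. Integer arithmetic is used for the percentage tests to avoid floating-point edge cases at the thresholds.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target: str         # hypothetical target identifier
    target_length: int  # residues in the target chain
    plssp_length: int   # target residues spanned by the padded piece (pLSSP)
    aligned: int        # target residues aligned to the query within that piece


def is_structural_neighbor(hit, min_span_pct=80, min_aligned_pct=40):
    """Apply criteria (i) span >= 80% and (ii) aligned > 40% of the target length."""
    span_ok = 100 * hit.plssp_length >= min_span_pct * hit.target_length
    aligned_ok = 100 * hit.aligned > min_aligned_pct * hit.target_length
    return span_ok and aligned_ok


def rank_neighbors(hits):
    """Keep structural neighbors, sorted by decreasing fraction of aligned residues."""
    kept = [h for h in hits if is_structural_neighbor(h)]
    return sorted(kept, key=lambda h: h.aligned / h.target_length, reverse=True)


# Toy hits with made-up values.
hits = [
    Hit("t1", target_length=100, plssp_length=85, aligned=55),   # passes both criteria
    Hit("t2", target_length=200, plssp_length=150, aligned=120), # span is only 75%
    Hit("t3", target_length=100, plssp_length=90, aligned=40),   # exactly 40%, not > 40%
    Hit("t4", target_length=50, plssp_length=45, aligned=30),    # passes, 60% aligned
]
neighbors = [h.target for h in rank_neighbors(hits)]
```

Here only `t1` and `t4` are retained, with `t4` ranked first because a larger fraction of its length is aligned.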

DOMIRE input/output: DOMIRE takes as input a single protein chain, specified by its PDB accession code. Alternatively, the user can upload a coordinate file in PDB format. When the job is completed, the user receives an email with a link to a web page displaying three interactive 3D representations of the query with colored domains, one for each domain assignment method, using the Jmol applet. The page also shows the N matrix as a heat map and a contour map (Fig. 1a), as well as the alignments of the structural neighbors on the query protein (Fig. 1b). These results remain available online for 1 month and can be downloaded as a tarball.

3 DISCUSSION

On a benchmark of 128 chains, using the SCOP and/or CATH classifications as gold standards, the SMF and SVD algorithms provided results similar to those of the PUU server but less close to those of Domain Parser or PDP (Tai et al., 2011). PCM performs better than SMF and SVD for chains having three or more domains. For that reason, the web site provides the results of all three algorithms, whose boundaries may differ.

The criterion for selecting the structural neighbors of 40% or more aligned residues is based on the fact that VAST alignments involve essentially the secondary structures and that on average secondary structures amount to ∼50% of the residues of a protein. The criterion of at least 80% of the target aligned with the query tends to select regions of the query that can be aligned with whole structures in the PDB. Some of them are homologues of the query or of its domains (to be published).

4 CONCLUSION

In addition to offering an automatic partitioning of protein structures into domains with performance comparable to that of the best existing programs, DOMIRE identifies structural neighbors and provides their alignments. This can be useful for identifying remote homologues. DOMIRE thus provides a tool to analyze the common 3D substructures in polypeptide chains, shedding light on their evolution.

ACKNOWLEDGEMENT

We are grateful to the INRA MIGALE platform for providing computational resources.

Funding: Intramural Research Program of the Center for Cancer Research, National Cancer Institute, and of the Division of Computational Bioscience, Center for Information Technology, NIH, USA; supported in part by the Institut National de la Recherche Agronomique, France.

Conflict of Interest: none declared.

REFERENCES

Doolittle R.F. The genealogy of some recently evolved vertebrate proteins. Trends Biochem. Sci. 1985 (pg. 233–237)

Gibrat J.F. et al. Surprising similarities in structure comparison. Curr. Opin. Struct. Biol. 1996 (pg. 377–385)

Madej et al. Threading a database of protein cores. Proteins 1995 (pg. 356–369)

Tai C.-H. et al. Protein domain assignment from the recurrence of locally similar structures. Proteins 2011 (pg. 853–866)

Wetlaufer D.B. Nucleation, rapid folding and globular intra chain regions in proteins. Proc. Natl Acad. Sci. USA 1973 (pg. 697–701)

Author notes

Associate Editor: Anna Tramontano
