Fold-LTR-TCP: protein fold recognition based on triadic closure principle

Table 1

Performance of TCP, PageRank and HITS on five different protein similarity networks, evaluated on the LE benchmark dataset via 2-fold cross-validation

Methods	Accuracy
PageRank (LTR)^a	72.5%
HITS (LTR)^a	70.1%
TCP (LTR)^a	73.2%
PageRank (DF)^b	71.3%
HITS (DF)^b	69.4%
TCP (DF)^b	72.6%
PageRank (CS)^c	4.9%
HITS (CS)^c	10.1%
TCP (CS)^c	20.9%
PageRank (CL)^d	9.3%
HITS (CL)^d	12.5%
TCP (CL)^d	19.3%
PageRank (GK)^e	6.5%
HITS (GK)^e	10.3%
TCP (GK)^e	11.2%

^a the network based on LTR calculated by LTR;
^b the network based on DF calculated by DeepFRpro [34];
^c the network based on CS calculated by cosine function [18];
^d the network based on CL calculated by correlation function [18];
^e the network based on GK calculated by Gaussian kernel function [18].

To illustrate the computational efficiency of the Fold-LTR-TCP predictor, we analyzed its time complexity. Fold-LTR-TCP comprises two algorithms: the LambdaMART algorithm in the LTR model, and TCP. The time complexity of LambdaMART is $O(TNM)$, where $T$ is the maximum number of iterations, $N$ is the number of query samples in each iteration and $M$ is the number of feedback proteins for each query sample. In TCP, the total number of query proteins in the protein similarity network is $N$ (cf. Equation 12), and the time complexity of TCP is $O(N^2)$. Therefore, the time complexity of the Fold-LTR-TCP predictor is $O(TNM + N^2)$. In this study, the LE dataset was split into two subsets of 159 and 162 sequences, and 2-fold cross-validation was used to evaluate the performance of Fold-LTR-TCP. The total training time was 17 340 s and the total test time was only 170 s. This experiment was performed on a computer with a 20-core 2.4 GHz CPU and 128 GB of memory, indicating that the Fold-LTR-TCP predictor is an efficient method with high accuracy.
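The figures in this paragraph can be put into more familiar units with a short Python sketch. It is only an illustration: the function name `cost_units` is hypothetical, constant factors are ignored, and the per-protein test time assumes that the reported 170 s covers all 321 sequences across both folds, which the text does not state explicitly.

```python
def cost_units(T, N, M):
    """Dominant-term operation count for O(TNM + N^2), constants ignored."""
    return T * N * M + N * N


# Sizes of the two LE subsets used for 2-fold cross-validation (from the text).
fold_sizes = (159, 162)
total_proteins = sum(fold_sizes)

# Reported wall-clock times in seconds (from the text).
train_seconds = 17340
test_seconds = 170

train_hours = train_seconds / 3600
# Assumption: the 170 s test time is spread over all 321 sequences.
test_seconds_per_protein = test_seconds / total_proteins

print(f"total proteins: {total_proteins}")
print(f"training time: {train_hours:.2f} h")
print(f"test time per protein: {test_seconds_per_protein:.2f} s")
```

Under these assumptions, training takes a little under five hours in total, while testing costs roughly half a second per protein.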

Comparison with other competing methods

The performance of the proposed Fold-LTR-TCP predictor was compared with that of other predictors, including PSI-Blast [7], HMMER [16], SAM-T98 [15], BLASTLINK [39], SSEARCH [51], SSHMM [52], THREADER [53], Fugue [54], RAPTOR [55], SPARKS [56], SPARKS-X [14], SP3 [57], SP4 [58], SP5 [59], BoostThreader [60], HH-fold [61], RFDN-Fold [62], DN-FoldS [62], DN-FoldR [62], MT-fold [63], HHpred [12], FFAS-3D [13], TA-fold [61], FOLDpro [64], DN-Fold [62], RNDN-Fold [62], RF-Fold [18], dRHP-PseRA [65], DeepFRpro [34] and DeepSVM-fold [35]. Table 2 shows the performance of these methods, from which we conclude that the Fold-LTR-TCP predictor achieves the best performance. Unlike Fold-LTR-TCP, all the other methods consider only the pairwise similarity between the query protein and the template protein. Fold-LTR-TCP is the first predictor to consider the global relationships among the query proteins and the template proteins based on the protein similarity network (Figure 3), which is the main reason for its better performance. These results further confirm that the Fold-LTR-TCP predictor is efficient for protein fold recognition and will facilitate studies of protein structures and functions.
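The contrast between pairwise ranking and network-based ranking can be illustrated with a toy Python sketch. This is not the scoring function of Fold-LTR-TCP (its Equation 12 is not reproduced here); it uses a generic common-neighbour score as a stand-in for the triadic closure idea, and the similarity values, node roles and function names are all made up for illustration.

```python
# Toy symmetric similarity network (hypothetical values).
# Node 0 is the query of interest, nodes 1-2 are other proteins,
# nodes 3-4 are candidate templates.
S = [
    [0.0, 0.9, 0.8, 0.2, 0.3],
    [0.9, 0.0, 0.7, 0.9, 0.1],
    [0.8, 0.7, 0.0, 0.8, 0.1],
    [0.2, 0.9, 0.8, 0.0, 0.0],
    [0.3, 0.1, 0.1, 0.0, 0.0],
]


def pairwise_rank(S, query, templates):
    """Rank templates by their direct similarity to the query only."""
    return sorted(templates, key=lambda t: S[query][t], reverse=True)


def closure_score(S, query, template):
    """Sum of two-step paths query -> k -> template over all other nodes k."""
    return sum(
        S[query][k] * S[k][template]
        for k in range(len(S))
        if k not in (query, template)
    )


def closure_rank(S, query, templates):
    """Rank templates by the strength of their shared neighbours with the query."""
    return sorted(templates, key=lambda t: closure_score(S, query, t), reverse=True)


print(pairwise_rank(S, 0, [3, 4]))  # direct similarity prefers template 4
print(closure_rank(S, 0, [3, 4]))   # shared neighbours prefer template 3
```

In this toy network the query is only weakly similar to template 3 directly, but both of its close neighbours are strongly similar to template 3, so the network-based score promotes it above template 4.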

Table 2

Performance comparison of Fold-LTR-TCP with other state-of-the-art methods on the LE benchmark dataset via 2-fold cross-validation

Methods	Accuracy	Source
PSI-Blast	4.0%	[39]
HMMER	4.4%	[39]
SAM-T98	3.4%	[39]
BLASTLINK	6.9%	[39]
SSEARCH	5.6%	[39]
SSHMM	6.9%	[39]
THREADER	14.6%	[39]
Fugue	12.5%	[64]
RAPTOR	25.4%	[64]
SPARKS	28.7%	[64]
SP3	30.8%	[64]
FOLDpro	26.5%	[64]
HHpred	25.2%	[62]
SP4	30.8%	[62]
SP5	37.9%	[62]
BoostThreader	42.6%	[62]
SPARKS-X	45.2%	[62]
RF-Fold	40.8%	[62]
DN-Fold	33.6%	[62]
RFDN-Fold	37.7%	[62]
DN-FoldS	33.3%	[62]
DN-FoldR	27.4%	[62]
FFAS-3D	35.8%	[61]
HH-fold	42.1%	[61]
TA-fold	53.9%	[61]
dRHP-PseRA	34.9%	[65]
MT-fold	59.1%	[65]
DeepFRpro	66.0%	[34]
DeepSVM-fold	67.3%	[35]
Fold-LTR-TCP	73.2%	This study

The bold values represent the proposed method achieving the top performance.
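The margin between the proposed method and the strongest competing method can be read off Table 2 with a few lines of Python. The accuracies below are copied from the table; the variable names are illustrative only.

```python
# Accuracies (%) copied from Table 2.
table2 = {
    "PSI-Blast": 4.0, "HMMER": 4.4, "SAM-T98": 3.4, "BLASTLINK": 6.9,
    "SSEARCH": 5.6, "SSHMM": 6.9, "THREADER": 14.6, "Fugue": 12.5,
    "RAPTOR": 25.4, "SPARKS": 28.7, "SP3": 30.8, "FOLDpro": 26.5,
    "HHpred": 25.2, "SP4": 30.8, "SP5": 37.9, "BoostThreader": 42.6,
    "SPARKS-X": 45.2, "RF-Fold": 40.8, "DN-Fold": 33.6, "RFDN-Fold": 37.7,
    "DN-FoldS": 33.3, "DN-FoldR": 27.4, "FFAS-3D": 35.8, "HH-fold": 42.1,
    "TA-fold": 53.9, "dRHP-PseRA": 34.9, "MT-fold": 59.1,
    "DeepFRpro": 66.0, "DeepSVM-fold": 67.3, "Fold-LTR-TCP": 73.2,
}

proposed = "Fold-LTR-TCP"
baselines = {name: acc for name, acc in table2.items() if name != proposed}

# Strongest competing method and the gap to it, in percentage points.
best_baseline = max(baselines, key=baselines.get)
margin = round(table2[proposed] - baselines[best_baseline], 1)

print(len(baselines), best_baseline, margin)  # 29 DeepSVM-fold 5.9
```

That is, Fold-LTR-TCP exceeds the best of the 29 competing methods in Table 2 by 5.9 percentage points.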

Besides the aforementioned methods, DeepSF [49] and SVM-fold [61] are two state-of-the-art methods. However, they cannot be directly compared with our method and the other related methods on the LE dataset, for two reasons: (i) they were not evaluated on the LE dataset; and (ii) they contain several hyper-parameters that must be optimized for each dataset, and their source code is unavailable. Our method can nevertheless be compared with them indirectly. Evaluated on two benchmark datasets derived from the SCOP database, DeepSF achieved accuracies of 75.3% and 73% as reported in [49]. Fold-LTR-TCP achieved an accuracy of 73.2% on the LE benchmark dataset, which was also derived from the SCOP database. Therefore, we conclude that Fold-LTR-TCP is better than or at least comparable with DeepSF. Fold-LTR-TCP was also directly compared with TA-fold (Table 2). TA-fold combines SVM-fold and HH-fold and outperformed SVM-fold as reported in [61]. Because Fold-LTR-TCP outperformed TA-fold (Table 2), we conclude that Fold-LTR-TCP is better than SVM-fold.

Conclusion

In this study, we proposed a new computational predictor called Fold-LTR-TCP for protein fold recognition by combining LTR and TCP. The Fold-LTR-TCP predictor is a general method for detecting different protein fold types. Because protein folds with more member proteins provide more training samples, Fold-LTR-TCP will achieve better performance on these folds. Experimental results showed that Fold-LTR-TCP outperformed other competing methods. Fold-LTR-TCP has the following advantages: (i) it incorporates various features into the framework of the LTR model in a supervised manner, treating protein fold recognition as a ranking task; and (ii) the ranking list generated by the LTR model is further improved by TCP, which considers the one-to-multiple relationship between a query protein and multiple template proteins. To the best of our knowledge, Fold-LTR-TCP is the first approach to use the global relationships among the query proteins and all template proteins for protein fold recognition. We anticipate that the proposed framework will have many potential applications where the global interactions among biological sequences are required, such as protein complex identification [66], circRNA–disease association prediction [67] and microRNA–disease identification [68].

Key Points

Protein fold recognition is a very important problem in the field of protein structure and function studies. Although the existing computational predictors have contributed to the development of this field, they fail to accurately detect protein folds because of the low sequence similarities among proteins in the same fold.
This study presents a new predictor called Fold-LTR-TCP for protein fold recognition. The protein similarity network describing the relationships among proteins was constructed based on the LTR model, and TCP was then performed on this network, considering the one-to-multiple relationship between a query protein and multiple template proteins for accurate protein fold recognition.
Experimental results on the LE benchmark dataset showed that the proposed Fold-LTR-TCP outperformed 29 existing state-of-the-art methods for protein fold recognition. To the best of our knowledge, Fold-LTR-TCP is the first predictor to consider the relationships among proteins in the dataset for protein fold recognition, which is the main reason for its better performance.

Acknowledgements

The authors are very much indebted to the four anonymous reviewers, whose constructive comments were very helpful in strengthening the presentation of this article.

Funding

This work was supported by the National Natural Science Foundation of China (61672184, 61822306), Fok Ying-Tung Education Foundation for Young Teachers in the Higher Education Institutions of China (161063) and Scientific Research Foundation in Shenzhen (JCYJ20180306172207178).

Bin Liu, PhD, is a professor at the School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China. His expertise is in bioinformatics, natural language processing and machine learning.

Yulin Zhu

Yulin Zhu is a master's student at the School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China. His expertise is in bioinformatics.

Ke Yan

Ke Yan is a PhD candidate at the School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China. His expertise is in bioinformatics.

References

1. Chen JJ, Guo MY, Li SM, et al. ProtDec-LTR2.0: an improved method for protein remote homology detection by combining pseudo protein and supervised learning to rank. Bioinformatics 2017;33:3473–6.

2. Stroud RM. Introduction to protein-structure. Branden, C, Tooze, J. Science 1991;253:685–6.

3. Sander C, Marks D. Solutions to the computational protein folding problem. FASEB J 2018;32.

4. Wei L, Zou Q. Recent progress in machine learning-based methods for protein fold recognition. Int J Mol Sci 2016;17:2118.

5. Weston J, Kuang R, Leslie C, et al. Protein ranking by semi-supervised network propagation. BMC Bioinformatics 2006;7.

6. O'Driscoll A, Belogrudov V, Carroll J, et al. HBLAST: parallelised sequence similarity – a Hadoop MapReducable basic local alignment search tool. J Biomed Inform 2015;54:58–64.

7. Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol 1990;215:403–10.

8. Pearson WR. Searching protein-sequence libraries: comparison of the sensitivity and selectivity of the Smith–Waterman and Fasta algorithms. Genomics 1991;11:635–50.

9. Zou Q, Hu Q, Guo M, et al. HAlign: fast multiple similar DNA/RNA sequence alignment based on the centre star strategy. Bioinformatics 2015;31:2475–81.

10. Wan S, Zou Q. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing. Algorithms Mol Biol 2017;12:25.

11. Baldi P, Chauvin Y, Hunkapiller T, et al. Hidden Markov-models of biological primary sequence information. Proc Natl Acad Sci U S A 1994;91:1059–63.

12. Soding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 2005;33:W244–8.

13. Xu D, Jaroszewski L, Li ZW, et al. FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics 2014;30:660–7.

14. Carlson BE, Ostgaard N, Kochkin P, et al. Meter-scale spark X-ray spectrum statistics. J Geophys Res Atmos 2015;120:11191–202.

15. Karplus K, Barrett C, Hughey R. Hidden Markov models for detecting remote protein homologies. Bioinformatics 1998;14:846–56.

16. Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res 2011;39:W29–37.

17. Remmert M, Biegert A, Hauser A, et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 2012;9:173–5.

18. Jo T, Cheng JL. Improving protein fold recognition by random forest. BMC Bioinformatics 2014;15:S14.

19. Liu XQ, Wu QL, Pan WT. Sentiment classification of micro-blog comments based on Randomforest algorithm. Concurr Comput 2019;31.

20. Ding CH, Dubchak I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 2001;17:349–58.

21. Polat O, Dokur Z. Protein fold classification with grow-and-learn network. Turk J Electrical Eng Comp Sci 2017;25:1184–96.

22. Yan K, Fang X, Xu Y, et al. Protein fold recognition based on multi-view modeling. Bioinformatics 2019;35:2982–90.

23. Liu B, Chen J, Guo M, et al. Protein remote homology detection and fold recognition based on sequence-order frequency matrix. IEEE/ACM Trans Comput Biol Bioinform 2019;16:292–300.

24. Liu B, Wang X, Lin L, et al. A discriminative method for protein remote homology detection and fold recognition combining top-n-grams and latent semantic analysis. BMC Bioinformatics 2008;9:510.

25. Yan K, Xu Y, Fang X, et al. Protein fold recognition based on sparse representation based classification. Artif Intell Med 2017;79:1–8.

26. Zou Q, Xing P, Wei L, et al. Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA. RNA 2019;25:205–18.

27. Zhang Z, Zhao Y, Liao X, et al. Deep learning in omics: a survey and guideline. Brief Funct Genomics 2019;18:41–57.

28. Yu L, Sun X, Tian SW, et al. Drug and nondrug classification based on deep learning with various feature selection strategies. Curr Bioinform 2018;13:253–9.

29. Lv ZB, Ao CY, Zou Q. Protein function prediction: from traditional classifier to deep learning. Proteomics 2019;19:2.

30. Peng L, Peng MM, Liao B, et al. The advances and challenges of deep learning application in biological big data processing. Curr Bioinform 2018;13:352–9.

31. Wei L, Su R, Wang B, et al. Integration of deep feature representations and handcrafted features to improve the prediction of N6-methyladenosine sites. Neurocomputing 2019;324:3–9.

32. Liu B, Li S. ProtDet-CCH: protein remote homology detection by combining long short-term memory and ranking methods. IEEE/ACM Trans Comput Biol Bioinform 2019;16:1203–10.

33. Li C-C, Liu B. MotifCNN-fold: protein fold recognition based on fold-specific features extracted by motif-based convolutional neural networks. Brief Bioinform. doi: 10.1093/bib/bbz133.

34. Zhu JW, Zhang HC, Li SC, et al. Improving protein fold recognition by extracting fold-specific features from predicted residue–residue contacts. Bioinformatics 2017;33:3749–57.

35. Liu B, Li C, Yan K. DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks. Brief Bioinform. doi: 10.1093/bib/bbz098.

36. Liu B, Jiang S, Zou Q. HITS-PR-HHblits: protein remote homology detection by combining PageRank and hyperlink-induced topic search. Brief Bioinform. doi: 10.1093/bib/bby104.

37. Trotman A. Learning to rank. Inform Retrieval 2005;8:359–81.

38. Kovacs IA, Luck K, Spirohn K, et al. Network-based prediction of protein interactions. Nat Commun 2019;10:1240.

39. Lindahl E, Elofsson A. Identification of related proteins on family, superfamily and fold level. J Mol Biol 2000;295:613–25.

40. Chandonia JM, Hon G, Walker NS, et al. The ASTRAL compendium in 2004. Nucleic Acids Res 2004;32:D189–92.

41. Liu B, Chen J, Wang X. Application of learning to rank to protein remote homology detection. Bioinformatics 2015;31:3492–8.

42. Liu B, Zhu Y. ProtDec-LTR3.0: protein remote homology detection by incorporating profile-based features into learning to rank. IEEE Access 2019;7:102499–507.

43. Liu B, Chen JJ, Wang XL. Protein remote homology detection by combining Chou's distance-pair pseudo amino acid composition and principal component analysis. Brief Bioinform. doi: 10.1093/bib/bbx165.

44. Liu B. BioSeq-Analysis: a platform for DNA, RNA and protein sequence analysis based on machine learning approaches. Brief Bioinform. doi: 10.1093/bib/bbx165.

45. Dong QW, Zhou SG, Guan JH. A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation. Bioinformatics 2009;25:2655–62.

46. Liu B, Gao X, Zhang H. BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA, and protein sequences at sequence level and residue level based on machine learning approaches. Nucleic Acids Res. doi: 10.1093/nar/gkz740.

47. Liu B, Wang XL, Lin L, et al. A discriminative method for protein remote homology detection and fold recognition combining top-n-grams and latent semantic analysis. BMC Bioinformatics 2008;9.

48. Mulekar MS, Brown CS, Mulekar MS, et al. Distance and Similarity Measures, 2014.

49. Hou J, Adhikari B, Cheng JL. DeepSF: deep convolutional neural network for mapping protein sequences to folds. Bioinformatics 2018;34:1295–303.

50. Drago F, Myszkowski K, Annen T, et al. Adaptive logarithmic mapping for displaying high contrast scenes. Comput Graph Forum 2003;22:419–26.

51. Pearson WR. Comparison of methods for searching protein-sequence databases. Protein Sci 1995;4:1145–60.

52. Hargbo J, Elofsson A. Hidden Markov models that use predicted secondary structures for fold recognition. Proteins 1999;36:68–76.

53. Jones DT, Taylor WR, Thornton JM. A new approach to protein fold recognition. Nature 1992;358:86–9.

54. Shi JY, Blundell TL, Mizuguchi K. FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 2001;310:243–57.

55. Xu J, Li M, Kim D, et al. RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol 2003;1:95–117.

56. Zhou H, Zhou Y. Single-body residue-level knowledge-based energy score combined with sequence-profile and secondary structure information for fold recognition. Proteins 2004;55:1005–13.

57. Zhou HY, Zhou YQ. Fold recognition by combining sequence profiles derived from evolution and from depth-dependent structural alignment of fragments. Proteins 2005;58:321–8.

58. Liu S, Zhang C, Liang SD, et al. Fold recognition by concurrent use of solvent accessibility and residue depth. Proteins 2007;68:636–45.

59. Zhang W, Liu S, Zhou YQ. SP5: improving protein fold recognition by using torsion angle profiles and profile-based gap penalty model. PLoS One 2008;3:e2325.

60. Peng J, Xu JB. Boosting protein threading accuracy. Res Comput Mol Biol Proc 2009;5541:31+.

61. Xia JQ, Peng ZL, Qi DW, et al. An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier. Bioinformatics 2017;33:863–70.

62. Jo T, Hou J, Eickholt J, et al. Improving protein fold recognition by deep learning networks. Sci Rep 2015;5:17573.

64. Cheng JL, Baldi P. A machine learning information retrieval approach to protein fold recognition. Bioinformatics 2006;22:1456–63.

65. Chen JJ, Long R, Wang XL, et al. dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation. Sci Rep 2016;6.

66. Wu Z, Liao Q, Liu B. A comprehensive review and evaluation of computational methods for identifying protein complexes from protein–protein interaction networks. Brief Bioinform. doi: 10.1093/bib/bbz085.

67. Wei H, Liu B. iCircDA-MF: identification of CircRNA–disease associations based on matrix factorization. Brief Bioinform. doi: 10.1093/bib/bbz057.

68. Zou Q, Li J, Song L, et al. Similarity computation strategies in the microRNA–disease network: a survey. Brief Funct Genomics 2016;15:55–64.
