Tianyang Zhang, Qiang Tang, Fulei Nie, Qi Zhao, Wei Chen, DeepLncPro: an interpretable convolutional neural network model for identifying long non-coding RNA promoters, Briefings in Bioinformatics, Volume 23, Issue 6, November 2022, bbac447, https://doi.org/10.1093/bib/bbac447
Abstract
Long non-coding RNA (lncRNA) plays important roles in a series of biological processes. The transcription of lncRNA is regulated by its promoter. Hence, accurate identification of lncRNA promoters will help to understand their regulatory mechanisms. Since experimental techniques remain time consuming for genome-wide promoter identification, developing computational tools to identify promoters is necessary. However, only a few computational methods have been proposed for lncRNA promoter prediction, and their performance still has room for improvement. In the present work, a convolutional neural network based model, called DeepLncPro, was proposed to identify lncRNA promoters in human and mouse. Comparative results demonstrated that DeepLncPro was superior to both state-of-the-art machine learning methods and existing models for identifying lncRNA promoters. Furthermore, DeepLncPro is able to extract and analyze transcription factor binding motifs from lncRNAs, which makes it an interpretable model. These results indicate that DeepLncPro can serve as a powerful tool for identifying lncRNA promoters. An open-source tool for DeepLncPro is provided at https://github.com/zhangtian-yang/DeepLncPro.
Introduction
Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nucleotides [1]. Although they lack protein-coding potential, lncRNAs play important roles in various biological processes [2, 3], such as the regulation of cell cycle, apoptosis, transcription, splicing, translation, genomic rearrangement and genetic imprinting [4–9]. Furthermore, a growing body of evidence has demonstrated that lncRNAs are also associated with human diseases and even cancer development [10, 11]. For example, the abnormal expression of lncRNA is associated with the development of cardiovascular diseases and Huntington’s disease [12, 13]. Hence, more research on lncRNAs is needed to reveal their biological functions [14]. Knowing the origins of lncRNAs is the first step towards illustrating their regulatory roles. A promoter is a regulatory element located upstream of the transcription start site (TSS) [15], which initiates and regulates the transcription of RNA through the binding of transcription factors [16, 17]. Therefore, accurately identifying the promoter of a lncRNA will help not only to determine its origins, but also to understand its regulatory mechanisms.
Experimental methods for identifying promoters are mainly mutation analysis and immunoprecipitation analysis [18, 19]. Although these methods are the gold standard for determining promoters, they are still time consuming and costly for genome-wide analysis [20–22]. Fortunately, a large amount of data has been generated from these experiments, especially for Homo sapiens and Mus musculus, which is a valuable resource for developing computational methods for identifying lncRNA promoters. In 2019, Alam et al. proposed a deep learning based method, called DeepCNPP [23], for identifying human lncRNA promoters. However, neither a web-server nor source code was provided for DeepCNPP, which hindered its application in lncRNA promoter identification. Later on, Tang et al. proposed a freely accessible web-server, ncPro-ML, for identifying lncRNA promoters in human and mouse [24]. Unfortunately, ncPro-ML is based only on hand-crafted features and lacks biological interpretability. In summary, both DeepCNPP and ncPro-ML fall short in interpreting the model from a biological perspective, and their predictive accuracies for identifying lncRNA promoters still have room for improvement. Therefore, there is a need to develop interpretable models to accurately identify lncRNA promoters.
We therefore proposed a convolutional neural network (CNN)-based method, called DeepLncPro, to identify lncRNA promoters in human and mouse. In DeepLncPro, the sequences were encoded by using one-hot, nucleotide chemical properties and dinucleotide physical–chemical properties. To obtain a robust model, the hyperparameters of the CNN were optimized. Evaluations on the independent test dataset showed that DeepLncPro outperformed state-of-the-art machine learning methods. In addition, comparative results demonstrated that DeepLncPro is superior to existing methods for predicting lncRNA promoters. DeepLncPro also offers biological interpretability: it is capable of capturing sequence motifs, which can be matched to transcription factor binding motifs. To help researchers use DeepLncPro, a command-line version is available at https://github.com/zhangtian-yang/DeepLncPro. We expect that DeepLncPro will be helpful for the identification of lncRNA promoters.
Materials and methods
Dataset
In this study, we constructed the benchmark dataset in a similar way to our previous work [24]. The promoter sequences of lncRNA from human and mouse were obtained from the Eukaryotic Promoter Database (EPD) [25]. Considering that RNA polymerases usually bind in the upstream regions of the TSS [26], positive samples were taken around the TSS and contained more upstream regions. Negative samples were taken from the downstream regions away from the TSS. Considering that core promoter elements are usually located in the upstream region of the TSS and that the length of the upstream region may have an impact on the model performance, we constructed seven datasets based on sequence lengths from 61 to 301 bp with a step of 40 bp. For a dataset with sequences of n bp length, positive samples were extracted from (n-20) bp upstream of the TSS to 20 bp downstream of the TSS. Negative samples were extracted in the same way, but 1000 bp downstream of the TSS. The ratio of positive to negative samples was kept at 1:1. For each dataset, 60% of the samples were randomly selected as training data to train the model, 20% were used as validation data to tune the model parameters, and the remaining 20% were used as test data to evaluate the final model (Figure 1A). The details of the datasets are shown in Table 1.
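The sampling scheme above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes a forward-strand TSS given as the 0-based index of position +1 (EPD-style numbering with no position 0), it reads "1000 bp downstream" as shifting the whole window by 1000 bp, and the function names `sample_windows` and `split_60_20_20` are hypothetical.

```python
import random

def sample_windows(tss, n, neg_offset=1000):
    """Return (positive, negative) half-open [start, end) windows of length n.

    tss is the 0-based index of position +1 (assumed convention).
    The positive window spans (n - 20) bp upstream to 20 bp downstream.
    The negative window is the same window shifted downstream (assumption).
    """
    start = tss - (n - 20)
    end = tss + 20
    positive = (start, end)
    negative = (start + neg_offset, end + neg_offset)
    return positive, negative

def split_60_20_20(samples, seed=0):
    """Randomly split samples into training, validation and test parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(0.6 * len(items))
    n_val = round(0.2 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# The seven sequence lengths: 61 to 301 bp with a step of 40 bp.
lengths = list(range(61, 302, 40))

pos, neg = sample_windows(tss=5000, n=181)
train, val, test = split_60_20_20(range(2339))
```

With 2339 samples per class, this split yields 1403 training samples and 468 each for validation and testing; how the authors rounded the partition sizes is not stated.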

Figure 1. An overview of DeepLncPro. (A) Datasets. The dataset contains 2339 positive and 2339 negative samples from human and 3077 positive and 3077 negative samples from mouse. Each sample was extracted at different lengths from 61 bp to 301 bp with a step of 40 bp. (B) Feature encoding. These samples were encoded by using three feature encoding methods. The encoded features were merged into a 13 × L matrix. (C) Framework of DeepLncPro. DeepLncPro was built based on a convolutional neural network. Each sample receives a prediction score ranging from 0 to 1. If the score is >0.5, the sequence is predicted as a lncRNA promoter; otherwise, as a non-lncRNA promoter.
| Name | Training positive | Training negative | Validation positive | Validation negative | Testing positive | Testing negative |
|---|---|---|---|---|---|---|
| Human | 1403 | 1403 | 468 | 468 | 468 | 468 |
| Mouse | 1846 | 1846 | 616 | 616 | 615 | 615 |
Feature representation algorithms
For convenience of description, a DNA sequence is denoted as S = D1D2…DL, where L is the length of the sequence and Di ∈ {A, T, G, C} represents the deoxynucleotide at the i-th position in the sequence. The one-hot, nucleotide chemical properties (NCP) and dinucleotide physical–chemical properties (DPCP) encodings were used to encode the samples in the dataset (Figure 1B).
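As a rough illustration of per-position encoding, the sketch below builds the one-hot and NCP rows of the feature matrix. It rests on assumptions not given in this section: the one-hot channel order (A, C, G, T) is arbitrary, and the NCP triples follow the commonly used (ring structure, functional group, hydrogen bond) convention. The DPCP rows are omitted because their property values are not listed here, and the function name `encode` is hypothetical.

```python
# Assumed channel order for one-hot: A, C, G, T.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Commonly used NCP convention (assumption):
# (purine ring, amino group, weak hydrogen bonding).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}

def encode(seq):
    """Return a 7 x L matrix: 4 one-hot rows followed by 3 NCP rows."""
    columns = [ONE_HOT[base] + NCP[base] for base in seq.upper()]
    # Transpose from L columns of 7 values to 7 rows of L values.
    return [list(row) for row in zip(*columns)]

matrix = encode("ACGT")
```

Stacking the DPCP rows underneath would extend this to the 13 × L matrix described in Figure 1B, presumably 4 + 3 + 6 rows.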
One-hot
NCP
Dinucleotide physicochemical properties (DPCP)
Model architectures
In recent years, CNNs have been widely used in biological sequence analysis [35–38]. In the present work, we also employed a CNN to build the DeepLncPro model. The implementation of DeepLncPro was based on the deep learning library PyTorch [39]. DeepLncPro contains two 1D convolutional layers with 24 filters of size 10, which were determined by hyperparameter optimization. Since the rectified linear unit (ReLU) passes positive input values unchanged [40], it was used as the activation function to mitigate the vanishing gradient problem. The framework of the proposed model DeepLncPro is shown in Figure 1C.
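To make the convolution step concrete, the following pure-Python sketch applies a single filter to a C × L feature matrix and follows it with ReLU. It assumes stride 1, no padding and no pooling, none of which are stated in this section, and the helper names are hypothetical. In PyTorch the same operation corresponds to `torch.nn.Conv1d` followed by a ReLU.

```python
def conv1d_relu(x, weights, bias=0.0):
    """Slide one filter over x and apply ReLU.

    x: C x L input matrix (list of rows).
    weights: C x K filter (list of rows).
    Returns L - K + 1 activations (stride 1, no padding; assumed).
    """
    channels, length = len(x), len(x[0])
    k = len(weights[0])
    out = []
    for start in range(length - k + 1):
        total = bias
        for c in range(channels):
            for j in range(k):
                total += weights[c][j] * x[c][start + j]
        out.append(max(0.0, total))  # ReLU keeps positive values only
    return out

def conv_output_length(length, kernel=10, layers=2):
    """Sequence length left after stacked unpadded convolutions."""
    for _ in range(layers):
        length = length - kernel + 1
    return length

activations = conv1d_relu([[1, 2, 3, 4]], [[-1, 1]])
remaining = conv_output_length(181)
```

Under these assumptions a 181 bp input shrinks to 172 positions after the first layer and 163 after the second.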
Hyper-parameter optimization and model selection
In order to obtain models with better performance and generalization capability, we performed hyper-parameter optimization. To make the training process more stable, the Adam algorithm [44] was applied to automatically determine the learning rate based on the batch gradient descent. The random search method was used to determine the hyperparameters including learning rate, number of neurons, size of convolutional layers and number of filters.
In the hyper-parameter optimization process, we first trained a basic model by selecting a set of hyperparameters within a reasonable range (see details in Supplementary Table S1 available online at http://bib.oxfordjournals.org/). Then, by keeping the other hyperparameters fixed, a certain hyperparameter was searched in the given range. According to the performance obtained from the validation dataset, an optimal hyperparameter was selected. This process was repeated until all hyperparameters were optimized. Once all hyperparameters were determined, they were used to train DeepLncPro again on the training and validation datasets. It should be pointed out that only the combination of hyperparameters with the highest accuracy in the validation set was retained.
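The one-hyperparameter-at-a-time procedure described above can be sketched as follows. The search ranges and the scoring function are placeholders: Supplementary Table S1 is not reproduced here, and `toy_validation_accuracy` merely stands in for training a model and measuring its accuracy on the validation set.

```python
def coordinate_search(base, grid, evaluate):
    """Tune each hyperparameter in turn while the others stay fixed."""
    best = dict(base)
    best_score = evaluate(best)
    for name, candidates in grid.items():
        for value in candidates:
            trial = dict(best, **{name: value})
            score = evaluate(trial)
            if score > best_score:
                best, best_score = trial, score
    return best, best_score

def toy_validation_accuracy(hp):
    # Hypothetical stand-in for "train, then score on the validation set".
    return 1.0 - abs(hp["filters"] - 24) / 100 - abs(hp["kernel"] - 10) / 100

base = {"filters": 8, "kernel": 4}                          # hypothetical start
grid = {"filters": [8, 16, 24, 32], "kernel": [4, 6, 10, 14]}  # hypothetical ranges
best, best_score = coordinate_search(base, grid, toy_validation_accuracy)
```

Because each hyperparameter is tuned with the others held fixed, this search is cheap but can miss interactions between hyperparameters, which a joint random search would explore.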
Performance evaluation
In addition, we also used the receiver operating characteristic (ROC) curve [46] and the area under the curve (AUC) as the threshold independent metrics to objectively evaluate the performances of DeepLncPro and existing methods.
Motif extraction
In order to make DeepLncPro interpretable, we used the same method as in deepRAM [41] to extract the motifs from its first convolutional layer. For each filter in the first convolutional layer, according to our preliminary test, we extracted sequence segments, which could activate the filter with the activation value greater than 65% of the filter’s maximum value. By stacking these segments, we computed the nucleotide frequencies and obtained the position weight matrix (PWM) which was considered as the local motif captured by DeepLncPro. Afterwards, the correlation between the PWM and the transcription factor binding motifs in the JASPAR [47] database was calculated by using TOMTOM [48].
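A minimal sketch of the segment-stacking step is given below. It assumes that position p of a filter's activation map covers the window starting at base p (stride 1, no padding), that the filter's maximum is taken over all supplied sequences, and that sequences contain only A, C, G and T. The function name `filter_pwm` is hypothetical, and the matrix built here holds plain nucleotide frequencies.

```python
def filter_pwm(sequences, activations, kernel, threshold=0.65):
    """Build a frequency matrix from segments that strongly activate a filter.

    sequences: list of DNA strings.
    activations[i][p]: activation of the filter on sequences[i][p:p + kernel].
    Returns one {base: frequency} dict per motif position.
    """
    peak = max(max(acts) for acts in activations)
    counts = [{base: 0 for base in "ACGT"} for _ in range(kernel)]
    n_segments = 0
    for seq, acts in zip(sequences, activations):
        for start, value in enumerate(acts):
            if value > threshold * peak:
                n_segments += 1
                for offset, base in enumerate(seq[start:start + kernel]):
                    counts[offset][base] += 1
    if n_segments == 0:
        return []
    return [{base: c[base] / n_segments for base in "ACGT"} for c in counts]

seqs = ["ACGTAC", "TTGTAA"]
acts = [[0.1, 1.0, 0.2, 0.0],   # "CGT" passes the 65% threshold
        [0.0, 0.3, 0.8, 0.1]]   # "GTA" passes the 65% threshold
pwm = filter_pwm(seqs, acts, kernel=3)
```

The resulting matrix can then be written in a motif format accepted by TOMTOM for comparison against JASPAR.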
Result and discussion
Effect of sequence length and encoding schemes on model performance
To determine the optimal sequence length and encoding scheme for predicting lncRNA promoters, we investigated their effects on model performance. To this end, we built different models based on combinations of sequence lengths and encoding schemes. To obtain a model with satisfactory generalizability, the training data from human and mouse were combined to train the models. For each model, the hyper-parameters were optimized according to the procedure introduced in the Hyper-parameter optimization and model selection section. The accuracies of the models for identifying human and mouse lncRNA promoters in the validation set are shown in Figure 2. The corresponding sensitivity, specificity and Matthews correlation coefficient are listed in Supplementary Tables S2 and S3 available online at http://bib.oxfordjournals.org/. The model based on a sequence length of 181 bp and the combination of the three encoding schemes obtained the best accuracies, 87.07% and 87.73% for identifying lncRNA promoters in human and mouse, respectively. Accordingly, DeepLncPro was developed for predicting lncRNA promoters in both human and mouse based on the optimal sequence length (181 bp), the combined encoding method and the best hyper-parameters (Supplementary Table S4 available online at http://bib.oxfordjournals.org/). In addition, we also evaluated models trained on data from either human or mouse alone and reported the results in Supplementary Figure S1 and Supplementary Tables S5 and S6 available online at http://bib.oxfordjournals.org/. The performances of these models were all lower than that of DeepLncPro.

Figure 2. Performance of the models based on different sequence lengths and encoding schemes. The vertical coordinate represents the sequence length ranging from 61 to 301 bp. The horizontal coordinate represents different encoding schemes, including one-hot, NCP, DPCP and their combinations. (A) The predictive accuracies of different models for identifying lncRNA promoters in human; (B) The predictive accuracies of different models for identifying lncRNA promoters in mouse.
Comparison with classical machine learning methods
Considering that machine learning (ML) methods are widely used in the identification of DNA sequence elements, we compared DeepLncPro with five classical ML methods, namely random forest (RF), logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost). The three input matrices of DeepLncPro were flattened into a 13L-dimensional vector and used as the input of RF, LR, KNN, SVM and XGBoost. The evaluation metrics of DeepLncPro and the ML models for identifying human and mouse lncRNA promoters in the test dataset are listed in Table 2. DeepLncPro obtained the best accuracies of 86.22% and 86.83% for identifying human and mouse lncRNA promoters, respectively. We also plotted the ROC curves of DeepLncPro and the machine learning methods in Figure 3. DeepLncPro obtained the best AUCs of 0.928 and 0.931, and outperformed the other machine learning models for predicting lncRNA promoters in both human and mouse.
Table 2. Performance of DeepLncPro and different machine learning models for identifying lncRNA promoters in the test set
| Method | Species | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| RF | Human | 85.90 | 83.55 | 84.72 | 0.69 |
| RF | Mouse | 82.60 | 83.74 | 83.17 | 0.66 |
| LR | Human | 83.97 | 78.63 | 81.30 | 0.63 |
| LR | Mouse | 83.25 | 81.63 | 82.44 | 0.65 |
| KNN | Human | 81.41 | 62.18 | 71.79 | 0.44 |
| KNN | Mouse | 86.83 | 66.50 | 76.67 | 0.54 |
| SVM | Human | 82.91 | 79.49 | 81.20 | 0.62 |
| SVM | Mouse | 84.72 | 85.37 | 85.04 | 0.70 |
| XGBoost | Human | 85.26 | 83.33 | 84.29 | 0.69 |
| XGBoost | Mouse | 85.69 | 85.53 | 85.61 | 0.71 |
| CNN (DeepLncPro) | Human | 89.74 | 82.69 | 86.22 | 0.73 |
| CNN (DeepLncPro) | Mouse | 88.78 | 84.88 | 86.83 | 0.74 |

Figure 3. The ROC curves of DeepLncPro, RF, LR, KNN, SVM and XGBoost evaluated on the test dataset. (A) The ROC curves for identifying lncRNA promoters in human. (B) The ROC curves for identifying lncRNA promoters in mouse.
Comparison with the existing predictor
To further illustrate its superiority, we compared DeepLncPro with the existing predictor ncPro-ML [24]. For a fair comparison, both predictors were evaluated on the same test set. As shown in Table 3, the accuracy of DeepLncPro was 4.57% and 3.74% higher than that of ncPro-ML for identifying human and mouse lncRNA promoters, respectively. The corresponding sensitivity, specificity and Matthews correlation coefficient improved by 8.47%, 0.65% and 0.10 in human, and by 7.12%, 0.36% and 0.08 in mouse, respectively. These results demonstrate that DeepLncPro is superior to ncPro-ML for identifying human and mouse lncRNA promoters.
Table 3. Comparison of the prediction performance of DeepLncPro with ncPro-ML based on the test set
| Species | Predictor | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| Human | ncPro-ML | 81.27 | 82.04 | 81.65 | 0.63 |
| Human | DeepLncPro | 89.74 | 82.69 | 86.22 | 0.73 |
| Mouse | ncPro-ML | 81.66 | 84.52 | 83.09 | 0.66 |
| Mouse | DeepLncPro | 88.78 | 84.88 | 86.83 | 0.74 |

Figure 4. Distribution of positive and negative samples in the 2D feature space. The blue and orange dots represent positive and negative samples, respectively. (A) The feature space of the original input features. (B) The feature space based on the outputs of the first convolutional layer. (C) The feature space based on the outputs of the second convolutional layer. (D) The feature space based on the outputs of the fully connected layer.

Figure 5. The four representative motifs extracted by DeepLncPro in human lncRNAs. The motifs correspond to the binding sites of the transcription factors SP1 (P-value = 9.17e-06), HIF1A (P-value = 1.61e-06), ZNF384 (P-value = 3.83e-05) and SP3 (P-value = 6.89e-05). In each case, the top panel is the known motif in the JASPAR database, and the bottom panel is the motif extracted by DeepLncPro.
Model interpretation and visualization
To explain the performance of the proposed model, we extracted and visualized the inputs and outputs from all layers of DeepLncPro, namely the original inputs, the outputs of the first convolutional layer, the outputs of the second convolutional layer and the outputs of the fully connected layer. To facilitate understanding of these features, UMAP [49] was used to show the distribution of positive and negative samples. The positive and negative samples could not be separated in the feature space formed by the original input features (Figure 4A). However, they were more clearly separated in the feature space based on the output features of the first and second convolutional layers (Figure 4B and C), and more clearly still based on the output features of the fully connected layer (Figure 4D). These results demonstrate the ability of the proposed model to extract latent features, which help it to learn a better decision margin for identifying lncRNA promoters.
To demonstrate the ability of DeepLncPro to capture informative motifs, we calculated the PWM (see Motif extraction section for details) for each of the 24 filters of the first convolutional layer. TOMTOM was then used to map the motifs learned by each filter to known transcription factor (TF) binding motifs in the JASPAR database. We obtained 87 and 85 known motifs in JASPAR with P-value < 0.05 in human and mouse, respectively (Supplementary Tables S7 and S8 available online at http://bib.oxfordjournals.org/). The representative binding motifs of four TFs (SP1, HIF1A, ZNF384 and SP3) obtained from human and mouse are shown in Figure 5 and Supplementary Figure S2 available online at http://bib.oxfordjournals.org/. In each case, the top panel is the known motif in the JASPAR database, and the bottom panel is the motif extracted by DeepLncPro. The representative motifs in mouse were very similar to those in human. As indicated by the hTFtarget database [50], the transcription factors SP1, HIF1A, ZNF384 and SP3 are all involved in the regulation of lncRNA expression, which supports the biological significance of the motifs captured by DeepLncPro.
Conclusion
LncRNA plays important regulatory roles in various biological processes. Accurate identification of lncRNA promoters is helpful for understanding their regulatory mechanisms. To improve model performance and provide model explainability in promoter prediction, we proposed a deep learning based model, called DeepLncPro, to identify lncRNA promoters in human and mouse. A series of comparative experiments demonstrated that DeepLncPro is superior to state-of-the-art machine learning methods and existing models for identifying lncRNA promoters. The performance of DeepLncPro is attributed to the informative features extracted by the convolutional layers. By mapping these features to the JASPAR database, they were found to match known transcription factor binding motifs, which provides the interpretability of DeepLncPro. An open-source tool for DeepLncPro is provided at https://github.com/zhangtian-yang/DeepLncPro, which we hope will stimulate further studies on lncRNA promoter identification.
It should be pointed out that only sequence-derived information was used in DeepLncPro, which is not enough to capture all the information characterizing promoters. It has been reported that data from both ATAC-seq and ChIP-seq are also key signals in promoter regions [51, 52]. Therefore, in future work, we need to collect and integrate these data for identifying lncRNA promoters.
Authors’ contributions
W.C. and T.Y.Z. conceived and designed the work. T.Y.Z., Q.T. and F.L.N. performed the data collection and analysis, and visualized the results. T.Y.Z. and Q.T. collected the data. T.Y.Z., Q.Z. and W.C. wrote the manuscript. All authors read and approved the final manuscript.
Data availability
The data and code that support the findings of this study are available at https://github.com/zhangtian-yang/DeepLncPro.
Key Points
A convolutional neural network based model, called DeepLncPro, is proposed to identify human and mouse lncRNA promoters.
Comparative studies demonstrated that DeepLncPro outperforms existing models for identification of lncRNA promoters.
DeepLncPro is capable of capturing transcription factor binding sites, which facilitates its biological interpretation.
An open-source tool for DeepLncPro is provided at https://github.com/zhangtian-yang/DeepLncPro.
Funding
Natural Science Foundation of Sichuan (No. 2022NSFSC1770), National Natural Science Foundation of China (No. 31771471), Natural Science Foundation of Education Department of Liaoning Province (No. LJKZ0280).
Tian-Yang Zhang is a graduate student at the School of Life Sciences, North China University of Science and Technology. His research interests focus on bioinformatics.
Qiang Tang is a Ph.D. candidate at School of Basic Medical Sciences, Chengdu University of Traditional Chinese Medicine. His research interests include bioinformatics and machine learning.
Fulei Nie is a Ph.D. candidate at School of Life Sciences, North China University of Science and Technology. Her research interests include bioinformatics and machine learning.
Qi Zhao is a professor at School of Computer Science and Software Engineering, University of Science and Technology Liaoning. His research interests include bioinformatics, complex network and machine learning.
Wei Chen is a professor at Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine. His research interests include bioinformatics and machine learning.