Orchestrating information across tissues via a novel multitask GAT framework to improve quantitative gene regulation relation modeling for survival analysis

Duan, Meiyu; Wang, Yueying; Zhao, Dong; Liu, Hongmei; Zhang, Gongyou; Li, Kewei; Zhang, Haotian; Huang, Lan; Zhang, Ruochi; Zhou, Fengfeng

doi:10.1093/bib/bbad238

Abstract

Survival analysis is critical to cancer prognosis estimation. High-throughput technologies facilitate the increase in the dimension of genic features, but the number of clinical samples in cohorts is relatively small due to various reasons, including difficulties in participant recruitment and high data-generation costs. Transcriptome is one of the most abundantly available OMIC (referring to the high-throughput data, including genomic, transcriptomic, proteomic and epigenomic) data types. This study introduced a multitask graph attention network (GAT) framework DQSurv for the survival analysis task. We first used a large dataset of healthy tissue samples to pretrain the GAT-based HealthModel for the quantitative measurement of the gene regulatory relations. The multitask survival analysis framework DQSurv used the idea of transfer learning to initiate the GAT model with the pretrained HealthModel and further fine-tuned this model using two tasks i.e. the main task of survival analysis and the auxiliary task of gene expression prediction. This refined GAT was denoted as DiseaseModel. We fused the original transcriptomic features with the difference vector between the latent features encoded by the HealthModel and DiseaseModel for the final task of survival analysis. The proposed DQSurv model stably outperformed the existing models for the survival analysis of 10 benchmark cancer types and an independent dataset. The ablation study also supported the necessity of the main modules. We released the codes and the pretrained HealthModel to facilitate the feature encodings and survival analysis of transcriptome-based future studies, especially on small datasets. The model and the code are available at http://www.healthinformaticslab.org/supp/.

bioinformatics, graph attention network, multitask learning, transfer learning, survival analysis

INTRODUCTION

Prognosis and life quality are major concerns of cancer patients but remain challenging for cancer researchers [1, 2]. Survival analysis is a routine approach to assess an event of interest by a specific time and has been improved to handle imperfect datasets with missing death dates of some censored patients [3]. Cox proportional hazard model is one of the most widely used survival analysis approaches and can handle multiple risk factors and censored samples [4, 5].

The rapid advancement of high-throughput sequencing technologies has generated an extremely high dimension of genic features, which poses a computational challenge for survival analysis. Several survival studies have chosen feature selection algorithms to reduce the feature dimensions, including L₂ penalty in ridge regression [6, 7], L₁ Lasso regularization [8, 9] and tree-based ensemble methods [10]. XGBLC [11] was an improved survival prediction model combining L₁ regularization and ensemble learner extreme gradient boosting (XGBoost). This method was designed to avoid overfitting and to get more stable performance. However, these methods do not consider the complex intergenic correlations sufficiently and have limited ability to explore non-linear relationships.

Deep learning technologies are good at automatic extraction of complex patterns and have been used to extend the conventional Cox regression approach for the high-dimensional transcriptome-based survival analysis e.g. Cox-nnet [12]. These methods combined artificial neural networks with Cox regression models to extract the intergenic non-linear patterns within high-throughput OMIC data. Transfer learning has been widely used to avoid overfitting through the utilization of additional information from a source task. Hanczar et al. showed that transfer learning could significantly improve the cancer prediction task on a small training set of transcriptomic profiles [13]. The pretrained variational autoencoder Cox model (VAECox) [14] was another representative work, transferring genomic information from other cancer types to the survival analysis task of a specific cancer type.

A living system is highly complicated, and biological networks offer valuable information for understanding system-level patterns between genes [15–17]. Hu et al. conducted a comprehensive survey of how TFs were involved in gene expression, gene regulation and interaction, and suggested that TFs were also significantly related with cancer survival [18]. Some studies have attempted to integrate the intergene cooperative and synergistic effects in survival analysis through biological networks such as protein–protein interaction network, gene-coexpression network and gene regulatory network [19]. Zhang et al. applied the gene network information in the Cox regression approach, and the proposed Net-Cox algorithm outperformed the conventional Cox models on the prognosis prediction of ovarian cancer treatment [5]. Li et al. defined a signaling entropy calculation method to infer personalized pathway activations and quantified the dysregulations of essential pathways for survival analysis [20]. Ghosh Roy et al. used the prior signaling pathway knowledge and gene mutation information to simulate signal flow disruptions in a specific cancer type and proposed a novel Mutated Pathway Visible Neural Network architecture for the interpretable cancer survival risk prediction [21]. However, they focused on a few representative pathways, and the overall view of the gene regulatory network remained to be fully characterized.

Graph neural network is another deep learning framework to model the coexpression gene modules for the OMIC-based disease diagnosis and prognosis [22]. Graph neural networks have demonstrated powerful capabilities in extracting OMIC-based information for the classification tasks of cancer diagnosis and prognosis [23, 24]. The graph convolutional network with the embedded graph learning module (glmGCN) was designed to formulate the complex correlations of lncRNAs and mRNAs, and accurately predicted the distant metastasis of cancers [23]. A novel interpretable multi-level attention graph neural network (MLA-GNN) was proposed to formulate the intergenic coexpression patterns for disease classification and survival prediction and achieved the best performance on two cancer types and coronavirus disease 2019[22].

Research studies have shown that the gene activity patterns in normal tissues carry important information and may serve as ideal references for understanding the developmental trajectories of complex diseases such as cancer [25]. Normal control tissues are usually extracted from the adjacent regions of cancer lesions, and their close involvement in the tumor microenvironments may not accurately reflect the genic activities in the actual healthy tissues [26]. However, it is often difficult to provide the OMIC profiles of the actual healthy tissues from the same cohort of participants in clinical practice [27]. Uhlen et al. explored the associations between the gene expression levels in human solid tumors and the corresponding normal tissues and revealed that the normal tissues were more similar to the other normal samples than to the cancer samples from the same cohort of participants [28]. The complex synergetic collaboration of various regulatory factors remains to be modeled by advanced algorithms such as deep learning.

In this study, we proposed a graph attention network (GAT) framework to model the synergetic interactions of regulatory factors (HealthModel) in the gene regulatory network of healthy samples and transferred the pretrained HealthModel to the multitask survival prediction framework (DQSurv) for cancer survival analysis. The experimental procedure of this study is shown in Figure 1. Both transcription factors (TFs) and long intergenic noncoding RNAs (lincRNAs) positively contributed to the gene expression prediction based on the HealthModel-encoded latent features. We obtained a GAT-based DiseaseModel after fine-tuning the HealthModel on training cancer samples and defined the difference between the HealthModel- and DiseaseModel-encoded latent features as the dysregulation quantification of a given cancer sample. The multitask framework optimized the main task of survival analysis and the auxiliary task of gene expression prediction based on the latent features encoded by the two GAT models. The experimental data showed that DQSurv outperformed both machine learning- and deep learning-based survival analysis models on 10 popular benchmark cancer types and an independent dataset.

Figure 1

The workflow of the study. (A) The pretrained GAT architecture HealthModel and (B) the multitask survival prediction framework DQSurv, including the survival prediction task and the gene expression prediction task. The gray boxes with the letters A/B/C represent the three major modules to be described in detail in the ‘Overview of the study’ section.

Open in new tab Download slide

The main contributions of this study include the following: (i) a multitask GAT framework was proposed to simultaneously integrate the intergenic information of both healthy and cancerous samples; (ii) the HealthModel was trained using the healthy samples and served as the reference gene regulatory network; (iii) the difference between the GAT-modeled gene regulatory networks of a cancer sample and the HealthModel could represent the dysregulatory distance of this cancer sample. The pretrained model and the source codes were released to facilitate similar studies by other researchers.

This study started with this Introduction section. The next section, ‘Materials and methods’, described the details of the datasets, the proposed framework and experimental settings. Then, all the experimental results are given and discussed in the ‘Results and discussion’ section. The concluding remarks and the limitations of this study are summarized in the ‘Conclusion’ section. The pretrained models, the code and the supplementary data are given in the sections ‘Data availability’ and ‘Supplementary data’, respectively. The other information required by the journal is given in the respective sections.

MATERIALS AND METHODS

Dataset and preprocessing

The RNA-sequencing (RNA-seq) gene expression data were downloaded from the UCSC Xena platform [29], which reanalyzed the RNA-seq raw data from The Cancer Genome Atlas (TCGA) [30] and the Genotype-Tissue Expression (GTEx) programs [31]. The reanalysis pipeline realigned the RNA-seq reads to the human genome version hg38 and recalculated the gene expression levels using RSEM [32] and Kallisto [33]. This reanalysis aimed to eliminate the batch effects of different computational processing procedures to the maximal extent and to compare the normal and cancer samples directly. Table 1 summarizes the datasets used in this study.

Healthy samples were retrieved from the GTEx dataset [31] in the UCSC Xena platform. A tissue with fewer than 10 samples was removed. A total of 7373 healthy samples from 47 tissues were collected (Supplementary Table S1). Features with >15% missing values in the healthy samples were removed, as similar in another pan-cancer analysis study [14]. A K-nearest neighbor imputation was used to fill in the remaining missing values. Z-Normalization was also done for each feature.

Cancer samples were retrieved from TCGA [30] dataset in the UCSC Xena platform. This study compared the proposed framework with two studies i.e. Cox-nnet [12] and VAECox [14]. Thus, we used the same 10 cancer types as in these two studies (Table 1). The cancer samples were labeled as ‘Censored’ and ‘Dead’ based on the annotation ‘OS’, and these numbers are summarized in Table 1. The same preprocessing strategy was used for each cancer type.

There were 16 074 transcriptomic features overlapping between the 11 datasets. The expression levels of many genes did not have large variations. Such genes would not have positive discriminative power and would significantly increase the computational requirement of training the artificial intelligence models. Therefore, 8000 features with the largest variances were kept for further analysis.

Table 1

Open in new tab

Summary of the datasets used in this study

ID	Dataset	# Censored	# Dead	# All
1	Training	–	–	7373
2	BLCA	237	188	425
3	BRCA	1012	198	1210
4	HNSC	310	253	563
5	KIRC	403	200	603
6	LGG	391	130	521
7	LIHC	258	162	420
8	LUAD	356	209	565
9	LUSC	300	242	542
10	OV	264	160	424
11	STAD	276	167	443

ID	Dataset	# Censored	# Dead	# All
1	Training	–	–	7373
2	BLCA	237	188	425
3	BRCA	1012	198	1210
4	HNSC	310	253	563
5	KIRC	403	200	603
6	LGG	391	130	521
7	LIHC	258	162	420
8	LUAD	356	209	565
9	LUSC	300	242	542
10	OV	264	160	424
11	STAD	276	167	443

The first column lists the ‘Training’ dataset and the datasets of 10 cancer types. The last column ‘# All’ is the total number of samples in each dataset.

Table 1

Open in new tab

Summary of the datasets used in this study

ID	Dataset	# Censored	# Dead	# All
1	Training	–	–	7373
2	BLCA	237	188	425
3	BRCA	1012	198	1210
4	HNSC	310	253	563
5	KIRC	403	200	603
6	LGG	391	130	521
7	LIHC	258	162	420
8	LUAD	356	209	565
9	LUSC	300	242	542
10	OV	264	160	424
11	STAD	276	167	443

ID	Dataset	# Censored	# Dead	# All
1	Training	–	–	7373
2	BLCA	237	188	425
3	BRCA	1012	198	1210
4	HNSC	310	253	563
5	KIRC	403	200	603
6	LGG	391	130	521
7	LIHC	258	162	420
8	LUAD	356	209	565
9	LUSC	300	242	542
10	OV	264	160	424
11	STAD	276	167	443

The first column lists the ‘Training’ dataset and the datasets of 10 cancer types. The last column ‘# All’ is the total number of samples in each dataset.

Regulatory factors

We predicted the expression levels of messenger RNAs (mRNAs) by integrating two types of regulatory factors, TFs and lincRNAs. TFs have widely been used as the regulatory factors in constructing gene regulatory networks [34, 35].

Recent studies have suggested that long noncoding RNAs (lncRNAs) are also actively involved in gene regulations and are increasingly being recognized as important regulatory factors of disease pathogenesis [36]. Zhang et al. showed that the antisense lncRNA FOXC2-AS1 upregulated FOXC2 and induced doxorubicin resistance in osteosarcoma [37]. Li et al. characterized the lncRNAs’ regulations of the immune system across multiple cancer types, showing that they could serve as potential oncogenic biomarkers [38]. lncRNAs in the intergenic regions (lincRNAs) have shown their regulatory contribution to the in vivo stress responses of de-differentiation and cell cycle [39]. Considering the difficulty in separating the RNA-seq reads of the overlapping mRNAs and lncRNAs, this study focused on the lincRNAs.

We separated the transcriptomic features into three groups, namely, TF, lincRNA and mRNA. The TF features were defined by the annotations of the Human Protein Atlas [40, 41]. The GTEx dataset provided the annotations of lincRNAs. We curated the genes in the third-level pathways in the KEGG database [42]. An mRNA feature was relocated to the TF class if it has a regulatory role for a target gene in the TRRUST database [43]. Supplementary Table S2 summarizes the KEGG pathways with >50 TFs or >100 mRNA features, which were investigated in this study. The KEGG database provides very limited information about lincRNAs. Therefore, we used all the lincRNA features as the regulatory factors of these pathways.

The final dataset had 425 TFs, 375 lincRNAs and 3062 mRNAs for the 88 KEGG pathways for further analysis.

Construction of the regulatory factor network

We formulated the complex correlations among the TFs and lincRNAs as a graph with the input features as the vectors V and an edge matrix E. A sample has K regulatory features, represented by V^K × 1. There were 425 TFs and 375 lincRNAs, and the dimension of a sample was K = 425 + 375 = 800. WGCNA [44] was used to calculate the edge matrix E. The pairwise weight between two nodes v_i and v_j was calculated as the Pearson correlation coefficient. A gene regulatory network has some hub genes with essential roles and demonstrates the scale-free topology [45]. We used the topological overlap measure (TOM) to calculate the correlation between two features or nodes v_i and v_j as follows:

$$\begin{equation} {a}_{ij}= power\left({S}_{ij},\beta \right) \end{equation}$$

(1)

$$\begin{equation} {TOM}_{ij}=\frac{\sum_{u\ne i,j}{a}_{iu}{a}_{uj}+{a}_{ij}}{\min \left({k}_i,{k}_j\right)+1-{a}_{ij}} \end{equation}$$

(2)

where the soft threshold β is calculated using the ‘pickSoftThreshold’ function in the WGCNA package; a_ij is the power exponential function; u is a gene, excluding i and j; and |$k=\sum_u{a}_{iu}$| is the node connectivity. WGCNA clustered the expression profiles of the transcriptomic features into multiple modules based on the commonly used dissimilarity measure (1 − TOM ≥ 0.15) as in a previous study [46].

The constructed network G(V^K × 1, E^K × K) was fed into the GAT layer, which is an advanced graph convolutional layer [47]. The GAT layer encodes a node as the weighted combination of its neighboring nodes and the node itself. The attention coefficient between the two nodes v_i and v_j was calculated as follows:

$$\begin{equation} {a}_{ij}=\frac{\exp \left( LeakyReLU\left({\overrightarrow{a}}^T\left[W{\overrightarrow{v}}_i\Big\Vert W{\overrightarrow{v}}_j\right]\right)\right)}{\sum_{k\in{N}^i}\exp \Big( LeakyReLU\left({\overrightarrow{a}}^T\left[W{v}_i\Big\Vert W{\overrightarrow{v}}_j\right]\right)} \end{equation}$$

(3)

where W is the trainable weight parameter of a fully connected layer, || denotes the concatenation operation and LeakyReLU is the activation function used in this network.

Overview of the study

Figure 1A shows that the GAT-based network of regulatory factors was pretrained on the healthy samples from the GTEx dataset. This was denoted as HealthModel, which had 800 regulatory features as its input vector and predicted the expression levels of 3062 mRNA features. The mean square error (MSE) was used as the loss function for the pretraining process. The HealthModel represented the quantitative relationships between the two types of regulatory factors (TFs and lincRNAs) to mRNAs.

The multitask framework DQSurv is shown in Figure 1B. DQSurv included a feature extraction module for the regulatory factors (A), a feature extraction module for the distance quantitative description (B) and an auxiliary gene expression prediction module (C). The weights of the HealthModel were used to initialize the DQSurv’s GAT layer. Module B learned the regulatory representation difference between the control health (HealthModel) and the cancer status (DiseaseModel). The auxiliary task (module C) optimized the mRNAs’ expression predictions through the GAT-encoded latent features. The main task of DQSurv was to measure the dysregulation quantification between the HealthModel and DiseaseModel for the survival analysis task.

Details of the HealthModel

The HealthModel used 800 regulatory features and predicted the expression levels of 3062 mRNA features based on the node-encoded latent features. Let the graph network of the HealthModel be G(V^K × 1, E^K × K). The GAT network had K = 800 regulatory features as the nodes and each node had an expression level in the query sample S. The complex associations among these nodes were diffused across the network. Each node had eight hidden variables. Therefore, the sample S was encoded by 6400 latent features (8 hidden variables × 800 nodes). A fully connected layer was used for the regression model of the 3062 mRNA features’ expression levels, instead of the pooling layer in the traditional graph classification task [48]. The dropout rate of 0.3 was used for both the GAT and the fully connected layers. The Adam optimizer was used to optimize the GAT network, and the learning rate was set at 1e−5. The MSE was used as the loss function for the regression task. |$\hat{y}$| denotes the predicted expression levels of the 3062 mRNA features.

Multitask survival prediction model DQSurv

The DQSurv model used the node-encoding features of the GAT network to optimize the two tasks. The pretrained HealthModel extracted the regulatory representations across healthy tissues and was fine-tuned by the two tasks of DQSurv on training cancer samples (denoted as DiseaseModel). The difference between the latent features encoded by the models HealthModel and DiseaseModel was used to represent the overall regulatory change of a given sample S relative to the reference network:

$$\begin{equation} RegChange(S)= HealthModel\ (S)- DiseaseModel\ (S) \end{equation}$$

(4)

where HealthModel(S) and DiseaseModel(S) are the latent features of S encoded by the HealthModel and DiseaseModel, respectively.

The main task of survival prediction integrates RegChange(S) and S as its input and outputs the log hazard ratio for cancer patients. We balanced the two tasks using the Loss-Balanced Task Weighting strategy [49], which balances the influence of the task-specific weights by considering the loss ratio between the current loss and the initial loss. We evaluated the hyperparameter loss ratio α at 0.1 and 0.5, as described in Liu et al. [49]. Supplementary Figure S1 shows that a better performance was achieved by setting the loss ratio at 0.5, which was used as the default value of this hyperparameter in the following experiments.

Loss functions

HealthModel is a GAT-based regression model to predict the expression levels of target genes. The loss function of the pretrained HealthModel used the mean square error loss (MSELoss) [50].

The loss function of the DQSurv model consisted of three parts i.e. the objective function of the Cox-PH model, the Mean Square Error loss function of the auxiliary regression task and the regularization term.

The first component of the loss function was based on the Cox-PH model, which predicts a patient’s hazard ratio [51]. The model was defined as follows:

$$\begin{equation} h\left(t|{x}_i\right)={h}_0(t)\exp{\theta}_i \end{equation}$$

(5)

$$\begin{equation} {\theta}_i={x}_i^T\beta \end{equation}$$

(6)

where X is the gene expression profile of n patients and p genes, |${x}_i=\left({x}_{i1},\dots, {x}_{ip}\right)$|⁠; θ is the log hazard ratio for patient i; and β is the trainable parameter of the model. The objective function of the Cox-PH model was a negative partial log-likelihood defined as follows:

$$\begin{equation} \hat{\beta}=\underset{\beta }{\mathrm{argmin}}-{\sum}_{C(i)=1}\left({\theta}_i-\log{\sum}_{t_j\ge{t}_i}\exp \left({\theta}_j\right)\right) \end{equation}$$

(7)

where t is the observed or censored survival time for the |${i}^{\mathrm{th}}$| patient. The condition |${t}_j\ge{t}_i$| is used to select the patients whose survival times are longer than that of the |${i}^{\mathrm{th}}$| patient. C(i) is the indicator of whether the death event of the |${i}^{\mathrm{th}}$| patient is observed. This study defined the log hazard ratio as follows:

$$\begin{equation} {\theta}_i={\left(W{x}_i+b\right)}^{\mathrm{T}}\beta \end{equation}$$

(8)

where W is the weight coefficient matrix between the input layer and the hidden layer, and b is the bias term of each hidden layer node.

The second component of the overall loss function was the MSE loss [Equation (9)] for the auxiliary task. The third component was the regularization term. Therefore, the overall loss was defined as shown in Equation (10).

$$\begin{equation} {L}_{\mathrm{MSE}}=\frac{1}{N}\sum_i{\left(W{x}_i-{y}_i\right)}^2 \end{equation}$$

(9)

$$\begin{equation} {L}_{\mathrm{total}}=\hat{\beta}+{L}_{\mathrm{MSE}}+\lambda \left({\left|\left|\beta \right|\right|}_2+{\left\Vert W\right\Vert}_2\right) \end{equation}$$

(10)

Experimental settings

The GAT-based HealthModel was trained on 7373 healthy tissue samples from the GTEx dataset. This dataset was randomly divided into the training, validating and testing subsets at the ratio of 8:1:1. The hyperparameters were evaluated using the Bayesian optimization parameter adjustment of the PyTorch ax package on the training set. The details of the computational settings and the value ranges of parameters are listed in the Supplementary Table S3. Dropout and early stopping strategies were used to avoid overfitting. The training of HealthModel takes around 30 min and 38 GB GPU memory.

To conduct a fair comparison of the survival analysis problem, we evaluated our framework on the same 10 cancer types as the two previous studies, Cox-nnet [12] and VAECox [14]. The 10 datasets were also randomly divided into the training, validating and testing subsets at the ratio of 8:1:1. The final performance results were obtained on the testing sets. Ten random runs of the entire process on each cancer type were conducted to avoid sample selection bias.

All experiments were implemented using the PyTorch version 1.12.0 framework.

Performance evaluation metrics

The survival prediction model was evaluated using the commonly used metric, the concordance index (C-index) [52]. A larger C-index value suggests a better model. C-Index quantifies the model’s ability to correctly provide a reliable ranking of the survival time based on the individual risk scores:

$$\begin{equation} C- index=\frac{1}{\left|k\right|}{\sum}_{i=1}^m{\sum}_{j:{t}_i<{t}_j}I\left(P\left({x}_i\right)<P\left({x}_j\right)\right), \end{equation}$$

(11)

where k denotes the set of validly orderable pairs when |${t}_i<{t}_j$|⁠, and |k| represents the number of comparable pairs among them. I is the indicator function of whether the condition in parentheses is satisfied or not. P(x) is the predicted score from the model.

The overall performance on the 10 datasets was measured by the micro-average of the 10 C-index values, which takes into account the different numbers of samples in the 10 datasets [14]. The performance of the regression task across the 10 datasets can also be evaluated by the micro-average of the MSE values as follows:

$$\begin{equation} microC=\frac{\sum_{i=1}^{10}\left({n}_i\times{c}_i\right)}{\sum_{i=1}^{10}{n}_i} \end{equation}$$

(12)

$$\begin{equation} microMSE=\frac{\sum_{i=1}^{10}\left({n}_i\times{m}_i\right)}{\sum_{i=1}^{10}{n}_i} \end{equation}$$

(13)

where |${n}_i$| is the number of samples, |${c}_i$| is the C-index of the i^th cancer type and |${m}_i$| is the MSE value of the i^th cancer type.

RESULTS AND DISCUSSION

Contributions of lincRNAs and TFs to gene expression prediction

Many lincRNAs have shown their functional involvement in the gene regulatory networks [36–38], but they are rarely used in the investigations of the gene regulatory networks or gene expression predictions. We evaluated the lincRNA’s contributions to the gene expression prediction task (Figure 2). Through the comparative experiment, we measured how much improvement the regression model for the gene expression prediction task could obtain by including lincRNAs in the list of regulatory factors.

Figure 2

Evaluation of lincRNAs as the regulatory factors for the GAT-based gene expression prediction task. Two settings of regulatory factors were evaluated, namely, TF and TF + lincRNA. (A) The microMSE values (vertical axis) of the GAT models using TF or TF + lincRNA as the regulatory factors. Five values (0.150, 0.175, 0.200, 0.225 and 0.250) were evaluated as the TOM thresholds (horizontal axis). (B) The numbers of edges (Edges) and the average degrees of the constructed GAT graph using different TOM thresholds. The nodes with no edges were excluded from the calculation of the average degrees.

Open in new tab Download slide

The expression prediction task of the mRNA features achieved the microMSE values between 0.200 and 0.350 using only TF features as the regulatory factors (Figure 2A). If we also included lincRNAs as the regulatory factors, the prediction models under all five TOM threshold values improved. The largest improvement (48.1%) was achieved at the TOM threshold of 0.225, and the overall smallest microMSE (0.171) was achieved at the same TOM threshold value.

Figure 2B shows a consistent trend of more edges under a smaller TOM threshold value. The GAT network consisted of 752 edges and 5.612 average degrees when it achieved the smallest microMSE (0.171). This setting was used in the following sections.

Evaluation of DQSurv for gene expression prediction

Gene expression prediction is crucial for disease-related gene regulation detection, and this part compared the gene expression prediction performance of the proposed multitask framework DQSurv with those of five regression methods, including linear regression (LR) [53], Lasso [54, 55], Multi-Layer Perceptron (MLP) [56], Neighbor Connection Neural Network (NCNN) [57] and a new long short-term memory (LSTM)-based gene expression prediction (L-GEPM) [58]. LR and Lasso are classic methods for regression. The Library of Integrated Network-based Cell-Signature program selects 1000 landmark genes to predict the remaining gene expression value using simple LR method [59]. Lasso achieves feature selection simultaneously with regression. Chen et al. used MLP to capture the non-linear regulation between the genes, which outperformed the LR model in their study [56]. Wang et al. [58] proposed a gene expression prediction model L-GEPM using the latent features learned by the LSTM neural network.

Graph neural network can be used to extract the complex hidden patterns within the gene regulatory network [60]. Li et al. proposed a novel NCNN framework to improve the gene expression prediction performance using the knowledge of adjacent genes in the gene interaction network [57]. The proposed DQSurv model was compared with this latest method NCNN. The ‘Simplified DQSurv’ version with only the GAT backbone was also evaluated to show whether the multitask framework (DQSurv) positively contributed to the gene expression prediction task.

The multitask framework DQSurv achieved the smallest microMSE (0.429) among all the evaluated methods (Figure 3). DQSurv also achieved the best MSE values on 5 of the 10 cancer datasets, and achieved the second-best MSE values on 3 datasets. Simplified DQSurv and MLP were ranked the best on two different datasets. NCNN achieved better overall performance in microMSE than L-GEPM, and even achieved the best MSE of 0.490 on the LUSC dataset. The overall regression performance of Simplified DQSurv was 0.434 in microMSE, with 0.020 smaller than that of NCNN. This might be due to that NCNN had more parameters than DQSurv and did not perform well on the small datasets in this study. The regression methods Lasso and LR performed the worst for the gene expression prediction task across the 10 datasets. The experimental data suggest that the latent features encoded by the GAT nodes of DQSurv achieve satisfying regression performance for the gene expression prediction task, and the multitask setting may further improve the regression performance.

Figure 3

Performance comparison of multiple regression methods for the gene expression prediction task. Each method was randomly run 10 times. The legend shows each method’s name with the average MSE value. The last group is the microMSE across the 10 cancer datasets.

Open in new tab Download slide

Figure 4

Evaluation of different module combinations. The horizontal axis lists the datasets, and the last data series ‘microC’ is the micro-average of the C-index values over the 10 datasets. The vertical axis gives the values of C-index and microC. Each dataset was randomly run 10 times. The terms R and T in the brackets represent the GAT network initialized by randomization or transferred weights from the pretrained HealthModel, respectively. Our final survival prediction model is ‘DQSurv (T)’, which is highlighted in bold.

Open in new tab Download slide

Ablation study for the DQSurv model

As the Figure 4 shown, the final survival prediction model DQSurv was represented by ‘DQSurv (T)’ and achieved the overall best microC value (0.661). The individual modules ‘A’ and ‘B (R)’ did not rely on the transferred weights from the pretrained HealthModel, and they only achieved the microC values of 0.631 and 0.635, respectively. We further evaluated the contribution of the auxiliary task (module C) to the multitask framework. The B + C module improved the B-only network by 0.005 and 0.016 in microC under random initialization and transferred weights, respectively. This confirms the positive contribution of the auxiliary task to the survival prediction model DQSurv.

The final network used all three A/B/C modules and achieved the survival prediction performance of 0.653 (with random initialization) and 0.661 (with transfer learning) in microC. The data support the positive contributions of the three modules and the transfer learning strategy. The best module ‘DQSurv (T)’ was abbreviated as the final DQSurv model for further comparison.

Comparison with the other survival prediction studies

Our DQSurv model was compared with seven studies for the survival prediction task (Figure 5). RFCox and GBCox are the survival prediction models based on the random survival forest [10] and gradient boosted models [61]. LassoCox is the classic statistical Cox regression model with L1 regularization [62]. Cox-nnet extends the Cox regression model with an artificial neural network [12]. VAECox uses a variational autoencoder network on the RNA-seq data with the transferred weights from the pretrained model on the other cancer datasets [14]. VAECox also verifies that the survival prediction task can be improved through the transfer learning strategy. Xing et al. constructs a gene-coexpression network to explore intergene relational information and proposes a MLA-GNN for disease diagnosis and prognosis [22]. Another graph-based model glmGCN embeds the graph learning module in the graph neural network [23].

Figure 5

Performance comparison for the survival prediction task. The last data series ‘microC’ is the micro-average of the C-index values over the 10 datasets. Each method was randomly run 10 times. The VAECox model does not release the C-index value for each dataset, and its performance is only illustrated in the last group ‘microC.’

Open in new tab Download slide

LassoCox outperformed the two ensemble methods, RFCox and GBCox, by 0.027 and 0.016 in microC, respectively, and the three neural network-based models achieved better survival prediction performances than the three machine learning-based Cox models i.e. LassoCox, RFCox and GBCox (Figure 5). The glmGCN model got better performance (0.607 in microC) than RFCox (0.604 in microC). MLA-GNN achieved a microC value of 0.639 and achieved the best C-index values on the LUSC and STAD datasets. MLA-GNN performed worse than Cox-nnet and VAECox in the overall performance metric microC. This might be due to that MLA-GNN only selected the top-ranked differentially expression genes to construct the network. Both VAECox and our DQSurv models used the transfer learning strategy and outperformed the Cox-nnet model by 0.001 and 0.013 in the metric microC, respectively. This suggests the positive contributions of the pretrained models on the training datasets. Both Cox-nnet and LassoCox outperformed our DQSurv model on the BRCA dataset. This might be because BRCA was the only dataset with >1000 samples, and transfer learning usually provides helpful pretrained information for small datasets [63, 64].

In summary, our DQSurv model achieved the best survival prediction performances on 5 of the 10 cancer datasets and the best overall microC value of 0.661.

Evaluation of different distance metrics

We evaluated how the survival model was affected by different quantitation ways for the dysregulation distance of the embedding features between healthy and cancer samples. The subtraction of the encoded features in the DQSurv model between healthy and normal samples was calculated as the representation of regulation changes (denoted as RegChange). We evaluated cosine distance (Cos) and Euclidean distance (EU) in replacing the subtraction method in the DQSurv model. Figure 6 shows the regression performance of different distance metrics across the datasets. The DQSurv model using Euclidean distance (DQSurv-Euc) achieved the overall performance microC of 0.648, better than that (0.633) of the DQSurv model using cosine distance (DQSurv-Cos). The DQSurv-Euc model outperformed the default DQSurv model on three datasets (BLCA, OV and STAD), while the DQSurv-Cos model achieved better performance than DQSurv on another three datasets (KIRC, LGG and LUAD). However, DQSurv achieved the best C-index values on four datasets and the best overall microC value of 0.661.

Figure 6

Evaluation of different distance metrics for the survival prediction task. The last data series ‘microC’ is the micro-average of the C

index values over the 10 datasets. Each method was randomly run for 10 times. The DQSurv-Euc model used Euclidean distance as the distance metric, and DQSurv-Cos used cosine distance. The default DQSurv model used the subtraction approach.

Open in new tab Download slide

The subtraction method of DQSurv delivered a feature vector, which could be fed to the downstream prediction tasks, while Euclidean distance and cosine distance generated one distance value between two input feature vectors. Considering both the easy-to-embed subtraction feature vector and the best regression performance, we kept using the subtraction approach in the DQSurv model.

Independent evaluation of the DQSurv framework

We further evaluated our DQSurv framework on an independent dataset POG570. This cohort is a comprehensive OMIC dataset from Canada’s Michael Smith Genome Sciences Center [65]. The cohort consists of advanced and metastatic tumor patients treated in a tertiary cancer clinic, with 415 deceased and 155 living patients. The TPM-based expression levels of the RNA-seq data are available at http://bcgsc.ca/downloads/POG570/. The POG570 dataset includes 58 051 features, among which 21 253 carry missing values in fewer than 15% samples, as similar in another pan-cancer analysis study [14], and were kept for further analysis. Then, we extracted only the features corresponding to the 800 regulators and 3062 target genes, as annotated in the ‘Regulatory factors’ section. Three (ENSG00000261685, ENSG00000229729, ENSG00000250588) of the 3862 genes are absent from the POG570 dataset. Two additional genes (ENSG00000183054, ENSG00000261584) have the expression values zeros in all the samples. We padded these five genes using 0.0. The remaining data were processed by log-transformation, KNN imputation and Z-normalization. The obtained feature matrix and edge adjacency matrix were fed into the model.

Both versions of our DQSurv model outperformed the other models on the POG570 dataset (Figure 7). The detailed data of Figure 7 are given in the Supplementary Table S4. The DQSurv model with the randomized initialization (‘DQSurv (R)’) outperformed the next best model Cox-nnet by 0.001 in the metric C-index. If the pretrained HealthModel was used to initialize the DQSurv network, the ‘DQSurv (T)’ model was further improved to achieve a C-index of 0.712. The POG570 dataset has > 30 cancer types. The best survival prediction performances by both versions of DQSurv suggest the importance of capturing the cancer regulatory changes against the healthy references. We also evaluated the gene expression prediction task of DQSurv with the five regression methods, as in Table 2. DQSurv achieved the best performance with the averaged MSE of 0.439, and the Simplified DQSurv was ranked the second best with the averaged MSE 0.466.

Figure 7

Evaluation of the survival prediction models on the independent dataset POG570. The horizontal axis gives the model names. The denotations ‘DQSurv (T)’ and ‘DQSurv (R)’ are the DQSurv models with the initial weights from the pretrained HealthModel and randomization, respectively. The vertical axis is the C-index value.

Open in new tab Download slide

Table 2

Open in new tab

Performance comparison of multiple regression methods for the gene expression prediction task on the POG570 dataset

Methods	Avg ± Std
Lasso	1.013 ± 0.089
LR	0.786 ± 0.065
MLP	0.472 ± 0.056
NCNN	0.526 ± 0.042
L-GEPM	0.506 ± 0.041
Simplified DQSurv	0.466 ± 0.036
DQSurv	0.439 ± 0.039

Methods	Avg ± Std
Lasso	1.013 ± 0.089
LR	0.786 ± 0.065
MLP	0.472 ± 0.056
NCNN	0.526 ± 0.042
L-GEPM	0.506 ± 0.041
Simplified DQSurv	0.466 ± 0.036
DQSurv	0.439 ± 0.039

The metric MSE was used to evaluate the models, and each method was randomly run 10 times. Avg and Std refer to the average and standard deviation values of the 10 random runs of each method.

Table 2

Open in new tab

Performance comparison of multiple regression methods for the gene expression prediction task on the POG570 dataset

Methods	Avg ± Std
Lasso	1.013 ± 0.089
LR	0.786 ± 0.065
MLP	0.472 ± 0.056
NCNN	0.526 ± 0.042
L-GEPM	0.506 ± 0.041
Simplified DQSurv	0.466 ± 0.036
DQSurv	0.439 ± 0.039

Methods	Avg ± Std
Lasso	1.013 ± 0.089
LR	0.786 ± 0.065
MLP	0.472 ± 0.056
NCNN	0.526 ± 0.042
L-GEPM	0.506 ± 0.041
Simplified DQSurv	0.466 ± 0.036
DQSurv	0.439 ± 0.039

The metric MSE was used to evaluate the models, and each method was randomly run 10 times. Avg and Std refer to the average and standard deviation values of the 10 random runs of each method.

CONCLUSIONS

We introduced a GAT network to describe the complex synergetic interactions among TFs and lincRNAs in the gene regulatory network and a multitask framework DQSurv for cancer survival prediction. By fully considering the evolutionary relationship between health and cancer, we trained the HealthModel using the healthy tissues and transferred it to the fine-tuning on the training cancer samples as the DiseaseModel. This fine-tuned model learned the specific characteristics of cancers, and we defined the difference between the HealthModel- and DiseaseModel-encoded latent features as the dysregulation distance. We hypothesized that this dysregulation distance fused with the expression levels of the regulatory factors represented a more complete view of cancers. The final survival prediction model DQSurv also benefited from the auxiliary expression prediction task. The experimental data supported the contributions of the main modules and the transfer learning strategy. The pretrained models and the source codes were released to support the feature encodings and survival analysis of transcriptome-based future studies, especially on small datasets.

The main limitation of this proof-of-principle study was the dependence on high-quality transcriptomes, and substantial improvements may be needed to handle the single-cell and spatial transcriptomes with missing values. The exclusion of features with >15% missing values could also be evaluated of how the threshold (15% in this study) affects the pretrained Health model and its applications on the downstream prediction tasks.

Key Points

A pretrained health model is proposed to describe the healthy gene regulation network for survival analysis.
lincRNAs are important regulatory factors in addition to TFs.
The regulations between regulators (TFs and lincRNAs) and target genes can be encoded by the graph attention model.
The distance of regulation relationships between healthy samples and cancer samples is beneficial for survival analysis.
The multitask model combines survival prediction and gene expression prediction to achieve state-of-the-art survival analysis.

ACKNOWLEDGEMENTS

The insightful comments from the four anonymous reviewers have substantially improved the convincingness and reproducibility of this study and are greatly appreciated.

FUNDING

This work was supported by the Senior and Junior Technological Innovation Team (20210509055RQ), the National Natural Science Foundation of China (62072212 and U19A2061), the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC) and the Fundamental Research Funds for the Central Universities, JLU.

DATA AVAILABILITY

The Python implementation and the pretrained Health model and DQSurv model are freely available at http://www.healthinformaticslab.org/supp/resources.php.

Author Biographies

Meiyu Duan is a PhD student in College of Computer Science and Technology, Jilin University. Her research interests include bioinformatics algorithm design and development.

Yueying Wang is a PhD student in College of Computer Science and Technology, Jilin University. Her research interests include biomedical big data modeling.

Dong Zhao is an undergraduate student in School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University. His research interest is about the biomedical predictions.

Hongmei Liu is a professor in School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University. Her research interests focus on the wet and dry laboratory investigations of microbial genomics.

Gongyou Zhang is an assistant professor in School of Biology and Engineering, and Engineering Research Center of Medical Biotechnology, Guizhou Medical University. His research focuses on the bioinformatics and in vitro investigations of microbial transcription regulations.

Kewei Li is an undergraduate student in College of Computer Science and Technology, Jilin University. The research interests include bioinformatics algorithm design and development.

Haotian Zhang is an undergraduate student in College of Computer Science and Technology, Jilin University. The research interests include bioinformatics algorithm design.

Lan Huang is a professor at College of Computer Science and Technology, Jilin University, Changchun, Jilin, China. Her research interests include bioinformatics and systems biology.

Ruochi Zhang is a PhD student in School of Artificial Intelligence, Jilin University. His research interests include bioinformatics algorithm design and biomedical big data.

Fengfeng Zhou is a professor at College of Computer Science and Technology, Jilin University, Changchun, Jilin, China. His research interests include biomedical big data.

REFERENCES

1.

Kourou

K

,

Exarchos

TP

,

Exarchos

KP

, et al.

Machine learning applications in cancer prognosis and prediction

.

Comput Struct Biotechnol J

2015

;

13

:

8

–

17

.

2.

Cruz

JA

,

Wishart

DS

.

Applications of machine learning in cancer prediction and prognosis

.

Cancer Inform

2006

;

2

:

117693510600200

.

Google Scholar

Crossref

WorldCat

3.

Reid

N

,

Cox

D

.

Analysis of Survival Data

.

Chapman and Hall/CRC, Boca Raton, Fla. London

,

2018

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

4.

Cox

D

.

Regression models and life tables

.

J R Stat Soc B

1975

;

34

:

187

–

220

Partial Likelihood, Biometrika 1975, 1972;62:269–76

.

Google Scholar

OpenURL Placeholder Text

WorldCat

5.

Zhang

W

,

Ota

T

,

Shridhar

V

, et al.

Network-based survival analysis reveals subnetwork signatures for predicting outcomes of ovarian cancer treatment

.

PLoS Comput Biol

2013

;

9

:

e1002975

.

6.

Hoerl

AE

,

Kennard

RW

.

Ridge regression: biased estimation for nonorthogonal problems

.

Dent Tech

1970

;

12

:

55

–

67

.

Google Scholar

OpenURL Placeholder Text

WorldCat

7.

Pawitan

Y

,

Bjöhle

J

,

Wedren

S

, et al.

Gene expression profiling for prognosis using Cox regression

.

Stat Med

2004

;

23

:

1767

–

80

.

8.

Efron

B

,

Hastie

T

,

Johnstone

I

,

Tibshirani

R

.

Least angle regression

.

Ann Stat

2004

;

32

:

407

–

99

.

Google Scholar

Crossref

WorldCat

9.

Sohn

I

,

Kim

J

,

Jung

S-H

,

Park

C

.

Gradient lasso for Cox proportional hazards model

.

Bioinformatics

2009

;

25

:

1775

–

81

.

10.

Ishwaran

H

,

Kogalur

UB

,

Blackstone

EH

,

Lauer

MS

.

Random survival forests

.

Ann Appl Stat

2008

;

2

:

841

–

60

.

Google Scholar

Crossref

WorldCat

11.

Ma

B

,

Yan

G

,

Chai

B

,

Hou

X

.

XGBLC: an improved survival prediction model based on XGBoost

.

Bioinformatics

2022

;

38

:

410

–

8

.

12.

Ching

T

,

Zhu

X

,

Garmire

LX

.

Cox-nnet: an artificial neural network method for prognosis prediction of high-throughput omics data

.

PLoS Comput Biol

2018

;

14

:

e1006076

.

13.

Hanczar

B

,

Bourgeais

V

,

Zehraoui

F

.

Assessment of deep learning and transfer learning for cancer prediction based on gene expression data

.

BMC Bioinformatics

2022

;

23

:

262

.

14.

Kim

S

,

Kim

K

,

Choe

J

, et al.

Improved survival analysis by learning shared genomic information from pan-cancer data

.

Bioinformatics

2020

;

36

:

i389

–

98

.

15.

Furlong

LI

.

Human diseases through the lens of network biology

.

Trends Genet

2013

;

29

:

150

–

9

.

16.

Barabasi

A-L

,

Oltvai

ZN

.

Network biology: understanding the cell’s functional organization

.

Nat Rev Genet

2004

;

5

:

101

–

13

.

17.

Zhao

W

,

Langfelder

P

,

Fuller

T

, et al.

Weighted gene coexpression network analysis: state of the art

.

J Biopharm Stat

2010

;

20

:

281

–

300

.

18.

Hu

H

,

Zhang

Q

,

Hu

FF

, et al.

A comprehensive survey for human transcription factors on expression, regulation, interaction, phenotype and cancer survival

.

Brief Bioinform

2021

;

22

:1–13.

Google Scholar

OpenURL Placeholder Text

WorldCat

19.

Yang

Y

,

Han

L

,

Yuan

Y

, et al.

Gene co-expression network analysis reveals common system-level properties of prognostic genes across cancer types

.

Nat Commun

2014

;

5

:

1

–

9

.

Google Scholar

OpenURL Placeholder Text

WorldCat

20.

Li

X

,

Li

M

,

Xiang

J

, et al.

SEPA: signaling entropy-based algorithm to evaluate personalized pathway activation for survival analysis on pan-cancer data

.

Bioinformatics

2022

;

38

:

2536

–

43

.

21.

Ghosh Roy

G

,

Geard

N

,

Verspoor

K

,

He

S

.

MPVNN: mutated pathway visible neural network architecture for interpretable prediction of cancer-specific survival risk

.

Bioinformatics

2022

;

38

:

5026

–

32

.

22.

Xing

X

,

Yang

F

,

Li

H

, et al.

Multi-level attention graph neural network based on co-expression gene modules for disease diagnosis and prognosis

.

Bioinformatics

2022

;

38

:

2178

–

86

.

23.

Su

R

,

Zhu

Y

,

Zou

Q

,

Wei

L

.

Distant metastasis identification based on optimized graph representation of gene interaction patterns

.

Brief Bioinform

2022

;

23

:

bbab468

.

24.

Ramirez

R

,

Chiu

Y-C

,

Hererra

A

, et al.

Classification of cancer types using graph convolutional neural networks

.

Front Phys

2020

;

8

:

203

.

25.

Hu

S

,

Yuan

H

,

Li

Z

, et al.

Transcriptional response profiles of paired tumor-normal samples offer novel perspectives in pan-cancer analysis

.

Oncotarget

2017

;

8

:

41334

–

47

.

26.

Huang

X

,

Stern

DF

,

Zhao

HY

.

Transcriptional profiles from paired normal samples offer complementary information on cancer patient survival - evidence from TCGA pan-cancer data

.

Sci Rep

2016

;

6

:

6

.

Google Scholar

PubMed

OpenURL Placeholder Text

WorldCat

27.

Frost

HR

.

Analyzing cancer gene expression data through the lens of normal tissue-specificity

,

PLoS Comput Biol

2021

;

17

:

e1009085

.

28.

Uhlen

M

,

Zhang

C

,

Lee

S

, et al.

A pathology atlas of the human cancer transcriptome

.

Science

2017

;

357

:

660

.

Google Scholar

Crossref

WorldCat

29.

Goldman

MJ

,

Craft

B

,

Hastie

M

, et al.

Visualizing and interpreting cancer genomics data via the Xena platform

.

Nat Biotechnol

2020

;

38

:

675

–

8

.

30.

Liu

J

,

Lichtenberg

T

,

Hoadley

KA

, et al.

An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics

.

Cell

2018

;

173

:

400

–

416.e11

.

31.

Consortium GT

.

The genotype-tissue expression (GTEx) project

.

Nat Genet

2013

;

45

:

580

–

5

.

Crossref

PubMed

WorldCat

32.

Li

B

,

Dewey

CN

.

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

.

BMC Bioinformatics

2011

;

12

:

323

.

33.

Du

Y

,

Huang

Q

,

Arisdakessian

C

, et al.

Evaluation of STAR and Kallisto on single cell RNA-Seq data alignment

.

G3

2020

;

10

:

1775

–

83

.

34.

Kamimoto

K

,

Adil

MT

,

Jindal

K

, et al.

Gene regulatory network reconfiguration in direct lineage reprogramming

.

Stem Cell Reports

2023

;

18

:

97

–

112

.

35.

Dey

A

,

Sen

S

,

Maulik

U

.

Study of transcription factor druggabilty for prostate cancer using structure information, gene regulatory networks and protein moonlighting

.

Brief Bioinform

2022

;

23

:1–13.

Google Scholar

OpenURL Placeholder Text

WorldCat

36.

Nath

A

,

Geeleher

P

,

Huang

RS

.

Long non-coding RNA transcriptome of uncharacterized samples can be accurately imputed using protein-coding genes

.

Brief Bioinform

2020

;

21

:

637

–

48

.

37.

Zhang

CL

,

Zhu

KP

,

Ma

XL

.

Antisense lncRNA FOXC2-AS1 promotes doxorubicin resistance in osteosarcoma by increasing the expression of FOXC2

.

Cancer Lett

2017

;

396

:

66

–

75

.

38.

Li

Y

,

Jiang

T

,

Zhou

W

, et al.

Pan-cancer characterization of immune-related lncRNAs identifies potential oncogenic biomarkers

.

Nat Commun

2020

;

11

:

1000

.

39.

See

K

,

Tan

WLW

,

Lim

EH

, et al.

Single cardiomyocyte nuclear transcriptomes reveal a lincRNA-regulated de-differentiation and cell cycle stress-response in vivo

.

Nat Commun

2017

;

8

:

225

.

40.

Uhlen

M

,

Oksvold

P

,

Fagerberg

L

, et al.

Towards a knowledge-based human protein atlas

.

Nat Biotechnol

2010

;

28

:

1248

–

50

.

41.

Hernaez

M

,

Blatti

C

,

Gevaert

O

.

Comparison of single and module-based methods for modeling gene regulatory networks

.

Bioinformatics

2020

;

36

:

558

–

67

.

42.

Kanehisa

M

,

Furumichi

M

,

Sato

Y

, et al.

KEGG for taxonomy-based analysis of pathways and genomes

.

Nucleic Acids Res

2023

;

51

:

D587

–

92

.

43.

Han

H

,

Cho

JW

,

Lee

S

, et al.

TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions

.

Nucleic Acids Res

2017

;

46

:

D380

–

6

.

Google Scholar

Crossref

WorldCat

44.

Langfelder

P

,

Horvath

S

.

WGCNA: an R package for weighted correlation network analysis

.

BMC Bioinformatics

2008

;

9

:

559

.

45.

Sulaimanov

N

,

Kumar

S

,

Burdet

F

, et al.

Inferring gene expression networks with hubs using a degree weighted lasso approach

.

Bioinformatics

2019

;

35

:

987

–

94

.

46.

Kogelman

LJ

,

Kadarmideen

HN

.

Weighted interaction SNP hub (WISH) network method for building genetic networks for complex diseases and traits using whole genome genotype data

.

BMC Syst Biol

2014

;

8

:

S5

.

47.

Velikovi

P

,

Cucurull

G

,

Casanova

A

, et al.

Graph attention networks

. arXiv preprint arXiv:1710.10903,

2017

.

OpenURL Placeholder Text

WorldCat

48.

Liu

L

,

Wang

Y-P

,

Wang

Y

, et al.

An enhanced multi-modal brain graph network for classifying neuropsychiatric disorders

.

Med Image Anal

2022

;

81

:

102550

.

49.

Liu

S

,

Liang

Y

,

Gitter

A

.

Loss-balanced task weighting to reduce negative transfer in multi-task learning

.

Natl Conf Artif Intell

2019

;

33

:

9977

–

8

.

Google Scholar

OpenURL Placeholder Text

WorldCat

50.

Wang

J

,

Zou

Q

,

Lin

C

.

A comparison of deep learning-based pre-processing and clustering approaches for single-cell RNA sequencing data

.

Brief Bioinform

2022

;

23

:

1–19

.

Google Scholar

OpenURL Placeholder Text

WorldCat

51.

Sagar

D

,

Aggarwal

P

,

Farswan

A

, et al.

GCRS: a hybrid graph convolutional network for risk stratification in multiple myeloma cancer patients

.

Comput Biol Med

2022

;

149

:

106048

.

52.

Helmreich

JE

.

Regression modeling strategies with applications to linear models, logistic and ordinal regression and survival analysis

(2nd edn).

J Stat Softw

2016

;

70

:

1

–

3

.

Google Scholar

Crossref

WorldCat

53.

Mrak

D

,

Sieghart

D

,

Simader

E

, et al.

Heterologous vector versus homologous mRNA COVID-19 booster vaccination in non-seroconverted immunosuppressed patients: a randomized controlled trial

.

Nat Commun

2022

;

13

:

5362

.

54.

Yagin

FH

,

Cicek

IB

,

Alkhateeb

A

, et al.

Explainable artificial intelligence model for identifying COVID-19 gene biomarkers

.

Comput Biol Med

2023

;

154

:

106619

.

55.

St-Pierre

J

,

Oualkacha

K

,

Bhatnagar

SR

.

Efficient penalized generalized linear mixed models for variable selection and genetic risk prediction in high-dimensional data

.

Bioinformatics

2023

;

39

:

39

.

Google Scholar

Crossref

WorldCat

56.

Chen

Y

,

Li

Y

,

Narayan

R

, et al.

Gene expression inference with deep learning

.

Bioinformatics

2016

;

32

:

1832

–

9

.

57.

Li

X

,

Zhang

X

,

He

W

, et al.

Gene expression prediction based on neighbour connection neural network utilizing gene interaction graphs

.

PLoS One

2023

;

18

:

e0281286

.

58.

Wang

H

,

Li

C

,

Zhang

J

, et al.

A new LSTM-based gene expression prediction model: L-GEPM

.

J Bioinform Comput Biol

2019

;

17

:

1950022

.

59.

Subramanian

A

,

Narayan

R

,

Corsello

SM

, et al.

A next generation connectivity map: L1000 platform and the first 1,000,000 profiles

.

Cell

2017

;

171

:

1437

–

1452.e17

.

60.

Zeng

Y

,

Wei

Z

,

Yu

W

, et al.

Spatial transcriptomics prediction from histology jointly through transformer and graph neural networks

.

Brief Bioinform

2022

;

23

:

23

.

Google Scholar

OpenURL Placeholder Text

WorldCat

61.

Hothorn

T

,

Buhlmann

P

,

Dudoit

S

, et al.

Survival ensembles

.

Biostatistics

2006

;

7

:

355

–

73

.

62.

Hu

Y

,

Tang

C

,

Zhu

W

, et al.

Identification of chromosomal instability-associated genes as hepatocellular carcinoma progression-related biomarkers to guide clinical diagnosis, prognosis and therapy

.

Comput Biol Med

2022

;

148

:

105896

.

63.

Maray

N

,

Ngu

AH

,

Ni

J

, et al.

Transfer learning on small datasets for improved fall detection

.

Sensors

2023

;

23

:1105.

Google Scholar

OpenURL Placeholder Text

WorldCat

64.

Gupta

V

,

Choudhary

K

,

Tavazza

F

, et al.

Cross-property deep transfer learning framework for enhanced predictive analytics on small materials data

.

Nat Commun

2021

;

12

:

6595

.

65.

Pleasance

E

,

Titmuss

E

,

Williamson

L

, et al.

Pan-cancer analysis of advanced patient tumors reveals interactions between therapy and genomic landscapes

.

Nat Cancer

2020

;

1

:

452

–

68

.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model ( https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights)

Download all slides

Month:	Total Views:
July 2023	90
August 2023	76
September 2023	30
October 2023	28
November 2023	17
December 2023	19
January 2024	72
February 2024	49
March 2024	71
April 2024	103
May 2024	76
June 2024	38
July 2024	40
August 2024	14
September 2024	67
October 2024	48
November 2024	30
December 2024	53
January 2025	34
February 2025	47
March 2025	43
April 2025	30
May 2025	8

Article Contents

Orchestrating information across tissues via a novel multitask GAT framework to improve quantitative gene regulation relation modeling for survival analysis

Abstract

INTRODUCTION

MATERIALS AND METHODS

Dataset and preprocessing

Regulatory factors

Construction of the regulatory factor network

Overview of the study

Details of the HealthModel

Multitask survival prediction model DQSurv

Loss functions

Experimental settings

Performance evaluation metrics

RESULTS AND DISCUSSION

Contributions of lincRNAs and TFs to gene expression prediction

Evaluation of DQSurv for gene expression prediction

Ablation study for the DQSurv model

Comparison with the other survival prediction studies

Evaluation of different distance metrics

Independent evaluation of the DQSurv framework

CONCLUSIONS

ACKNOWLEDGEMENTS

FUNDING

DATA AVAILABILITY

Author Biographies

REFERENCES

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Article Contents

Orchestrating information across tissues via a novel multitask GAT framework to improve quantitative gene regulation relation modeling for survival analysis

Abstract

INTRODUCTION

MATERIALS AND METHODS

Dataset and preprocessing

Regulatory factors

Construction of the regulatory factor network

Overview of the study

Details of the HealthModel

Multitask survival prediction model DQSurv

Loss functions

Experimental settings

Performance evaluation metrics

RESULTS AND DISCUSSION

Contributions of lincRNAs and TFs to gene expression prediction

Evaluation of DQSurv for gene expression prediction

Ablation study for the DQSurv model

Comparison with the other survival prediction studies

Evaluation of different distance metrics

Independent evaluation of the DQSurv framework

CONCLUSIONS

ACKNOWLEDGEMENTS

FUNDING

DATA AVAILABILITY

Author Biographies

REFERENCES

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only