Jung Oh Lee, Sung Soo Ahn, Kyu Sung Choi, Junhyeok Lee, Joon Jang, Jung Hyun Park, Inpyeong Hwang, Chul-Kee Park, Sung Hye Park, Jin Wook Chung, Seung Hong Choi, Added prognostic value of 3D deep learning-derived features from preoperative MRI for adult-type diffuse gliomas, Neuro-Oncology, Volume 26, Issue 3, March 2024, Pages 571–580, https://doi.org/10.1093/neuonc/noad202
Abstract
To investigate the prognostic value of spatial features from whole-brain MRI using a three-dimensional (3D) convolutional neural network for adult-type diffuse gliomas.
In a retrospective, multicenter study, 1925 diffuse glioma patients were enrolled from 5 datasets: SNUH (n = 708), UPenn (n = 425), UCSF (n = 500), TCGA (n = 160), and Severance (n = 132). The SNUH and Severance datasets served as external test sets. Precontrast and postcontrast 3D T1-weighted, T2-weighted, and T2-FLAIR images were processed as multichannel 3D images. A 3D-adapted SE-ResNeXt model was trained to predict overall survival. The prognostic value of the deep learning-based prognostic index (DPI), a spatial feature-derived quantitative score, and established prognostic markers were evaluated using Cox regression. Model evaluation was performed using the concordance index (C-index) and Brier score.
The MRI-only survival prediction model based on the DPI achieved C-indices of 0.709 and 0.677 (Brier score = 0.142 and 0.215), and stratification at the median DPI yielded significant survival differences (P < 0.001 and P = 0.002; log-rank test) for the SNUH and Severance datasets, respectively. Multivariate Cox analysis revealed DPI as a significant prognostic factor, independent of clinical and molecular genetic variables: hazard ratio = 0.032 and 0.036 (P < 0.001 and P = 0.004) for the SNUH and Severance datasets, respectively. Multimodal prediction models achieved higher C-indices than models using only clinical and molecular genetic variables: 0.783 vs. 0.774, P = 0.001, SNUH; 0.766 vs. 0.748, P = 0.023, Severance.
The global morphologic feature derived from 3D CNN models using whole-brain MRI has independent prognostic value for diffuse gliomas. Combining clinical, molecular genetic, and imaging data yields the best performance.
The 3D CNN model extracts global morphological prognostic information in diffuse gliomas, independent of local pathology-based information.
The optimal survival prediction was achieved when combining clinical, molecular genetic, and imaging data.
This study develops a deep learning-based survival prediction model for diffuse glioma patients using a 3D convolutional neural network (CNN). The model extracts spatial features from whole-brain MRI without tumor segmentation or slice selection, retaining nontumoral changes such as structural distortion, which may influence the prognosis. Integrating the deep learning-based prognostic index (DPI) demonstrated significant added prognostic value to models incorporating clinical and molecular genetic markers in overall survival prediction. The DPI provides independent prognostic value, offering a novel approach for risk stratification and emphasizing the potential for enhanced MRI data usage in clinical trials and practice. The 3D-CNN model extracts global morphological prognostic information from whole-brain MRI, in addition to, and independent of, localized pathology-based information. The best prognostic performance is achieved by combining clinical, molecular genetic, and imaging data. This may represent a significant advancement, particularly for deep learning-based prediction, as it fully utilizes the rich 3D information extracted from multicenter large-scale MRI data.
Gliomas, the most prevalent primary brain tumor type, demonstrate a remarkable degree of heterogeneity in both intertumoral and even “intratumoral” microenvironments, significantly influencing clinical behavior and prognostic outcomes.1 Pathology-based assessments are inherently localized due to this complexity and thus may be incapable of providing a comprehensive understanding of gross tumor morphology through pathological characterization alone.
MRI, a noninvasive imaging modality, plays a pivotal role in glioma diagnosis and treatment monitoring in clinical practice. Multiple studies2–12 have underscored the prognostic value of machine learning-derived MRI radiomic features in glioma patients. Remarkably, Itakura et al.13 identified 3 distinct imaging phenotypes in glioblastoma using only preoperative MRI; moreover, these phenotypes correlated with survival and were associated with molecular pathways, suggesting the unique prognostic value of MRI. However, most studies, which have primarily been conducted with small, single- or two-center samples, require further validation for generalizability; additionally, they have primarily focused on two-dimensional (2D) MRI models, using only information from segmented tumor areas or slices including tumors,4,9–11 failing to fully exploit the rich, three-dimensional (3D) information present in nontumor areas, such as structural distortion.
Several clinical and molecular genetic markers, including age, sex, Karnofsky performance status (KPS), isocitrate dehydrogenase (IDH) mutation, O6-methylguanine-DNA methyltransferase promoter methylation (mMGMT), and 1p/19q codeletion status,14–16 have established prognostic significance for glioma patients. Nonetheless, further research is required to fully integrate these factors with imaging data for a comprehensive analysis of prognosis.
In light of this, recent developments in deep learning, specifically 3D convolutional neural networks (CNNs), have shown great promise in applications using various medical imaging data ranging from whole slide pathological images to chest CT scans.17–23 By training sufficiently large neural networks with large datasets for many epochs, useful features can be extracted, and effective representations can be obtained.24 The emergence of benchmark datasets, such as the UCSF-PDGM25 and UPenn-GBM26 cohorts, which include survival information, now enables large-scale, multicenter studies of diffuse gliomas to be conducted.
In this study, we aim to bridge this gap by investigating the prognostic significance of deep learning features derived from 3D whole-brain MRI scans in predicting glioma patient survival. Leveraging a multicenter large-scale dataset, we applied a 3D CNN model to create a deep learning-based prognostic index (DPI). We subsequently assessed the DPI’s performance as an independent prognostic factor in external test sets, including a subgroup analysis of glioblastoma.
Methods and Materials
Patients and Clinical/Genetic Variables
In this retrospective study, we collected data from patients diagnosed with diffuse glioma from 5 distinct datasets: the Seoul National University Hospital (SNUH; Dataset 1, n = 708), Severance Hospital (Dataset 2, n = 132), UCSF-PDGM (UCSF, n = 500),25 UPenn-GBM cohort (UPenn, n = 425),26 and TCGA/TCIA (TCGA, n = 160) datasets.27
Two tertiary centers were included as independent external test sets. For these centers, data on established markers were available, including molecular markers for glioblastoma, namely, 1p/19q codeletion, phosphatase and tensin homolog (PTEN) mutation, epidermal growth factor receptor (EGFR) amplification, telomerase reverse transcriptase gene promoter (pTERT) mutation, combined gain of chromosome 7 and loss of chromosome 10 (7+/10−), and histone 3 lysine 27-to-methionine (H3 K27M) mutation. For Dataset 1, we retrospectively identified patients from November 2014 to July 2021 who met the following criteria: (1) histopathologically defined adult-type diffuse glioma according to the 2021 WHO CNS tumor classification; (2) molecularly confirmed IDH mutation and 1p/19q codeletion status; (3) available preoperative MRI including T2-weighted (T2), fluid-attenuated inversion recovery (FLAIR), 3D precontrast T1-weighted (T1), and 3D postcontrast T1-weighted (T1C) images; (4) history of surgical treatment; and (5) available survival data (ie, death and duration) after at least 3 months of follow-up. We excluded individuals under 18 years of age and those with corrupted or missing preoperative MRI scans or survival data. Further details regarding patient inclusion/exclusion are available in Supplementary Figure 1. For Dataset 2, the same inclusion and exclusion criteria were applied for enrolling glioma patients diagnosed from December 2016 to June 2018. The Institutional Review Boards of both the Dataset 1 and Dataset 2 institutions approved the study, and the requirement for informed consent was waived. Clinical and genetic variables were collected for each patient, including age at diagnosis, sex, KPS, extent of surgical resection (EOR), WHO grade, pathology, IDH mutation status, and mMGMT status. Pathology classification adhered to the 2021 WHO CNS tumor classification.
Overall survival (OS) was defined as the duration from the date of glioma diagnosis to the date of death attributed to the disease or the end of follow-up. The 1-year OS rate was defined as the proportion of patients who survived for at least 1 year from the date of their diagnosis, with survival being measured until the date of death due to disease progression. Patients who were alive at the time of analysis or lost to follow-up were censored at the last known date of contact or the last MR examination.
We considered age as a continuous variable, and the EOR was categorized into gross total resection (GTR) and non-GTR for consistency across the datasets. KPS and EOR were not available in the TCGA/TCIA dataset, and KPS was also unavailable in the UCSF dataset.
The three largest public datasets for diffuse gliomas are included as training and validation sets: the UCSF, UPenn, and TCGA datasets. The UCSF dataset is part of the Glioblastoma Precision Medicine Program and includes a searchable database containing clinical, pathologic, and genomic data for all included patients with WHO grade 4 glioblastoma.25 The UPenn-GBM cohort includes advanced MRI, clinical, genomics, and radiomics data.26 The TCGA-GBM dataset provides computed tomography (CT) and MRI data of 160 glioblastoma patients.27 All the datasets included the same 4 structural preoperative MRI sequences and survival data.
Image Preprocessing
All preoperative MR images were acquired from 1.5-T or 3.0-T scanners. Scan parameters are detailed in Supplementary Table 1. For preprocessing, all the sequences were coregistered to the highest-resolution 3D T1C image with 240 × 240 × 155 dimensions after skull stripping; then, after 2 mm isotropic downsampling, histogram normalization was performed. After channelwise concatenation of the standardized images, the 4-channel 3D whole-brain images were processed by a 3D CNN model without slice selection or tumor segmentation to predict the survival of diffuse glioma patients. During training only, we augmented the data by performing random cropping and adding random Gaussian noise.
Neural Network and Evaluation
After evaluating various model architectures, we selected 3D SE-ResNeXt, a 3-dimensional adaptation of the SE-ResNeXt architecture (Supplementary Table 2),28,29 and implemented it with PyTorch (https://github.com/kyuchoi/3D_MRI_survival_glioma). We integrated the 3D convolutional neural network with the nnet-survival model,21 a discrete-time survival model for neural networks. The model was structured to predict hazards across 19 equal quantile intervals for survival days. The 1-year survival probability, computed from the predicted hazards, was used as the metric for patient prognosis in subsequent survival analyses; this metric is referred to as the deep learning-based prognostic index (DPI). For model interpretation, 3D volumetric Gradient-Weighted Class Activation Mapping (Grad-CAM)30 was generated, which allows the model’s attention to be examined along the axial, coronal, and sagittal axes simultaneously.
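To make the link between predicted hazards and the DPI concrete, the following is a minimal sketch of how a survival probability at a given time can be read off per-interval conditional hazards in a discrete-time model. The function name `survival_at`, the toy interval edges, and the linear interpolation inside an interval are illustrative assumptions; they are not taken from the authors' repository, and the study itself uses 19 quantile-based intervals rather than the 3 shown here.

```python
from bisect import bisect_right

def survival_at(t, breaks, hazards):
    """Survival probability at time t from per-interval conditional hazards.

    breaks:  increasing interval edges in days, starting at 0 (n + 1 values)
    hazards: conditional probability of death within each interval (n values)
    """
    if len(breaks) != len(hazards) + 1:
        raise ValueError("breaks must have one more entry than hazards")
    # Survival at each edge: S(b_0) = 1 and S(b_j) = prod_{i<j} (1 - h_i).
    surv = [1.0]
    for h in hazards:
        surv.append(surv[-1] * (1.0 - h))
    if t <= breaks[0]:
        return 1.0
    if t >= breaks[-1]:
        return surv[-1]
    j = bisect_right(breaks, t) - 1  # index of the interval containing t
    frac = (t - breaks[j]) / (breaks[j + 1] - breaks[j])
    # Linear interpolation between edges (an assumption for this sketch).
    return surv[j] + frac * (surv[j + 1] - surv[j])

# Toy example with 3 intervals (hypothetical values).
toy_breaks = [0, 180, 365, 730]
toy_hazards = [0.1, 0.2, 0.5]
toy_dpi = survival_at(365, toy_breaks, toy_hazards)  # (1 - 0.1) * (1 - 0.2)
```

Under this reading, a higher DPI corresponds to a higher predicted probability of being alive at 1 year.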
For model training, the negative log-likelihood of patient survival over all predetermined intervals was utilized as the loss function. To improve generalizability, sharpness-aware minimization,31 which is specifically designed for learning with noisy labels, was used for the optimizer, with AdamP32 as a base optimizer. The training process used a learning rate of 10⁻⁴ and incorporated weight decay with a value of 10⁻⁶. Early stopping was included to prevent overfitting. After training, the model with the highest Harrell’s concordance index (C-index) of the DPI in the validation set was chosen for subsequent analysis in the test set.
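The loss described above can be illustrated for a single patient. The sketch below is a simplified per-patient negative log-likelihood for a discrete-time survival model: each interval the patient fully survives contributes log(1 − h), and the interval in which death occurs contributes log(h). The function name, the toy values, and the handling of censoring (a patient censored partway through an interval contributes nothing for that interval) are assumptions made for illustration; the nnet-survival formulation used in the study may treat partially observed intervals differently.

```python
import math

def neg_log_likelihood(hazards, breaks, time, event, eps=1e-7):
    """Per-patient negative log-likelihood for discrete-time survival.

    hazards: predicted conditional death probability for each interval
    breaks:  interval edges in days (one more entry than hazards)
    time:    follow-up time in days
    event:   True if death was observed, False if censored
    """
    nll = 0.0
    for j, h in enumerate(hazards):
        h = min(max(h, eps), 1.0 - eps)  # keep the logarithms finite
        end = breaks[j + 1]
        if time >= end:
            nll -= math.log(1.0 - h)     # survived the whole interval
        elif event:
            nll -= math.log(h)           # died within this interval
            break
        else:
            break                        # censored within this interval
    return nll

# Toy example (hypothetical values).
toy_breaks = [0, 180, 365, 730]
toy_hazards = [0.1, 0.2, 0.5]
nll_death = neg_log_likelihood(toy_hazards, toy_breaks, time=400, event=True)
nll_censored = neg_log_likelihood(toy_hazards, toy_breaks, time=400, event=False)
```

In training, such per-patient terms would be averaged over a batch; here the death case is penalized more than the censored case because it adds the term for the interval in which death occurred.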
We performed survival analyses not only on the 2 external datasets but also on more uniform subgroups, specifically glioblastoma patients, within each external dataset. To ascertain whether the model employs mere volumetric information of the tumor, we calculated the volume and diagonal length of the enhancing/nonenhancing portions of glioma using HD-GLIO33,34 in these datasets and carried out additional survival analyses. These additional analyses were conducted to further assess the predictive power of the DPI. The overall methodology of the experiment is detailed in the methods section of Supplementary Materials, and configuration of the model architecture is depicted in Supplementary Figure 2.
Statistical Analysis
During the evaluation phase, we first visualized the efficacy of the DPI using Kaplan‒Meier curves for the test sets, which were divided into low-risk and high-risk groups based on their DPIs. The universal cutoff value was set at the median of the DPIs in the validation set. We performed a statistical comparison of the Kaplan‒Meier curves for both groups using the log-rank test. In the following survival analysis, we calculated the hazard ratio (HR) and associated P values for all clinical and genetic variables, as well as the DPI, using a univariate Cox proportional hazards (PH) model. Multivariate Cox regression analyses were conducted to establish whether the DPI was an independent prognostic indicator when combined with other well-established predictive variables. Model selection used a brute-force search guided by the Akaike information criterion. The efficacy of the continuous DPI and multivariate Cox PH model was evaluated using 2 metrics: the C-index with 95% confidence interval (CI) and the Brier score. The C-indices of the selected models were compared with those of the models selected using only clinical and genetic variables, using Student’s t-test for dependent samples.35 Exploratory data analysis for DPI and time-to-death duration was performed, including correlation analysis, histograms, and kernel density estimation, using the Python package seaborn 0.11.2.
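As an illustration of the C-index used above, the following sketch computes Harrell's concordance index for a score where higher values indicate better prognosis (as with the DPI, a survival probability). It is a plain O(n²) implementation written for clarity; the function name and toy data are hypothetical, ties in survival time are simply skipped, and the study's own analysis was carried out in R rather than with this code.

```python
def harrell_c_index(times, events, scores):
    """Harrell's C-index for scores where higher means longer expected survival.

    A pair (i, j) is comparable when patient i died and did so before
    patient j's last follow-up. It is concordant when the patient who
    died earlier also has the lower score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy example (hypothetical values): survival days, death indicator, DPI-like score.
toy_times = [100, 200, 300, 400]
toy_events = [1, 1, 0, 1]
toy_scores = [0.2, 0.7, 0.9, 0.6]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
```

In this toy example, 4 of the 5 comparable pairs are ordered correctly; the pair formed by the second and fourth patients is discordant.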
In the subgroup analysis, the same statistical methodology was applied to the subgroup of IDH-wild-type glioblastoma from external test sets. Statistical analysis was performed using R version 4.1.1 and the following packages: survival 3.2-13, dplyr 1.0.7, ggplot2 3.3.5, and ggfortify 0.4.13. A P value of less than 0.05 was considered to indicate a statistically significant difference.
Results
Patient Characteristics
Among 2177 patients across all 5 datasets, 1925 (88.4%) met the inclusion criteria and were analyzed in this study, as detailed in Supplementary Figure 1. The study participants were divided into 3 sets: 1085 patients in the training/validation set combining 3 datasets (mean age, 58.3 ± 14.5 years; median OS, 468 days [interquartile range (IQR), 233–831 days]); 708 patients in external test set 1 (mean age, 53.8 ± 15.2 years; median OS, 799 days [IQR, 402–2243 days]); and 132 patients in external test set 2 (mean age, 58.3 ± 13.6 years; median OS, 629 days [IQR, 389–1362 days]). The median tumor sizes were 31.4 mm (range, 3.6–186.8 mm) and 32.8 mm (range, 2.2–80.3 mm) in external test sets 1 and 2, respectively. The characteristics of these 3 datasets are presented in Table 1. Details of all 5 datasets can be found in Supplementary Table 3.
Table 1. Clinical and molecular genetic variables of patients in the training/validation and external test sets
Variables | Training/validation set (n = 1085) | External test set 1 (n = 708) | External test set 2 (n = 132) | P value |
---|---|---|---|---|
Age (years), mean ± SD | 58.3 ± 14.5 | 53.8 ± 15.2 | 58.3 ± 13.6 | *<0.001 |
Sex | | | | 0.943 |
Male, n (%) | 636 (58.6) | 410 (57.9) | 76 (57.6) | |
Female, n (%) | 449 (41.4) | 298 (42.1) | 56 (42.4) | |
KPS | | | | <0.001 |
≥80, n (%) | 52 (4.8) | 492 (69.5) | 77 (58.3) | |
<80, n (%) | 14 (1.3) | 124 (17.5) | 55 (41.7) | |
Not available, n (%) | 1019 (93.9) | 92 (13.0) | 0 (0) | |
Extent of surgical resection | | | | <0.001 |
GTR, n (%) | 467 (43.0) | 418 (59.0) | 89 (67.4) | |
Non-GTR, n (%) | 440 (40.6) | 290 (41.0) | 43 (32.6) | |
Not available, n (%) | 178 (16.4) | 0 (0) | 0 (0) | |
WHO grades | | | | <0.001 |
Grade 4, n (%) | 887 (81.8) | 443 (62.6) | 111 (84.1) | |
Grade 3, n (%) | 98 (9.0) | 187 (26.4) | 13 (9.8) | |
Grade 2, n (%) | 100 (9.2) | 78 (11.0) | 8 (6.1) | |
Glioma pathology | | | | <0.001 |
Glioblastoma, IDH wild type, n (%) | 848 (78.2) | 396 (55.9) | 103 (78.0) | |
Astrocytoma, IDH-mutant, n (%) | 115 (10.6) | 112 (15.8) | 19 (14.4) | |
Oligodendroglioma, IDH-mutant, and 1p/19q-codeleted, n (%) | 49 (4.5) | 77 (10.9) | 6 (4.5) | |
Others, n (%) | 73 (6.7) | 123 (17.4) | 4 (3.0) | |
IDH mutation | | | | <0.001 |
Wild type, n (%) | 893 (82.3) | 514 (72.6) | 112 (84.8) | |
Mutated, n (%) | 192 (17.7) | 194 (27.4) | 20 (15.2) | |
mMGMT | | | | <0.001 |
Unmethylated, n (%) | 342 (31.5) | 354 (50.0) | 75 (56.8) | |
Methylated, n (%) | 496 (45.7) | 346 (48.9) | 57 (43.2) | |
Not available, n (%) | 247 (22.8) | 8 (1.1) | 0 (0) | |
Median overall survival, days (Q1–Q3) | 468 (233–831) | 799 (402–2243) | 629 (389–1362) | **<0.001 |
Abbreviations: SD, standard deviation; CI, confidence interval; IDH, isocitrate dehydrogenase; MGMT, O6-methylguanine-DNA methyltransferase.
*Indicates P value for significant difference in age using ANOVA.
**Indicates P value for significant difference in overall survival using the log-rank test.
Other P values are for significant differences using the chi-square test.
Deep Survival Model Performance and Risk Stratification
For the deep survival prediction model (ie 3D CNN model) with image data only, the C-indices in training, validation, and 2 external sets are described in Table 2. The high-risk group was defined as the set of individuals with a DPI below the median DPI in the validation set (0.569), while the low-risk group comprised those with a DPI above the median. The log-rank test revealed a significant difference in survival outcomes between the low-risk and high-risk groups for both test sets: P < 0.001 for test set 1; P = 0.002 for test set 2. Kaplan‒Meier survival curves for the 2 distinct groups, low-risk, and high-risk, for each test set are also shown (Figure 1).
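The stratification rule described above can be sketched in a few lines. The cutoff of 0.569 is the validation-set median reported in the text; the function names and the toy DPI values are hypothetical, and assigning a DPI exactly equal to the cutoff to the low-risk group is an assumption, since the text only specifies "below" and "above" the median.

```python
from statistics import median

VALIDATION_MEDIAN_DPI = 0.569  # cutoff reported in the text

def cutoff_from_validation(validation_dpis):
    """Derive the universal cutoff as the median DPI of the validation set."""
    return median(validation_dpis)

def stratify(dpis, cutoff=VALIDATION_MEDIAN_DPI):
    """Label each patient; a lower DPI (1-year survival probability) means higher risk."""
    return ["high-risk" if d < cutoff else "low-risk" for d in dpis]

# Toy example (hypothetical DPI values for 4 test-set patients).
toy_labels = stratify([0.30, 0.80, 0.569, 0.10])
toy_cutoff = cutoff_from_validation([0.2, 0.5, 0.6, 0.9])
```

Because the cutoff is fixed on the validation set and then applied unchanged to both external test sets, the two risk groups need not be equal in size within a given test set.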
Table 2. Comparison of model performance using different training/validation and test sets
Dataset | C-index (95% CI) | Brier score |
---|---|---|
Training | 0.747 (0.725, 0.772) | 0.195 |
Validation | 0.731 (0.664, 0.802) | 0.199 |
External test set 1 | 0.709 (0.683, 0.735) | 0.142 |
External test set 2 | 0.677 (0.581, 0.766) | 0.215 |
The numbers in the parentheses represent the 95% confidence interval.

Figure 1. Kaplan‒Meier curves of the low-risk group and high-risk group stratified according to DPI in (A) external test set 1 and (B) external test set 2. The P value in each figure is the result of the log-rank test between the survival curves.
Model Interpretation
Figures 2 and 3 demonstrate the 3D volumetric Grad-CAM output overlaid on FLAIR and T1C images, reflecting the model’s areas of attention in Datasets 1 and 2, respectively. The model was not given any information about tumor location, but the attention maps demonstrate the model’s ability to locate tumor areas (Figure 2). Interestingly, the model attended not only to enhancing tumors but also to nonenhancing tumors, specifically for cases of IDH-wt glioblastoma that showed subtle enhancement (Figure 3); additionally, for solid and cystic tumors, the model allocated more attention to the solid portion of the tumor (Supplementary Figure 5). Other representative cases with 3D Grad-CAM visualizations are provided in Supplementary Figure 6, and a video demonstration of the 3D Grad-CAM output from Figure 3 is provided in Supplementary Video 1.

Figure 2. Grad-CAM visualization for model interpretation in the following representative case from Dataset 1: a 60-year-old woman with WHO grade 4 (KPS = 90) IDH-wt glioblastoma who died 1366 days after gross total resection. The model correctly predicted no death within a year (DPI, 0.483). Both enhancing tumor and surrounding nonenhancing tumor areas were attended, as shown in the overlay on (A) 3D postcontrast T1-weighted and (B) T2 FLAIR images.

Figure 3. Grad-CAM visualization for model interpretation in the following representative case from Dataset 2: a 27-year-old man with WHO grade 4 (KPS = 100) IDH-mut astrocytoma who died 2017 days after gross total resection. The model correctly predicted no death within a year (DPI, 0.479). The tumor showed only subtle enhancement, and relatively T2 hypointense nonenhancing tumor areas were shown as active in the Grad-CAM overlay on (A) 3D postcontrast T1-weighted and (B) T2 FLAIR images.
Prognostic Value of DPI
Table 3 presents the results of both univariate and multivariate Cox regression analyses, which included clinical and molecular variables in addition to the DPI. In the multivariate Cox regression, the DPI was consistently identified as an independent prognostic factor in both test sets: HR = 0.032 (95% CI, 0.010–0.103), P < 0.001 for external test set 1; and HR = 0.036 (95% CI, 0.004–0.352), P = 0.004 for external test set 2. Multivariate Cox PH models that incorporated DPIs showed higher C-indices than those fitted exclusively with clinical and molecular variables; the C-indices without vs. with the DPI were as follows: (1) 0.774 (95% CI, 0.751–0.796) vs. 0.783 (95% CI, 0.761–0.804) for test set 1 (P = 0.001); and (2) 0.748 (95% CI, 0.696–0.799) vs. 0.766 (95% CI, 0.718–0.813) for test set 2 (P = 0.023). Additional results of the univariate analyses can be found in Supplementary Table 4. Further exploratory data analysis for DPI and time-to-death duration is given in Supplementary Figure 4.
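Because the DPI is a probability between 0 and 1, a hazard ratio quoted per 1-unit change describes the contrast between DPI = 0 and DPI = 1, which is why the values look extreme. The sketch below rescales a Cox hazard ratio to a smaller increment using HR(Δ) = HR^Δ. It assumes that the DPI entered the Cox model untransformed on its native 0 to 1 scale, which the text does not state explicitly; the function name and the choice of a 0.1 increment are illustrative.

```python
import math

def rescale_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio for a change of `delta` in a covariate.

    In a Cox model the log-hazard is linear in the covariate, so
    HR(delta) = exp(delta * log(HR per unit)) = HR_per_unit ** delta.
    """
    return math.exp(delta * math.log(hr_per_unit))

# Reported per-unit hazard ratios for the DPI (external test sets 1 and 2).
hr_set1_per_tenth = rescale_hazard_ratio(0.032, 0.1)
hr_set2_per_tenth = rescale_hazard_ratio(0.036, 0.1)
```

Under that assumption, a 0.1 increase in the DPI would correspond to a hazard ratio of roughly 0.71 in external test set 1 and roughly 0.72 in external test set 2, that is, about a 30% lower hazard per 0.1 increase in predicted 1-year survival probability.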
Variables | External test set 1, without DPI: HR (95% CI) | P value | External test set 1, with DPI: HR (95% CI) | P value | External test set 2, without DPI: HR (95% CI) | P value | External test set 2, with DPI: HR (95% CI) | P value |
---|---|---|---|---|---|---|---|---|
Age | 1.020 (1.012, 1.028) | <0.001 | 1.011 (1.003, 1.020) | 0.010 | 1.024 (1.003, 1.045) | 0.026 | 1.018 (0.998, 1.040) | 0.084 |
Sex (F = 1) | 0.818 (0.663, 1.010) | 0.061 | 0.792 (0.643, 0.976) | 0.029 | 0.499 (0.322, 0.774) | 0.002 | 0.480 (0.310, 0.744) | 0.001 |
Total resection | 0.846 (0.683, 1.049) | 0.127 | 0.546 (0.354, 0.843) | 0.006 | 0.596 (0.382, 0.929) | 0.022 | ||
KPS | 0.978 (0.971, 0.986) | <0.001 | 0.983 (0.975, 0.990) | <0.001 | 0.979 (0.967, 0.992) | 0.001 | 0.975 (0.961, 0.988) | <0.001 |
WHO grades | 1.551 (1.111, 2.166) | 0.010 | 1.523 (1.087, 2.133) | 0.014 | ||||
Pathology—astrocytoma, IDH-mutant | 0.576 (0.368, 0.903) | 0.016 | 0.628 (0.399, 0.991) | 0.045 | ||||
Pathology—oligodendroglioma, IDH-mutant, and 1p19q-codeleted | 0.105 (0.035, 0.318) | <0.001 | 0.114 (0.038, 0.347) | <0.001 | ||||
Pathology—others | 0.833 (0.547, 1.269) | 0.396 | 0.883 (0.581, 1.343) | 0.562 | ||||
IDH | 0.050 (0.007, 0.377) | 0.004 | 0.040 (0.005, 0.307) | 0.002 | ||||
MGMT | 0.501 (0.401, 0.627) | <0.001 | 0.508 (0.405, 0.636) | <0.001 | ||||
DPI | - | - | 0.032 (0.010, 0.103) | <0.001 | - | - | 0.036 (0.004, 0.352) | 0.004 |
C-index | 0.774 (0.751, 0.796) | 0.783 (0.761, 0.804) | 0.748 (0.696, 0.799) | 0.766 (0.718, 0.813) | ||||
P value | 0.001 | 0.023 |
Empty cells indicate variables eliminated from the final multivariate Cox proportional hazards model by the Akaike information criterion. The numbers in parentheses represent the 95% confidence interval. The HR for pathological categorical variables is calculated relative to a reference category, specifically glioblastoma, IDH-wild type. The P values at the bottom indicate the comparison of C-indices between the models with and without DPI.
Abbreviations: HR, hazard ratio; KPS, Karnofsky performance status; IDH, isocitrate dehydrogenase; MGMT, O6-methylguanine-DNA methyltransferase.
In supplementary survival analyses examining volumetric information of enhancing/nonenhancing portions of gliomas, DPI emerged as a significant prognostic factor in both external test sets: HR = 0.042 (95% CI, 0.012–0.140), P < 0.001 for external test set 1; and HR = 0.036 (95% CI, 0.004–0.352), P < 0.001 for external test set 2. In test set 1, the volume of the enhancing tumor was also a significant prognostic factor (HR: 1.011, 95% CI: 1.005–1.017, P < 0.001), unlike in test set 2, where no volumetric factors were significant. Detailed results of these survival analyses are presented in Supplementary Table 5.
In external test set 1, additional molecular markers such as 1p/19q codeletion, PTEN mutation, EGFR amplification, pTERT mutation, 7+/10−, and H3 K27M mutation were available for 466 out of 708 patients. In the multivariate analysis for these 466 patients, the DPI consistently appeared as an independent factor with an HR of 0.05 (95% CI, 0.014–0.179) and a P value of <0.001. Along with DPI, age, IDH mutation, mMGMT, KPS, 1p/19q codeletion, and H3 K27M mutation were also selected as independent risk factors. The pTERT mutation was significant in the univariate analysis but was not retained as an independent risk factor in the multivariate analysis. The HRs of molecular variables in univariate analysis were as follows: pTERT, 1.388 (95% CI, 1.043–1.847), P = 0.024; EGFR, 0.94 (95% CI, 0.732–1.207), P = 0.63; PTEN, 1.061 (95% CI, 0.709–1.588), P = 0.77; 1p/19q codeletion, 0.075 (95% CI, 0.033–0.169), P < 0.001; H3 K27M, 4.661 (95% CI, 2.06–10.542), P < 0.001. There were no cases with the 7+/10− alteration. Detailed results of both univariate and multivariate analyses with these molecular markers for external test set 1 are shown in Supplementary Table 6.
Subgroup Analysis
Subgroup analysis was conducted in the glioblastoma (IDH wild type, WHO grade 4) group.
In the glioblastoma subgroup, the DPI from the model trained on diffuse glioma patients was consistently identified as an independent risk factor. In external test set 1, a total of 304 patients had glioblastoma, and the selected model from the multivariate Cox analysis included age, mMGMT status, KPS, sex, and the DPI as variables. The HR for the DPI was 0.025 (95% CI, 0.005–0.129), and the model exhibited a higher C-index of 0.688 (95% CI, 0.653–0.723) than the model fitted without the DPI, 0.675 (95% CI, 0.639–0.711); however, the difference was only marginally significant (P = 0.052). In external test set 2, the multivariate Cox analysis of the 100 glioblastoma patients revealed DPI as an independent risk factor along with age, sex, KPS, and extent of surgical resection: HR for DPI = 0.054 (95% CI, 0.005–0.572), P = 0.015. The C-index of 0.683 (95% CI, 0.627–0.740) achieved by the Cox model with DPI was nominally higher than that of the model fitted with only clinical/molecular variables, 0.664 (95% CI, 0.603–0.724), but the difference was not significant (P = 0.067). Detailed results of the subgroup analysis can be found in Supplementary Table 7.
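The C-indices compared above summarize how often a model ranks pairs of patients in the correct order of survival. As a reading aid, a minimal sketch of Harrell's C-index is given below; it is not the authors' evaluation code, the function name `harrell_c_index` and the toy data are hypothetical, and it assumes that a higher DPI means a lower risk, so the negated DPI is used as the risk score.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs whose risk ordering matches survival.

    A pair (i, j) is comparable when patient i has an observed event
    strictly before patient j's follow-up time. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [5, 10, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_dpi = [0.2, 0.5, 0.4, 0.7, 0.9]      # assumed: higher DPI = better prognosis
toy_risk = [-score for score in toy_dpi]  # negate so that higher value = higher risk

# 7 of the 8 comparable pairs are ordered correctly.
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the differences of about 0.01 to 0.02 reported here should be read.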
Discussion
To our knowledge, this study using a 3D CNN in patients with adult-type diffuse gliomas is the first large-scale multicenter study to show that whole-brain MRI has independent prognostic value and to validate this finding with 2 independent external test sets. In terms of prognostication, the 3D CNN can extract global spatial features when trained on large datasets; these features complement the clinical factors of patients and the local information from molecular tumor pathology. In multivariate Cox analyses, the DPI was found to be a reliable and independent prognostic factor alongside traditional prognostic variables, including age, sex, EOR, KPS, IDH, and mMGMT.16,36,37 The fact that many of the traditional variables were selected in the multivariate Cox model supports the credibility of this study.
This study specifically investigates whether a “deep” radiomics feature can be extracted using a 3D CNN model. To this end, multivariate Cox regression was performed with not only IDH but also other established markers. With our data, it was possible to confirm that the DPI is a prognostic factor independent of IDH and to compare its effect quantitatively through the HR, which was 3.11 times higher for the higher DPI group than for the lower DPI group and 6.47 times higher for the IDH wild-type group than for the IDH-mutant group; similar results were obtained in 2 independent external test sets. Given that MRI is the only noninvasive monitoring modality used in routine clinical practice, our findings may have a substantial impact on patient management, allowing clinicians to better leverage all the information that can be obtained from MRI.
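As a reading aid for the continuous-DPI hazard ratios in Table 3, the worked example below rescales the per-unit HR to a 0.1-point change in the DPI. It is a sketch under assumptions not confirmed by the text: that the DPI entered the Cox model untransformed on a 0 to 1 scale (the limitations describe it as a 1-year survival probability) and with a linear effect on the log hazard.

```latex
\begin{align*}
h(t \mid \mathrm{DPI}) &= h_0(t)\,\exp(\beta \cdot \mathrm{DPI} + \dots) \\
\mathrm{HR}_{\text{per unit}} &= e^{\beta} = 0.032
  \;\Rightarrow\; \beta = \ln 0.032 \approx -3.44 \\
\mathrm{HR}_{\Delta = 0.1} &= e^{0.1\beta} = 0.032^{0.1} \approx 0.71
\end{align*}
```

Under the same assumptions, the HR of 0.036 in external test set 2 corresponds to roughly 0.72 per 0.1-point increase, that is, a hazard about 28% to 29% lower for each 0.1 increase in the DPI.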
Many radiomics studies2–7,9–11,38,39 have examined the association between MRI features and the survival of patients with diffuse glioma. These studies2–6,10,39 have consistently demonstrated the presence of additional MRI information that is independent of molecular genetic and clinical variables. However, some of these studies6,38,39 have employed supplementary sequences such as diffusion tensor imaging, perfusion imaging, or resting-state functional MRI, which require additional acquisition time. Others have relied solely on segmented tumor images or selected slices from whole-brain MRI scans.4,9–11 Despite advancements in automatic brain tumor segmentation,33 models utilizing 2D tumor images require preprocessing, such as the selection of arbitrary slices, which may limit generalizability and complicate their application in clinical practice. In contrast, the model in this study used only T1, T1C, T2, and FLAIR sequences, incorporating all slices in the sequences without tumor segmentation. In addition, most previous studies have utilized no more than a few hundred MRI scans, which potentially raises concerns about the validity of the models, despite the high performance of some. Our study utilized a dataset of 1925 patients, with 840 allocated to the test sets, which allowed the prognostic power of MRI features to be examined more robustly than in previous papers.
In terms of patient characteristics, all variables except sex differed significantly among the datasets. Particularly notable variables were the median overall survival, the presence of IDH mutations, and mMGMT status, which are pivotal factors in survival analysis among glioma patients.40–43 The differences in overall survival may be partially attributed to varying proportions of glioblastoma cases across the different datasets. The survival curves of the glioblastoma subgroup in each dataset had more similar shapes, as illustrated in Supplementary Figure 3. Despite the significant disparity in the distributions of several variables among the datasets, when executing a range of multivariate regression analyses incorporating varied sets of variables, we consistently found that the DPI remained a significant predictive factor across all models (Table 3 and Supplementary Table 3). This robust finding indicates the strong influence of the DPI in each regression model, regardless of the variable combination used.
The DPI, which was generated solely from images via the deep learning model, demonstrated fair performance with a C-index ranging between 0.67 and 0.71 across the test sets. The results derived from 3D Grad-CAM indicated that the model reasonably incorporated the tumor portion of the image into its survival prediction. From the Grad-CAM results, we determined that the model successfully located the tumor areas without requiring any other information about the tumor itself; only whole-brain MRI was provided (Figure 2). Moreover, nonenhancing tumors were activated when the tumor had no enhancement, which supports the need for multimodal MRI sequences for survival prediction using MRI in diffuse gliomas (Figure 3), and the solid portion received more attention than the cystic portion in enhancing tumors (Supplementary Figure 5).
Moreover, performing risk stratification by the DPI revealed significant differences among the stratified groups in both test sets. This indicates the model’s robustness and efficacy in differentiating risk levels.
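The stratified comparison described above can be illustrated with a short sketch that splits patients at the median DPI and estimates a Kaplan-Meier curve for each group. This is an illustration only, not the authors' code: the median cut-off, the helper names `split_by_median` and `kaplan_meier`, and the toy data are assumptions, and the log-rank test used in the study is not reproduced here.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(dpi: Sequence[float]) -> List[bool]:
    """Return True for patients at or above the median DPI (assumed cut-off)."""
    cut = median(dpi)
    return [score >= cut for score in dpi]


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate: list of (event time, survival probability)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [3, 6, 8, 12, 15, 20]
toy_events = [1, 1, 1, 0, 1, 0]
toy_dpi = [0.2, 0.3, 0.45, 0.6, 0.7, 0.9]

is_high = split_by_median(toy_dpi)
for label, flag in (("lower DPI", False), ("higher DPI", True)):
    group_times = [t for t, h in zip(toy_times, is_high) if h is flag]
    group_events = [e for e, h in zip(toy_events, is_high) if h is flag]
    print(label, kaplan_meier(group_times, group_events))
```

In practice the two curves would then be compared with a log-rank test, as reported in the abstract for both external test sets.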
Most of the variables analyzed in this study emerged as significant risk factors in the univariate analyses. However, a few variables, such as pTERT, EOR of GTR, and mMGMT, were sporadically omitted from the selected variables in the multivariate analyses. In contrast, DPI consistently emerged as an independent risk factor in multivariate analyses. Moreover, models incorporating DPI significantly outperformed those that did not include DPI in terms of the C-index. DPI also remained a significant prognostic factor after the tumor’s volumetric information was included, suggesting that DPI provides information beyond merely picking up contrast enhancement or volume of the tumors. These findings suggest that the DPI, derived from MRI data, carries additional information that could complement conventional clinical factors and molecular markers, implying that the integration of DPI into predictive models could enhance their prognostic accuracy.
In the subgroup analyses on glioblastoma, models that incorporated the DPI achieved nominally higher C-indices than those that did not, although the differences did not reach statistical significance (P = 0.052 and P = 0.067 in external test sets 1 and 2, respectively). Glioblastoma, which was recently redefined molecularly based on IDH mutation status, has rarely been studied in its purest form, the IDH wild type. Survival studies on this distinct group remain scarce.16,37,44 To our knowledge, no existing study has validated that features derived from deep learning could have predictive power specifically for IDH wild-type glioblastoma. Given this context, the DPI risk stratification analysis in the molecularly defined glioblastoma subgroup demonstrates the potential of deep learning using MRI data to predict survival outcomes.
This study has some limitations. First, in patients with glioma, survival outcomes are largely dependent on the efficacy of treatment. We enrolled patients who underwent surgical intervention and received temozolomide as adjuvant therapy. However, due to the inherent challenges in assembling a well-controlled study population, our investigation was not designed as a prospective study. To account for this limitation, we incorporated the EOR as an additional variable in the multivariate Cox analysis. Further improvement could be obtained by incorporating center-specific effects into the Cox PH or deep survival prediction model or by performing data harmonization. Second, the performance of the DPI was evaluated in the context of Cox analyses. Utilizing nonproportional models or deep learning models to construct a new model combining clinical/genetic variables with the DPI could further enhance the performance of predicting patient outcomes. Finally, we used the 1-year survival probability as the DPI, whereas a more comprehensive analysis considering various timepoints could provide additional insights into the prognostic performance of the DPI.
In conclusion, our study, which utilized large, validated, and publicly accessible datasets such as TCIA, UCSF, and UPenn, presents compelling evidence that deep learning features derived from preoperative whole-brain MRI scans are an independent prognostic factor for survival in adult-type diffuse glioma, including glioblastoma, after accounting for extensive clinical and molecular variables.
Acknowledgments
The authors thank Eun Jung Choi, Ga Hyun Kim, and He Ryun Sohn for their invaluable assistance with data collection and analysis.
Conflict of interest statement
None declared.
Funding
This work was supported by the SNUH Research Fund (No. 04-2022-0520) (K.S.C.); the Phase III (Postdoctoral fellowship) grant of the SPST (SNU-SNUH Physician Scientist Training) Program (K.S.C.); the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. RS-2023-00251022) (K.S.C.); the Technology Innovation Program (20011878, Development of Diagnostic Medical Devices with Artificial Intelligence Based Image Analysis Technology) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea) (J.W.C.); and the Bio & Medical Technology Development Program of the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021M3E5D2A01022493) (I.H.).
Author contributions
Study design: K.S.C., S.S.A., and J.O.L. Data collection, analysis, interpretation: K.S.C., S.S.A., S.H.C., J.W.C., J.O.L., S.H.P., and C.K.P. Figures: J.O.L., K.S.C., and S.S.A. Manuscript writing: J.O.L., K.S.C., and S.S.A. All authors revised and approved the final version of the manuscript.
Data availability
The data supporting the findings of this study are available within the paper and its Supplementary Materials. The raw data of the study are available from the corresponding author upon reasonable request. The code for the experiments and the 3D model in this paper is available on GitHub at https://github.com/kyuchoi/3D_MRI_survival_glioma.
References
Author notes
Jung Oh Lee and Sung Soo Ahn contributed equally to this work.