Abstract

BACKGROUND

The majority of children with ulcerative colitis (UC) present with extensive colitis at diagnosis, but the response to therapy is heterogeneous. Identifying the optimal window for biologic treatment and stratifying patients by risk remain unmet needs.

AIM

To develop a pathology-based histomic model to predict corticosteroid-free (CSF) clinical remission with mesalamine alone at one year.

METHODS

A total of 292 hematoxylin and eosin-stained, treatment-naïve diagnostic rectal mucosal biopsies from the multi-center PROTECT study were digitized. Whole slide images (WSIs) underwent two-step pre-processing: (a) stain normalization and (b) informative patch selection (size 512x512). We trained machine learning (ML) models for patch-level classification using 250 histomic features (texture, color, histogram, and nuclei features) with 5-fold cross-validation. Feature importance was determined by the Gini index, and we re-trained the classifier using the top features. Slide-level prediction was defined by threshold voting. Performance metrics were evaluated at the patch and WSI levels.
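As an illustration of the slide-level step, the following is a minimal sketch of threshold voting, not the study's implementation. It assumes each patch has already received a binary prediction (1 = remission) and that a slide is called "remission" when the fraction of remission patches meets or exceeds the threshold. The function name, the input layout, and the use of `>=` rather than `>` are assumptions; the default of 0.48 is the remission ratio threshold reported in the results.

```python
from collections import defaultdict


def slide_level_predictions(patch_predictions, threshold=0.48):
    """Aggregate patch-level labels into slide-level calls by threshold voting.

    patch_predictions: iterable of (slide_id, label) pairs, label 1 = remission.
    Returns {slide_id: (remission_ratio, slide_label)}.
    Hypothetical helper; the inclusive comparison (>=) is an assumption.
    """
    # slide_id -> [number of remission patches, total number of patches]
    counts = defaultdict(lambda: [0, 0])
    for slide_id, label in patch_predictions:
        counts[slide_id][0] += int(label == 1)
        counts[slide_id][1] += 1

    results = {}
    for slide_id, (n_remission, n_total) in counts.items():
        ratio = n_remission / n_total
        results[slide_id] = (ratio, int(ratio >= threshold))
    return results


# Toy example: slide "A" has 2 of 4 remission patches, slide "B" has 1 of 4.
patches = [("A", 1), ("A", 1), ("A", 0), ("A", 0),
           ("B", 1), ("B", 0), ("B", 0), ("B", 0)]
calls = slide_level_predictions(patches)
```

In this toy input, slide "A" has a remission ratio of 0.5 and is called remission, while slide "B" has a ratio of 0.25 and is not.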

A total of 161 patients underwent high-throughput RNA sequencing to define rectal gene expression. We undertook unsupervised weighted gene co-expression network analysis (WGCNA) to discover networks of co-expressed genes with shared biologic functions correlated with histomic features, histological traits, and outcomes.
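WGCNA typically summarizes each module by a per-sample eigengene and then correlates that eigengene with sample traits. The sketch below illustrates only that correlation step under stated assumptions: the eigengene values are taken as given, Pearson correlation is assumed (the abstract does not name the correlation measure), and all names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (sd_x * sd_y)


def module_trait_correlations(eigengenes, traits):
    """Correlate every module eigengene with every trait.

    eigengenes: {module_name: per-sample eigengene values}
    traits:     {trait_name: per-sample trait values, same sample order}
    Returns {(module_name, trait_name): correlation}.
    """
    return {
        (module, trait): pearson(e_values, t_values)
        for module, e_values in eigengenes.items()
        for trait, t_values in traits.items()
    }


# Hypothetical values for four samples, for illustration only.
eigengenes = {"M1": [1.0, 2.0, 3.0, 4.0]}
traits = {"nuclei_area": [2.0, 4.0, 6.0, 8.0],
          "severity_score": [8.0, 6.0, 4.0, 2.0]}
table = module_trait_correlations(eigengenes, traits)
```

A module-trait table of this kind is what a module-trait relationship heatmap displays, with one cell per module and trait pair.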

RESULTS

A total of 187,571 informative patches from 292 patients (male: 55%; age: 12.7 y (IQR: 11-15); CSF remission: 41%) were used to train 23 ML classifiers. The best patch-level model trained on all 250 features was a random forest (RF). At a remission ratio threshold of 0.48, the WSI-level area under the receiver operating characteristic curve (AUROC) was 0.90 (95% CI: 0.70, 1.00), with accuracy 90.4%, precision 90.9%, and recall 84.7%. The top 18 features were identified and used to re-train the model; the corresponding WSI AUROC was 0.87 (95% CI: 0.72, 1.00), with accuracy 90.1%, precision 89.4%, and recall 83.9% (Fig. 1). We re-trained the 18-feature model on an independent real-world dataset of 131 UC patients; the WSI AUROC was 0.85 (95% CI: 0.74, 1.00) and the accuracy was 88.5%.

Figure 1: Histomic feature importance represented by the SHapley Additive exPlanations (SHAP) values. The figure shows the direction of the relationship between a variable and outcome. Positive SHAP values are indicative of clinical remission. As demonstrated by the color bar, values with higher importance are shown in red, while lower values are shown in blue.

Of the 13 modules identified by WGCNA, six correlated significantly with clinical and/or histomic features (Fig. 2). Two of the gene co-expression modules were negatively associated with baseline clinical, endoscopic, and histologic measures of severity, and positively associated with nuclei features (Otsu area, perimeter, and equivalent diameter) and the outcome measure of CSF remission. Intersection of the module genes with adult single-cell RNA-seq data demonstrated enrichment for enterocytes (SLC26A3) and extracellular matrix (IHH).

Figure 2: Module-trait relationship for selected histomic, histological, and phenotypic traits with outcome variables.

CONCLUSION

We developed a predictive model for UC disease course using histomic features from standard-of-care pre-treatment pathology images. Characterization of the underlying molecular basis of the histomic features is ongoing.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights)