Arno R Bourgonje, Evaluating Discriminative Accuracy of Biomarkers in Relation to Binary Study Outcomes: First Validate, Then Celebrate?, Journal of Crohn's and Colitis, Volume 17, Issue 1, January 2023, Page 146, https://doi.org/10.1093/ecco-jcc/jjac117
With interest I read the study by Swaminathan et al., in which the performance of faecal myeloperoxidase [fMPO] as a biomarker of endoscopic disease activity was assessed in patients with inflammatory bowel disease [IBD].1 Additionally, its performance was compared with that of C-reactive protein [CRP] and faecal calprotectin [fCal], two established biomarkers of disease activity. The study concluded that fMPO is an accurate biomarker of endoscopic activity in IBD and predicted a more complicated IBD course during follow-up.
Patients with IBD suffer from fluctuating disease activity that is difficult to detect and monitor. To avoid repeated endoscopies, other means to monitor disease activity are widely used, eg, clinical indices, CRP, and fCal. However, these often demonstrate inconsistent associations and lack specificity,2 emphasising the need for additional biomarkers capable of reflecting intestinal inflammation. To assess the performance of fMPO, fCal, and CRP in predicting endoscopic disease activity, the authors leveraged receiver operating characteristic [ROC] statistics and ROC curves, the latter visualising performance of a biomarker/model across different classification thresholds.
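As an illustrative sketch only [with invented biomarker values, not the study's data], the area under the ROC curve [AUROC] summarising performance across all classification thresholds can be computed via the Mann-Whitney relation: it equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney relation: the probability that a
    randomly chosen positive case scores higher than a randomly
    chosen negative case (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example values: higher in endoscopically active disease [label 1]
scores = [12.1, 8.4, 15.3, 6.2, 9.9, 14.0, 5.1, 9.0]
labels = [1, 0, 1, 0, 0, 1, 0, 1]
print(auroc(scores, labels))  # 0.9375
```

This is equivalent to integrating the ROC curve itself; in practice established implementations [eg, `sklearn.metrics.roc_auc_score` or the R package pROC] would be used.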
Although ROC curves are commonly used to assess biomarker performance, this assessment can be optimistically biased if based on a single evaluation of the data.3 Therefore, testing the ability of a biomarker/model to generalise to new cases, ie, a sample [‘test sample’] independent of the sample used to fit the model [‘training sample’], is recommended. Using all cases from the original analysis, instead of this train/test approach, may result in overly optimistic estimates of biomarker performance. To overcome this issue, external validation of biomarker performance in independent cohorts is pivotal to determine generalisability. In addition, internal validation or cross-validation methods [eg, leave-pair-out cross-validation [LPO], k-fold cross-validation, or bootstrap resampling] can be used to generate more realistic estimates, capturing additional variance from which broader, more honest confidence intervals can be constructed. These methods split the original ‘training’ dataset into multiple subsets, which are then iteratively used as ‘training’ and ‘test’ subsets, either randomly or not, and the results are averaged to indicate how biomarker performance is affected by changes in the ‘training’ data.
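One of the resampling approaches mentioned above, the percentile bootstrap, can be sketched as follows [again with invented values; real analyses would use larger cohorts and established implementations]: cases are resampled with replacement many times, the AUROC is recomputed on each resample, and the central 95% of the resulting estimates forms the confidence interval.

```python
import random

def auroc(scores, labels):
    """AUROC via the Mann-Whitney relation (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI: resample cases with replacement,
    recompute the AUROC each time, take the central 1 - alpha range."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # resample must contain both classes
            estimates.append(auroc([scores[i] for i in idx], ys))
    estimates.sort()
    return (estimates[int(n_boot * alpha / 2)],
            estimates[int(n_boot * (1 - alpha / 2)) - 1])

# Invented values as above; the interval is wide because n is tiny,
# which is exactly the honesty such resampling is meant to convey
scores = [12.1, 8.4, 15.3, 6.2, 9.9, 14.0, 5.1, 9.0]
labels = [1, 0, 1, 0, 0, 1, 0, 1]
lo, hi = bootstrap_ci(scores, labels)
```

The width of the resulting interval makes explicit how unstable a point estimate of biomarker performance can be under changes in the underlying sample.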
Aside from cross-validation, there are also accepted methods to effectively compare correlated area under the ROC curve [AUROC] estimates, eg, DeLong’s test,4 and to quantify the maximised sensitivity/specificity balance at ‘optimal cut-off points’, eg, Youden’s J-statistic.5 Although the presented results do hint at these methods, the authors do not clearly state which criteria were used to compare fMPO with fCal and CRP, or to determine optimal cut-off points. Finally, standard errors or confidence intervals were lacking, which would have helped readers to judge biomarker performance more accurately.
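For completeness, Youden’s J-statistic can be sketched as below [illustrative only, with invented biomarker values]: each candidate threshold is scored as J = sensitivity + specificity − 1, and the threshold maximising J is reported as the optimal cut-off.

```python
def youden_cutoff(scores, labels):
    """Scan candidate thresholds [a positive call when score >= threshold]
    and return the cut-off maximising J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Invented example values: higher in endoscopically active disease [label 1]
scores = [12.1, 8.4, 15.3, 6.2, 9.9, 14.0, 5.1, 9.0]
labels = [1, 0, 1, 0, 0, 1, 0, 1]
print(youden_cutoff(scores, labels))  # (9.0, 0.75)
```

Reporting the criterion used [Youden’s J or otherwise], together with the sensitivity and specificity achieved at the chosen cut-off, lets readers judge whether the threshold suits their own clinical context.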
The efforts of Swaminathan et al. in identifying novel biomarkers of disease activity are important and can only be applauded. However, validation, whether performed internally, externally, or both, remains critical to precisely determine the clinical utility of biomarkers.
Funding
No funding was received for this work.
Conflict of Interest
ARB has no conflict of interest to declare.