More samples from multicentre data across different platforms improve the detection of cancer samples. (A, C) Confusion matrices of the RF model built from the integrated GSE16443 (100%) and GSE47862 (50%) data, evaluated on an independent validation cohort from GSE11545 (n = 20) (A) and on the remaining 50% of the GSE47862 data (n = 161) (C). (B, D) ROC curves of the RF model for the independent GSE11545 validation cohort (n = 20) (B) and for the remaining 50% of the GSE47862 data (n = 161) (D).