Abstract

Background

Machine learning (ML) approaches using functional brain measures have been widely used to develop advanced diagnostic tools for schizophrenia. Although numerous ML studies have improved the accuracy of classifying patients with schizophrenia (PSZ) versus healthy individuals (HI), their utility as true diagnostic tools remains limited. Moreover, to our knowledge, ML has not been used to identify individuals with genetic liability for schizophrenia (i.e., biological relatives of PSZ [RSZ]). Toward a reliable and accurate tool for identifying endophenotypic biomarkers of schizophrenia, we conducted an ML study that combined high-density EEG, recorded during a cognitive task known to measure endophenotypic markers of cognitive deficits in schizophrenia, with source-level EEG functional connectivity analysis to improve the spatial, temporal, and frequency features of brain network dynamics.

Methods

We collected 64- or 128-channel EEG from 21 PSZ, 20 RSZ, and 30 HI while they performed a stimulus response-reversal task (SRRT), which is used to evaluate context processing, an executive process that guides adaptive behavior according to goals and stored contextual information. Source signals for 78 regions of interest (ROIs) covering the whole cortex were calculated using a weighted minimum norm algorithm and realistic head models obtained from individual structural MRIs. To measure source-level interregional functional connectivity, the phase locking value (PLV) based on the Hilbert transform was calculated for all 3003 possible pairs of the 78 ROIs in 4 frequency bands (theta, alpha, beta, and gamma), 4 time windows, and 3 task conditions. Principal component analysis (PCA) was used to reduce the dimensionality of the PLVs and to extract the brain networks involved in context processing. A penalized multinomial regression, bootstrapped 100 times, was then used to identify the most informative features. Finally, ML methods were applied to classify each of the three pairs of groups using the extracted source-level PLV features. To limit overfitting, we used two feature selection methods: unsupervised feature selection with adaptive structure learning (FSASL) and Fisher’s score. Classification accuracy was evaluated for each pair of groups using a support vector machine (SVM) classifier with leave-one-out cross-validation.
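As an illustration of the PLV step described above, the following is a minimal sketch in Python, not the authors' implementation. It assumes the source signals have already been band-pass filtered, that PLV is computed across trials at each time point and then averaged over a time window, and that the analytic signal is obtained with an FFT-based Hilbert transform. The function names (analytic_signal, plv_matrix, upper_triangle) and the synthetic 10 Hz signals are hypothetical and chosen only for this example.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal along the last axis (Hilbert transform)."""
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights, axis=-1)


def plv_matrix(signals):
    """PLV between every pair of ROIs.

    signals: array of shape (n_trials, n_rois, n_samples), assumed to be
    band-pass filtered source signals for one band, window, and condition.
    Returns an (n_rois, n_rois) matrix of PLVs averaged over time.
    """
    phase = np.angle(analytic_signal(signals))
    unit = np.exp(1j * phase)
    # Mean over trials of exp(i * (phase_a - phase_b)) at each time point.
    cross = np.einsum("kat,kbt->abt", unit, np.conj(unit)) / signals.shape[0]
    return np.abs(cross).mean(axis=-1)


def upper_triangle(matrix):
    """Vector of the unique ROI pairs (78 ROIs would give 3003 values)."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


# Synthetic demo: ROI a and ROI b keep a fixed phase lag on every trial,
# while ROI c has an unrelated random phase on each trial.
rng = np.random.default_rng(0)
n_trials, n_samples, fs = 40, 256, 256.0
t = np.arange(n_samples) / fs
phi = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
psi = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
roi_a = np.cos(2 * np.pi * 10 * t + phi)
roi_b = np.cos(2 * np.pi * 10 * t + phi + 0.7)
roi_c = np.cos(2 * np.pi * 10 * t + psi)
signals = np.stack([roi_a, roi_b, roi_c], axis=1)

plv = plv_matrix(signals)
pair_values = upper_triangle(plv)
```

In this toy setting the phase-locked pair yields a PLV near 1 and the unrelated pair a much lower value. Applying upper_triangle to a 78-by-78 matrix would give the 3003 pairwise values that enter the PCA for each band, time window, and task condition.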

Results

Spatial-domain PCA of the 3003 PLVs extracted 7 functional connectivity networks for each combination of the 4 frequency bands, 4 time windows, and 3 task conditions, resulting in 336 PLV PC features. The bootstrapped penalized multinomial regression further narrowed these to 42 PLV PC features that were informative for classifying the three groups in more than 50% of the iterations. Compared with Fisher’s score, FSASL-based feature selection improved classification accuracy by about 2% on average. The best classification accuracies were: 1) PSZ vs. HI: 94.12%; 2) PSZ vs. RSZ: 75.61%; 3) RSZ vs. HI: 88%.
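The counts reported above can be checked with simple arithmetic. The short Python sketch below recomputes the number of ROI pairs and PLV PC features, and then searches for the number of correctly classified participants that would reproduce each reported accuracy. That last step assumes leave-one-out accuracy is the fraction of all participants in a pair of groups who were classified correctly; the resulting counts are an inference from the reported percentages, not figures stated in the abstract. All variable and function names are hypothetical.

```python
from math import comb

# Number of unique ROI pairs among 78 cortical ROIs.
n_pairs = comb(78, 2)  # 3003

# 7 networks x 4 frequency bands x 4 time windows x 3 task conditions.
n_pc_features = 7 * 4 * 4 * 3  # 336

group_sizes = {"PSZ": 21, "RSZ": 20, "HI": 30}
reported_accuracy = {
    ("PSZ", "HI"): 94.12,
    ("PSZ", "RSZ"): 75.61,
    ("RSZ", "HI"): 88.0,
}


def implied_correct(n_participants, accuracy_pct):
    """Counts k for which 100 * k / n rounds to the reported percentage."""
    return [
        k
        for k in range(n_participants + 1)
        if round(100 * k / n_participants, 2) == accuracy_pct
    ]


# Assumption: each accuracy is (correct held-out participants) / (pair size).
implied = {
    pair: implied_correct(group_sizes[pair[0]] + group_sizes[pair[1]], acc)
    for pair, acc in reported_accuracy.items()
}
```

Under that assumption, each reported percentage matches exactly one whole number of correctly classified participants out of 51, 41, and 50, respectively, which is consistent with the stated group sizes.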

Discussion

The cortical source-level features of the context-processing brain network substantially improved PSZ vs. HI classification accuracy compared with previous studies. The new approach, which parsimoniously extracted functional brain network features across the entire cortex, might also have contributed to this improvement by eliminating uninformative features. The strength of task-based functional brain features will also be discussed in comparison with a similar ML approach using resting-state brain features. Future studies employing multiple task-based and resting-state brain features might further improve group classification accuracy and the identification of endophenotypic biomarkers.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)