Abstract

The emergence of multidrug-resistant bacteria is a critical global crisis that poses a serious threat to public health, particularly with the rise of multidrug-resistant Staphylococcus aureus. Accurate and early assessment of drug resistance is essential for timely, appropriate treatment and for preventing the transmission of these deadly pathogens. This study aims to develop a novel risk assessment framework for S. aureus that can accurately determine resistance to multiple antibiotics. The comprehensive 7-year study involved >20 000 isolates with susceptibility testing profiles for six antibiotics. By combining mass spectrometry and machine learning, the study predicted susceptibility to four different antibiotics with high accuracy. To validate the models, we tested them externally on an independent cohort and achieved an area under the receiver operating characteristic curve of 0.94, 0.90, 0.86 and 0.91, and an area under the precision–recall curve of 0.93, 0.87, 0.87 and 0.81, for oxacillin, clindamycin, erythromycin and trimethoprim-sulfamethoxazole, respectively. In addition, the framework evaluated the level of multidrug resistance of the isolates by using the predicted drug resistance probabilities, interpreting them in the context of a multidrug resistance risk score and analyzing the performance contribution of different sample groups. The results of this study provide an efficient method for early antibiotic decision-making and a better understanding of the multidrug resistance risk of S. aureus.

INTRODUCTION

Multidrug-resistant bacteria have become a global crisis that requires urgent attention. The overuse of antibiotics has led to the rise of these pathogens, which makes it harder to improve clinical cure rates and increases the need for effective prevention. Among these pathogens, methicillin-resistant S. aureus (MRSA) is the most notorious [1, 2]. MRSA is resistant to multiple classes of antibiotics and poses a significant threat to global public health [3–5]. MRSA infections can be devastating, resulting in prolonged hospital stays, increased healthcare costs and mortality. Timely and accurate detection of drug resistance in patients is therefore critical to providing appropriate treatment and reducing the transmission of multidrug-resistant bacteria.

Multidrug-resistant Gram-positive bacteria, such as MRSA, are resistant to multiple classes of antibiotics. S. aureus can acquire resistance to any antibiotic [6], which has driven the development of accurate and rapid antibiotic susceptibility testing (AST) for this pathogen [7–9]. The current gold-standard AST can take 48–72 h to provide results [10, 11]. Newer approaches can detect drug-resistant S. aureus with a faster turnaround time [12–14]. In addition, clinical specimens can be used directly for susceptibility testing, which simplifies specimen preparation [15–17]. Nevertheless, molecular detection methods that rely on gene targets can lead to false-negative results [18]. Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) enables rapid pathogen detection within 2 h from subcultured colonies [19]. Previous research has primarily focused on differentiating MRSA from methicillin-susceptible S. aureus (MSSA) [20–24]; a rapid diagnostic tool that accurately detects resistance to multiple antibiotics and guides appropriate prescribing is still urgently needed to combat this serious public health threat.

The advent of machine learning (ML) has revolutionized the field of clinical data analysis, making it possible to analyze vast amounts of data in a highly accurate and efficient manner. The potential of this technique is particularly evident in the wealth of MS spectra data available from clinical samples of patients infected with S. aureus, which can be used to develop highly accurate predictive models for MRSA diagnosis. For example, Yu et al. [25] developed a light gradient-boosting machine-based algorithm to identify differential methicillin-resistant S. aureus patterns in MS data collected from clinical isolates. In another study, Johnson et al. [26] applied AMRQuest ML to predict methicillin resistance of S. aureus in MS-based antimicrobial susceptibility testing. However, existing predictive models for S. aureus antibiotic resistance are limited in their ability to accurately predict susceptibility to a variety of antibiotics, which is a significant concern in the context of multidrug-resistant S. aureus. A more comprehensive assessment of drug resistance in multidrug-resistant S. aureus is urgently needed to provide effective therapies and improve patient outcomes, especially in light of the rapidly escalating global epidemic of antibiotic resistance.

To address this critical need, the primary objective of this study is to develop a risk assessment framework for multidrug-resistant S. aureus that can accurately predict resistance to a variety of antibiotics. Through the use of a 7-year longitudinal study and state-of-the-art ML methods, we aim to construct predictive models that will guide clinicians in the treatment of infected patients within the next 24 h. Our ML-personalized multidrug resistance risk score will provide immediate suggestions for prescribing the appropriate antibiotics and reducing overall antibiotic use in the long term, contributing to the global campaign to curb the epidemic of antibiotic resistance. Through the development of a more comprehensive and accurate approach to drug resistance assessment, this study has the potential to transform the field of clinical data analysis and revolutionize patient care for multidrug-resistant S. aureus.

MATERIALS AND METHODS

Data collection

The aim of this study was to develop reliable prediction models to assess the risk of multidrug-resistant S. aureus. To achieve this goal, data were collected consecutively from two branches of Chang Gung Memorial Hospital (CGMH), Linkou and Kaohsiung. The discovery population, which served as the training dataset, was collected in the Linkou branch from 2013 to 2019, while the independent replication, which served as the independent test dataset, was collected in the Kaohsiung branch from 2015 to 2017. In both branches, the clinical microbiology laboratory of CGMH performed microbiological culture and AST. The bacterial cultures obtained from patients infected with S. aureus were collected from various sites depending on the suspected infection, such as wounds, respiratory tract, blood, urine, sterile body fluids or other body parts.

The cultured bacterial isolates were used for both AST and MALDI-TOF MS. AST was performed by disk diffusion and broth microdilution to determine susceptibility or resistance to penicillin (PEN), oxacillin (OXA), erythromycin (ERY), clindamycin (CLI), trimethoprim-sulfamethoxazole (SXT) and fusidic acid (FA). Our ground truth is derived from the gold-standard AST, which is considered to be almost 100% accurate. The study was designed as a retrospective two-cohort study, conducted in both medical centers, to develop and validate a clinical prediction model and to evaluate the impact of mass spectrometry-based AST on multidrug-resistant S. aureus. It focused only on the relationship between the MS spectrum and AST. No diagnosis or treatment was involved, and the Institutional Review Board of Chang Gung Medical Foundation approved a waiver of informed consent (No. 202100008B1).

A state-of-the-art Bruker Microflex LT MALDI-TOF system was used to analyze the microbial strains, as previously described [8]. The instrument was operated according to the manufacturer’s instructions to ensure the highest standards of precision and accuracy. To prepare samples for analysis, freshly cultured isolates were delicately smeared by the operator onto a 96-well MALDI steel target plate. In the next step, the sample was extracted with formic acid (1 μl, 70%) on a thin layer. The target plate was then dried at 25°C before being covered with a matrix solution consisting of a solvent mixture (1% α-cyano-4-hydroxycinnamic acid in 50% acetonitrile containing 2.5% trifluoroacetic acid). After the samples were dried at room temperature, the target plate was loaded into the analyzer for analysis. The analyzer was operated in the linear ionization mode with the acceleration voltage set at 20 kV, the nitrogen laser frequency at 60 Hz and 240 laser shots. The spectra were recorded with the mass-to-charge ratio (m/z) ranging from m/z 2000 to 20 000. Peak patterns were analyzed using R and the R package MALDIquant [27]. A three-step procedure was used to process the raw spectra. First, a baseline correction (top hat filter) was applied. Then, peaks were identified by calculating the median absolute deviation (MAD) with a half-window size of 10. Finally, peaks with a signal-to-noise ratio (SNR) of at least 5 were collected for further analysis. After processing the raw MS spectra, a peak list was generated containing m/z values and intensities that represented the mass fingerprint of a given sample.

MS data preprocessing

The mass fingerprint of a sample consists of the ion m/z and the raw intensity values of all peaks. To analyze these data, preprocessing steps were applied to convert the original ion intensities into relative pseudo-ion abundances. This three-step process was modified from a previously described method [8]. First, ion peaks with an intensity of 100 or less were removed from the analysis to ensure that only high-quality data were used. Second, the m/z axis of each sample with a range of 2000 to 20 000 was divided into equal intervals with a bin size of 20, generating 900 vectors that were designated as pseudo-ions for further analysis [8, 21]. Third, normalization was applied within each interval vector by calculating the l1-norm of the interval vector divided by its l0-norm and then adding its l2-norm to obtain the normalized intensity for the pseudo-ion.

The following definition of pj corresponds to the 900 interval vectors, while $\overrightarrow{p_j}$ with j = 1, … , 900 are the normalized intensities of the 900 pseudo-ions. The symbols ‖•‖0, ‖•‖1 and ‖•‖2 represent the l0-norm, l1-norm and l2-norm, respectively. Finally, all samples underwent the same preprocessing steps to obtain their normalized pseudo-ion intensities.

A pseudo-ion matrix table was then created that included all cohorts and their corresponding preprocessed mass spectrometry data across the entire study. For each antibiotic, a drug-resistant or drug-susceptible group was defined as samples identified by the susceptibility result, that is, those labeled susceptible or not. This procedure allowed the analysis of the correlation between the normalized pseudo-ion intensities and the susceptibility of the samples to different antibiotics.

MS produces data in the form of peak intensities of the m/z, but the number of peaks varies between different spectra. A fixed-length feature representation is needed to apply ML methods. Therefore, a data preprocessing step was used in which equal-sized intervals (bin size 20) were used to generate 900 intensities of pseudo-ions for each m/z interval. If some m/z intervals had no peaks, the corresponding intensities were set to zero. The resulting 900 features were fed into the full models for the four drug resistance prediction classifiers.
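The binning step described above can be sketched in plain Python. This is a minimal illustration, not the study's implementation: the function names are hypothetical, the bin boundaries assume that pseudo-ion j covers the half-open interval (2000 + 20(j − 1), 2000 + 20j], and the per-bin value follows one literal reading of the normalization text (l1-norm divided by l0-norm, plus l2-norm), which is an assumption.

```python
import math

MZ_MIN, MZ_MAX, BIN_SIZE = 2000, 20000, 20
N_BINS = (MZ_MAX - MZ_MIN) // BIN_SIZE  # 900 pseudo-ions


def bin_index(mz):
    """Return the 0-based pseudo-ion index for an m/z value, or None if out of range."""
    if not (MZ_MIN <= mz <= MZ_MAX):
        return None
    # Assumed boundary convention: bin j (1-based) covers (2000 + 20*(j-1), 2000 + 20*j].
    j = math.ceil((mz - MZ_MIN) / BIN_SIZE)
    return max(j, 1) - 1  # an m/z of exactly 2000 is placed in the first bin


def pseudo_ion_vector(peaks, min_intensity=100):
    """Convert a peak list [(mz, intensity), ...] into 900 pseudo-ion intensities."""
    bins = [[] for _ in range(N_BINS)]
    for mz, intensity in peaks:
        if intensity <= min_intensity:
            continue  # drop peaks with an intensity of 100 or less
        idx = bin_index(mz)
        if idx is not None:
            bins[idx].append(intensity)

    features = []
    for values in bins:
        if not values:
            features.append(0.0)  # intervals without peaks are set to zero
            continue
        l0 = len(values)                             # number of non-zero entries
        l1 = sum(abs(v) for v in values)             # sum of absolute intensities
        l2 = math.sqrt(sum(v * v for v in values))   # Euclidean norm
        # Assumed reading of the normalization: l1 / l0 + l2.
        features.append(l1 / l0 + l2)
    return features
```

Whatever the exact normalization, the key property is that every spectrum, regardless of how many peaks it contains, maps to a vector of the same length, which is what the classifiers require.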

Development of antibiotics prediction models

To identify resistance to the different antibiotics, classifiers were developed using the extreme gradient boosting (XGBoost) algorithm [28] with the tree booster. XGBoost handles sparse data well and was implemented using the Python module xgboost. The booster was set to gbtree, the objective to binary:logistic and the evaluation metric to area under the receiver operating characteristic curve (AUROC). However, the ratios of PEN-resistant and FA-susceptible samples were extremely high in both the discovery and replication populations, as shown in Supplementary Table S2. Therefore, we excluded drug resistance prediction for PEN and FA. For the other four antibiotics, training involved four parallel runs, one for each setting in which the models could be used in practice.

The XGBoost classifier relies on its optimized parameters, which include the number of iterations (nrounds), the learning rate (eta) and the maximum depth of a tree (max_depth). We performed a grid search over the following ranges: nrounds from 40 to 200 in steps of 10; eta of 0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.25, 0.3 and 0.5; and max_depth of 2, 4, 6, 8, 10 and 12. The hyper-parameters were chosen by a cross-validated grid search, and the chosen settings are shown in Supplementary Table S3. To evaluate the performance of the XGBoost classifiers, we trained them on part of the discovery population of the Linkou cohort and applied them to the remaining data to evaluate their ability to classify new data. The training portion comprised 70% of the discovery population. The purpose of cross-validation was not to build models, but to assess the stability of model performance. We performed 5-fold cross-validation on the Linkou dataset, dividing the dataset into five equal parts, or folds. The model was trained and evaluated five times, each time using a different fold as the validation set and the remaining folds as the training set. For each fold, we used the selected final hyper-parameters and calculated the average area under the curve (AUC) for both the validation set and the training set. Error bars indicate the variability in the AUC values across the five folds (Supplementary Figure S6). These plots provide insight into the robustness and generalization potential of the models on the Linkou dataset under varying training and validation conditions. Finally, we built the final models on the entire discovery population using the selected hyper-parameters.
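To make the size of this search concrete, the sketch below enumerates the hyper-parameter grid stated above and builds 5-fold train/validation index splits using only the standard library. The helper `k_fold_indices` is hypothetical and the model fitting itself is deliberately omitted; the parameter names follow the text (nrounds, eta, max_depth) and may be spelled differently in a given xgboost interface.

```python
from itertools import product
import random

# Search ranges as stated in the text.
nrounds_grid = list(range(40, 201, 10))  # 40, 50, ..., 200
eta_grid = [0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.25, 0.3, 0.5]
max_depth_grid = [2, 4, 6, 8, 10, 12]

# Every combination that a full grid search would evaluate.
param_grid = [
    {"nrounds": n, "eta": e, "max_depth": d}
    for n, e, d in product(nrounds_grid, eta_grid, max_depth_grid)
]


def k_fold_indices(n_samples, k=5, seed=0):
    """Hypothetical helper: shuffle sample indices and return k (train, validation) splits."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    splits = []
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits
```

With 17 values of nrounds, 9 of eta and 6 of max_depth, the grid contains 918 candidate settings per antibiotic, each of which would be scored on every fold.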

External validation with data from Kaohsiung

To determine the effectiveness of the constructed classifiers for predicting drug resistance, we used an independent patient cohort from Kaohsiung for external validation. We followed the same data preparation and preprocessing steps as for the Linkou discovery population to ensure consistency. As in the discovery population, the resistance rates for PEN and FA were highly disproportionate, so these two drugs were not evaluated. For the remaining four drugs, namely OXA, ERY, CLI and SXT, we applied the trained XGBoost classification models to the Kaohsiung data, using the classifier parameters and thresholds inferred from the training set. We then evaluated the accuracy and discriminative power of the model for each drug resistance label using AUROC and area under the precision–recall curve (AUPRC) [29].

In evaluating the developed classifier, we focused on its accuracy and discriminative power. To visualize the classification ability of our binary classifier system as its prediction threshold varied from 1 to 0, we used a receiver operating characteristic curve (ROC curve). The ROC curve is constructed by plotting the true-positive rate (sensitivity) against the false-positive rate (1-specificity). The AUROC is a metric that ranges between 0 and 1, with a random selection resulting in 0.5 and an excellent classifier resulting in 1.0. By using the AUROC, we were able to determine the diagnostic accuracy of our classification models, which ultimately helps us identify the best classifier to use in practice.
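As an illustration of how these quantities are obtained, the following sketch computes ROC points and the AUROC in plain Python. It is not the evaluation code used in the study; the function names are hypothetical. The AUROC is computed through its rank interpretation, that is, the probability that a randomly chosen resistant sample receives a higher score than a randomly chosen susceptible one, with ties counted as one half.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, true-positive rate) pairs as the threshold is lowered."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auroc(labels, scores):
    """AUROC via the rank-sum statistic, using average ranks for tied scores."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required to compute the AUROC")
    pairs = sorted(zip(scores, labels))
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1  # pairs[i:j] share the same score
        average_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += average_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

For labels [0, 0, 1, 1] and scores [0.1, 0.4, 0.35, 0.8], three of the four resistant–susceptible pairs are ordered correctly, so `auroc` returns 0.75.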

Investigations of feature importance

To gain a deeper understanding of the importance of the features and how the individual features are related to the models, we used Shapley additive explanation (SHAP) values [30]. This approach allowed us to measure how each variable interacts with other variables and to assess the contribution of each feature to the model output. The approach comes from game theory and decomposes the prediction for each sample into a sum of feature contributions.

To illustrate this process, let x be the input pseudo-ion vector for a given sample, and let f(x) be the predicted result generated by the classifier f(·). The Shapley analysis can be expressed by the following equation:

$$f(x) = \phi_0 + \sum_{l=1}^{L} \phi_l$$

Here, ϕ0 = E[f(x)] represents the base Shapley score generated by calculating the expectation of the model output over the entire training set. The Shapley scores associated with the L pseudo-ion features are represented by ϕl = ϕl(f, x), where l is 1, 2, ⋯, L. These values can be used to assess the impact of each feature on the predicted outcome f(x). By using the SHAP values, we were able to gain a better understanding of the importance of the features and the contribution of each feature to the overall model performance. This information is critical for improving the accuracy and effectiveness of our models and can help us identify key features that are essential for accurate predictions.

To develop predictive models for drug resistance and drug susceptibility, the mean absolute SHAP values of the discovery population were calculated. The 900 features generated from the MS spectra of the discovery population were ranked according to their mean absolute SHAP value [31], which identified the features that were most influential in the prediction models. For each antibiotic, features with a mean absolute SHAP value greater than 0.1 were selected for the compact model. Two XGBoost classifiers were therefore developed for each drug, a full model and a compact model, giving four compact classifiers built on the discovery population from the SHAP-selected features. By selecting the most important features in this way, the study was able to develop accurate prediction models that could be used to identify drug resistance in S. aureus infections.
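The selection rule can be sketched as follows, assuming the SHAP values are already available as a matrix with one row per sample and one column per pseudo-ion. The function names and the toy numbers are hypothetical; only the mean absolute value criterion and the 0.1 threshold come from the text.

```python
def mean_abs_shap(shap_matrix):
    """Mean absolute SHAP value of each feature across all samples."""
    n_samples = len(shap_matrix)
    n_features = len(shap_matrix[0])
    return [
        sum(abs(row[l]) for row in shap_matrix) / n_samples
        for l in range(n_features)
    ]


def select_compact_features(shap_matrix, threshold=0.1):
    """Indices of features whose mean |SHAP| exceeds the threshold, most important first."""
    importance = mean_abs_shap(shap_matrix)
    selected = [l for l, value in enumerate(importance) if value > threshold]
    return sorted(selected, key=lambda l: importance[l], reverse=True)


# Toy example: 3 samples, 3 features (values are illustrative only).
toy_shap = [
    [0.30, -0.02, 0.10],
    [-0.50, 0.04, -0.14],
    [0.40, 0.00, 0.12],
]
compact = select_compact_features(toy_shap)  # features 0 and 2 pass the threshold
```

Taking the absolute value before averaging matters: a feature that pushes some samples toward resistance and others toward susceptibility would otherwise average out to near zero despite being influential.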

Development of risk assessment for multidrug-resistant S. aureus

In this study, an approach to multidrug-resistant S. aureus risk assessment was developed. This approach used four XGBoost classifiers, each trained on a different drug, to estimate the probability of resistance to each drug for the samples in the training and test sets. Multiple linear regression (MLR) was then used to determine the level of multidrug resistance from the estimated probabilities for each drug and the number of drugs to which each sample was resistant. MLR is a statistical technique that uses multiple explanatory variables to predict the outcome of a response variable.

For each sample i, the equation

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4}$$

was applied, where $y_i$ represents the number of drugs to which the sample was resistant, while $x_{i1}$, $x_{i2}$, $x_{i3}$ and $x_{i4}$ are the resistance probabilities predicted by the XGBoost classifiers for OXA, CLI, ERY and SXT, respectively. Regression coefficients $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ and intercept $\beta_0$ were obtained by regression analysis. Using the calculated regression coefficients, the predictor of the regression model was then transformed into a more user-friendly MDR score. The MDR score for each sample was calculated using the formula

$$\mathrm{MDR\ score}_i = w_0 + w_1 x_{i1} + w_2 x_{i2} + w_3 x_{i3} + w_4 x_{i4}$$

where $w_0$, $w_1$, $w_2$, $w_3$ and $w_4$ are the weights assigned to the predicted probability of each antibiotic. This risk assessment score provides healthcare providers with information for antibiotic therapy decision-making, enabling patients to be prescribed tailored treatment regimens. The XGBoost classifiers provide the assessment of drug susceptibility, while the MLR-derived MDR score translates these predictions into an easily interpretable format. Violin plots generated using the ggpubr library in R visualize the distribution of the MDR score across samples, helping to identify potential patterns and trends in drug resistance.
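The regression step can be sketched with ordinary least squares in plain Python. This is an illustrative sketch, not the study's code: the function names and the toy data are hypothetical, and because the mapping from regression coefficients to the final weights is not spelled out here, `mdr_score` simply takes the weights as an argument.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations; returns [b0, b1, ..., bp]."""
    rows = [[1.0] + list(x) for x in X]  # prepend a column of ones for the intercept
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[a] * r[b] for r in rows) for b in range(p)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        if abs(pivot_value) < 1e-12:
            raise ValueError("design matrix is singular")
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(p):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]


def predict(beta, x):
    """Linear predictor b0 + b1*x1 + ... + bp*xp."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def mdr_score(weights, probabilities):
    """Weighted sum w0 + w1*x1 + ... + w4*x4 of the predicted resistance probabilities."""
    return predict(weights, probabilities)


# Toy data (illustrative only): predicted resistance probabilities for
# OXA, CLI, ERY and SXT, and the number of drugs each isolate resists.
toy_probabilities = [
    [0.90, 0.80, 0.70, 0.10],
    [0.10, 0.20, 0.10, 0.05],
    [0.80, 0.10, 0.90, 0.30],
    [0.20, 0.90, 0.40, 0.80],
    [0.50, 0.50, 0.20, 0.60],
    [0.70, 0.30, 0.60, 0.90],
    [0.05, 0.60, 0.95, 0.20],
]
toy_counts = [3, 0, 2, 3, 2, 3, 2]

beta = fit_linear_regression(toy_probabilities, toy_counts)
scores = [mdr_score(beta, x) for x in toy_probabilities]  # weights assumed equal to beta
```

Using the regression coefficients directly as weights is the simplest choice; any rescaling applied to make the score more user-friendly would change the numbers but not the ranking of isolates, provided the rescaling is monotonic.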

RESULTS

Figure 1 illustrates our workflow for developing a risk assessment framework for multidrug-resistant Staphylococcus aureus. First, we collected MS spectra and AST results for the clinical isolates, as shown in Figure 1A. Our analysis of the data revealed that penicillin (PEN) and fusidic acid (FA) were unsuitable for model construction due to their disproportionate rates of resistance and strong drug-specific associations, as shown in Supplementary Tables S1 and S2. The MS spectra were then preprocessed by binning normalization and converted into peak lists, which were used to generate features for the prediction models, as shown in Figure 1B. The resulting features included 900 intensities of pseudo-ions for each sample. We then used a multi-model approach to construct four separate prediction models to identify resistance to oxacillin (OXA), erythromycin (ERY), clindamycin (CLI) and trimethoprim-sulfamethoxazole (SXT), as shown in Figure 1C. This allowed us to make more targeted and accurate predictions for each antibiotic. Finally, we determined the risk of multidrug-resistant S. aureus by integrating the drug resistance probabilities derived from the individual prediction models through linear regression, as shown in Figure 1D. Overall, our workflow represents a significant advance in the fight against multidrug-resistant bacteria and provides an efficient method for early antibiotic decision-making.

Overview of the model development process. (A) Data collection and preparation. Clinical specimens were collected from hospitals and sent to the microbiology laboratory where they were cultured, and AST was performed using conventional methods. Simultaneously, MALDI-TOF MS spectra were generated for each sample. (B) Data preprocessing. MS data were normalized using an intensity norm averaging algorithm, and a 20 Da grid was used for m/z values. (C) Drug resistance prediction model development. All features, including drug resistance information representing the susceptibility of each sample to different antibiotics, were used to construct complete models for the binary classification problem. The features were annotated according to their absolute mean Shapley additive explanations value, and four feature sets were formed, resulting in four compact models. The classifiers were trained on the discovery population using the four feature label sets, and an XGBoost model was selected as the final classifier. The model was then evaluated on the replication population to validate its performance. (D) Multidrug resistance risk assessment score. Note: AST: antibiotic susceptibility testing; MS: mass spectrometry; m/z: mass-to-charge ratio; Da: dalton.
Figure 1


Overview of the data characteristics

Figure 2 provides an overview of the cohorts in this study. As shown in Figure 2A, a total of 29 685 patients were included in the Linkou cohort between 2013 and 2019. After excluding 215 patients with negative screening results for S. aureus, 83 patients with missing covariates and 2535 patients with missing AST results, a total of 26 852 patients remained for analysis. For external validation, 5303 patients from the Kaohsiung cohort were assigned to the replication population. After excluding 76 specimens with negative screening results for S. aureus, 3 patients with missing covariates and 269 patients with missing AST results, 4955 patients remained as the independent test set. To ensure the independence of the populations and the accuracy of resistance rate comparisons, we removed any duplicate patient records, keeping only the data from the first occurrence of each patient.

Data and cohort characteristics. (A) Cohort selection. S. aureus-positive specimens were initially screened. Patients with missing age, sex covariates, and AST results were then excluded. Finally, the cohort was divided into training and validation sets as described in the Methods section. (B) Basic characteristics of cohort data. Pie charts show the distribution of data by sex and sample type. (C) Age distribution for both cohorts. (D) Multidrug-resistant S. aureus isolates in the two cohorts. The blue horizontal bar shows the number of specimens susceptible to PEN, ERY, OXA, CLI, SXT, and FA, respectively. The vertical bar shows the number of samples that are not susceptible to different combinations of these six antibiotics. (E) List of antibiotics analyzed in the study. (F) Venn diagram of the number of resistant samples under four conditions, excluding PEN and FA. Note: AST indicates antibiotic susceptibility testing; MS, mass spectrometry; m/z, mass-to-charge ratio; Da, dalton. PEN indicates penicillin; ERY indicates erythromycin; OXA indicates oxacillin; CLI indicates clindamycin; SXT indicates trimethoprim-sulfamethoxazole; and FA indicates fusidic acid.
Figure 2


Important characteristics of the patients and clinical specimens for the Linkou and Kaohsiung cohorts are shown in Figure 2B and C. Nearly 60% of the infected patients in both cohorts were male, and any specimen type was allowed in the experimental design. As shown in Figure 2B, the most common type of S. aureus-positive clinical specimen was wound. Figure 2C shows the age density distributions of the two cohorts. Both density plots peak between 60 and 70 years and close to 0 years, indicating that these age groups are more susceptible to S. aureus infections. These patient and specimen characteristics should be considered when analyzing and interpreting the results of the proposed risk assessment framework for multidrug-resistant S. aureus.

Overview of antibiotic susceptibility and resistance patterns

The main objective of this study is to investigate the resistance of S. aureus to six antibiotics commonly prescribed in empirical treatment. The antibiotic categories and their target sites vary and are presented in Supplementary Tables S1 and S2. Figure 2D illustrates the distribution of the isolates' susceptibility to different combinations of the six antibiotics. In addition, Figure 2E provides details on the class and molecular target of the six antibiotics used in this study. In both the Linkou and Kaohsiung cohorts, the most common pattern of drug resistance was resistance only to PEN and susceptibility to four other antibiotics (OXA, CLI, ERY and SXT). It is noteworthy that in the large-scale comparison of drug resistance between resistant cases and susceptible controls, multidrug-resistant isolates were identified in both cohorts. This suggests, as shown in Figure 2D and F, that multidrug resistance in S. aureus has become a very common phenomenon and a potential obstacle to improving clinical treatment efficacy.

Individual antibiotic resistance rates were found to vary between the two cohorts, as shown in Supplementary Table S1. PEN was found to have the highest resistance rate of 93.7% in the Linkou cohort and 93.1% in the Kaohsiung cohort, while FA had the lowest resistance rate of 8.3% in the Linkou cohort and 5.4% in the Kaohsiung cohort. This suggests that the other four drugs deserve further attention in our research. However, there is limited information on the relationship between drug-resistant S. aureus and the four antibiotics. Overall, understanding the antibiotic resistance profiles of S. aureus is critical to developing appropriate treatment strategies that can help combat the increasing prevalence of multidrug-resistant strains.

Investigations for the prediction models and features

To develop accurate models for predicting drug resistance or susceptibility, the generated features were evaluated and selected based on their mean absolute SHAP values. The compact models were formed using SHAP value–based features, and four XGBoost classifiers were built on the discovery population. As shown in Supplementary Figure S1, two XGBoost classifiers were built for each drug, one using the full set of 900 features and the other using the compact set of 24, 18, 22 and 9 features. Figure 3A shows a heat map of SHAP values from m/z 2000 to 10 000, listing the common features shared by the four compact models. Interestingly, we observed that the SHAP value profiles of five common pseudo-ions from the models were more similar in relevant m/z intervals than in other ranges, suggesting similar patterns across different antibiotics. In addition, the analysis of other features of different drug models in Figure 3B showed that some features appeared to be unique to some drugs, indicating potentially different diffusion characteristics of these antibiotics.

Figure 3. Visualization of model features. (A) The mean absolute SHAP values of the features shared by the four compact models were used to create a heat map. The features were converted from pseudo-ions to their corresponding m/z values for better visualization. (B) The pseudo ions, m/z ranges, and mean absolute SHAP values of the feature sets were summarized. SHAP, SHapley Additive exPlanations, is a method for explaining the output of ML models.

To further illustrate the importance of features in predicting drug resistance, summary plots were generated for the entire feature set of each compact model, as shown in Figure 4C and Supplementary Figures S1C, S2C and S3C. These plots show the relationship between the original values of the features and their corresponding importance. In addition, dependence plots were drawn to capture the association of drug resistance risk with any particular feature. Figure 4D and Supplementary Figures S1D, S2D and S3D show the relationship between the feature value and the SHAP value for the top-ranked features of the four drug models. Pseudo-ion 21 (m/z 2401–2420) and pseudo-ion 230 (m/z 6581–6600) were identified as risk factors contributing to drug resistance. Conversely, higher peak intensities at pseudo-ion 228 (m/z 6541–6560) decrease the risk of drug resistance. As expected, the risk of drug resistance increases as the intensities of certain m/z intervals (2441–2460, 3021–3040) increase [8, 21]. These findings highlight the importance of identifying and evaluating the features that contribute most to drug resistance and susceptibility prediction models.
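The pseudo-ion indices and m/z intervals quoted above are consistent with a fixed-width binning of the spectrum. The sketch below assumes bins of width 20 starting just above m/z 2000; both constants are inferred from the three reported pairs rather than stated in this section, and `pseudo_ion_to_mz` is a hypothetical helper name.

```python
from typing import Tuple

MZ_START = 2000   # assumed lower edge of the binned range (inferred, not stated)
BIN_WIDTH = 20    # assumed bin width in m/z units (inferred, not stated)


def pseudo_ion_to_mz(index: int) -> Tuple[int, int]:
    """Map a 1-based pseudo-ion index to its inclusive m/z interval."""
    if index < 1:
        raise ValueError("pseudo-ion indices start at 1")
    lower = MZ_START + (index - 1) * BIN_WIDTH + 1
    upper = MZ_START + index * BIN_WIDTH
    return lower, upper


# The three pseudo-ions named in the text reproduce the reported intervals.
for ion in (21, 228, 230):
    print(ion, pseudo_ion_to_mz(ion))
```

Under this assumed mapping, the 900 features of the full models would cover m/z 2001 to 20000.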

Figure 4. Examining model features and performance. (A) The ROC curve for the binary classification task of OXA resistance was plotted to compare the full model (containing 900 features) and the compact model (containing 24 features). (B) The PRC was plotted for the full and compact classification models. (C) A summary plot of the SHAP values for the compact model feature set was created. The features were ranked according to their importance in making the final prediction. Each point on the plot represents the SHAP value of a feature for a particular sample, with predictor values ranging from high to low. (D) Dependence plots for the top three features with the largest mean absolute SHAP value were generated to show peak intensities versus the feature’s SHAP value in the prediction model. OXA, oxacillin; CLI, clindamycin; ERY, erythromycin; SXT, trimethoprim-sulfamethoxazole. AUC is the area under the ROC curve, and AP is the area under the PRC.

Performance of external validation for the prediction models

To validate the accuracy of the prediction models developed based on the Linkou data, an external validation dataset from the Kaohsiung population was used. The test results indicate that the full and compact models perform well, with AUROC values ranging from 0.842 to 0.936 for OXA, CLI, ERY and SXT, as shown in Figure 4A and Supplementary Figures S1A, S2A and S3A. In addition, precision–recall curves (PRCs) were used to assess the performance of the models for each antibiotic, as shown in Figure 4B and Supplementary Figures S1B, S2B and S3B. AUPRCs ranged from 0.795 to 0.928 for the full and compact models. These results suggest that our models can accurately classify samples as either resistant or susceptible to antibiotics.
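For readers who want to see what the two metrics measure, the following is a minimal pure-Python sketch on invented labels and scores. It is not the evaluation code used in the study. AUROC is computed as the fraction of resistant/susceptible pairs ranked correctly, and the area under the PRC is approximated by the step-wise average precision, which is one common estimator among several.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a resistant sample (1) outscores a susceptible one (0); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve: sum of (recall gain x precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    order = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(order):
        # Consume every sample tied at the current score as one threshold step.
        j = i
        while j < len(order) and order[j][0] == order[i][0]:
            if order[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Invented example: 1 = resistant, 0 = susceptible.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auroc(labels, probs), 3), round(average_precision(labels, probs), 3))
```

Because average precision depends on the proportion of resistant samples, it is usually the more informative of the two metrics when resistance is rare.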

Although the compact models showed slightly lower performance than the full models, they were still able to accurately classify samples and identify informative features, including m/z intensities associated with unexplained resistance. It is worth noting that the SXT models had slightly lower recall rates than the other models because the SXT dataset is imbalanced, as shown in Supplementary Table S2. In the Linkou cohort, 14.1% of the samples were resistant, while 85.9% were susceptible. In the Kaohsiung cohort, 8.6% of the samples were resistant and 91.4% were susceptible. Nevertheless, the overall performance of the models indicates that they can accurately predict drug resistance and susceptibility. We also compared our method with several other classification algorithms to assess its performance (Supplementary Figure S4), and our method outperformed the other algorithms in terms of the ROC curves. Furthermore, our method exhibited superior robustness and stability based on 5-fold cross-validation (Supplementary Figure S5). The plots provide insights into the robustness and generalization potential of the models on the Linkou dataset under varying training and validation conditions.
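With a resistant class as small as the SXT one, how the five folds are drawn matters. The sketch below shows one way to build folds that each keep roughly the same resistant/susceptible ratio. Stratification is an assumption here, since this section does not say how the folds were formed, and `stratified_folds` is a hypothetical helper, not part of the published scripts.

```python
import random
from typing import List, Sequence


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds, dealing each class out round-robin."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, sample_index in enumerate(indices):
            folds[position % k].append(sample_index)
    return folds


# Toy labels: 14 resistant out of 100, roughly mirroring the 14.1% SXT
# resistance rate reported for the Linkou cohort.
toy_labels = [1] * 14 + [0] * 86
folds = stratified_folds(toy_labels, k=5, seed=0)
for fold_id, fold in enumerate(folds):
    n_resistant = sum(toy_labels[i] for i in fold)
    print(fold_id, len(fold), n_resistant)
```

Each fold serves once as the validation set while the remaining four are used for training, so every fold needs enough resistant samples for recall to be estimated at all.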

Investigations of risk assessment for multidrug resistance

To help clinicians make informed decisions about antibiotic therapy, we developed a multidrug resistance (MDR) risk assessment score based on a multiple linear regression model. The coefficients of the model were estimated, and the MDR score was calculated as the sum of the resistance probability for each drug multiplied by the corresponding weight. To make the score simple and user-friendly, we normalized it to a range of 0 to 1. The number of drugs to which each sample was resistant increased incrementally as the risk score increased, as shown in Figure 5A and B. The distributions of the scores over different sample groups are shown in these figures, along with a violin plot of each score over all resistant samples, which shows the full distributions of the scores. Our analysis has shown that a score of 0.5 or higher is highly sensitive for the detection of multidrug resistance.
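The score described above is a weighted sum of the four predicted resistance probabilities, rescaled to lie between 0 and 1. The sketch below illustrates that arithmetic only. The weights are hypothetical placeholders (the fitted regression coefficients are not listed in this section), and dividing by the sum of the weights is an assumed normalization that is valid only for non-negative weights.

```python
from typing import Dict

DRUGS = ("OXA", "CLI", "ERY", "SXT")

# Hypothetical equal weights; the study estimates them by multiple linear regression.
HYPOTHETICAL_WEIGHTS: Dict[str, float] = {"OXA": 1.0, "CLI": 1.0, "ERY": 1.0, "SXT": 1.0}


def mdr_score(probabilities: Dict[str, float],
              weights: Dict[str, float] = HYPOTHETICAL_WEIGHTS) -> float:
    """Weighted sum of per-drug resistance probabilities, scaled to [0, 1]."""
    for drug in DRUGS:
        if not 0.0 <= probabilities[drug] <= 1.0:
            raise ValueError(f"probability for {drug} must be in [0, 1]")
    weighted_sum = sum(weights[drug] * probabilities[drug] for drug in DRUGS)
    max_sum = sum(weights[drug] for drug in DRUGS)  # reached when every probability is 1
    return weighted_sum / max_sum


def is_high_risk(score: float, threshold: float = 0.5) -> bool:
    """Flag samples at or above the 0.5 cut-off mentioned in the text."""
    return score >= threshold


# Invented model outputs for a single isolate.
example = {"OXA": 0.9, "CLI": 0.8, "ERY": 0.7, "SXT": 0.2}
score = mdr_score(example)
print(round(score, 2), is_high_risk(score))
```

With fitted weights in place of the equal placeholders, drugs whose resistance is more strongly associated with multidrug resistance would contribute more to the score.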

Figure 5. Multidrug resistance risk assessment. (A–B) The distribution of the multidrug resistance risk assessment score within different sample groups (separated by the number of drugs to which each sample is resistant) was visualized using a violin plot overlaid with a box plot. (C–D) The distribution of the scores within different drug-resistant samples was also plotted using a violin plot. The top and bottom points of the box plot represent the maximum and minimum values, respectively, while the top and bottom horizontal lines of the box plot represent the 75th and 25th percentiles, respectively. The middle black line is the median. The surrounding violin shape is a rotated kernel density plot representing the probability density.

In addition, we examined the performance of the multidrug resistance risk assessment score on OXA-resistant samples, CLI-resistant samples, ERY-resistant samples and SXT-resistant samples. By visualizing the underlying data distribution as a violin plot, as shown in Figure 5C and D, we observed that the scores for SXT-resistant samples were higher than the scores for the other samples. This indicates an increased risk of multidrug resistance in these samples, confirming and extending the findings of a previous study that SXT-resistant samples tend to be resistant to other drugs as well, as shown in Supplementary Table S2. In summary, our multidrug resistance risk assessment score is a simple, rapid and reliable tool that can help clinicians identify individuals at high risk for multidrug resistance. We believe that this simple risk score can be readily used by clinicians to assess individual patients’ risk of multidrug resistance and make informed decisions about antibiotic therapy.

DISCUSSION AND CONCLUSION

Our study presents a revolutionary susceptibility testing pipeline for the early detection of multidrug-resistant S. aureus in patients. By integrating information from MALDI-TOF MS spectra and ML methods, our approach offers several advantages over the current gold-standard AST assays. A key advantage of our pipeline is that it determines the resistance of S. aureus-infected patients to multiple antibiotic agents, rather than just a single drug. This provides a more complete and accurate understanding of the antibiotic resistance profile, allowing for more tailored and effective treatment options. In addition to its accuracy, our AST results can be obtained in as little as 24 h from the time of bacterial culture, compared to the 24–72 h required by conventional methods. This rapid turnaround time can save critical time in treatment decisions, leading to improved patient outcomes. Our ML approach to binary classification provides an actionable AST indication that can help clinicians make appropriate antibiotic choices. By analyzing the implications of the model features, we gain a better understanding of the common or unique characteristics of different antibiotic classes, enabling more informed and accurate treatment decisions. Finally, our ML-based multidrug resistance risk score provides a simple and effective tool to reduce unnecessary drug use and guide physicians in prescribing drugs. With these benefits, our study provides a rapid and reliable method for early detection of multidrug-resistant S. aureus, which can significantly improve patient care and outcomes.

A wide range of clinical specimens can be analyzed using our versatile experimental design. This means that almost any type of specimen infected with S. aureus, such as wounds, respiratory tract, blood, sterile body fluids and urinary tract, can be accepted for further subculture and MS spectra generation. In addition, we have developed and replicated binary classification models that can accurately predict the multidrug resistance of patients with S. aureus infections. Specifically, our models were able to predict OXA resistance, CLI resistance, ERY resistance and SXT resistance with an AUROC of 0.94, 0.90, 0.86 and 0.91, respectively, in the replication population.

One of the strengths of our approach is that it was built using a full set of features from MS spectra collected from a longitudinal cohort. This meant that long-term follow-up data could be fully exploited for all full models. The features were then narrowed down by the SHAP value, resulting in only a few features that contributed the most to form the compact models. Although the AUROC for the compact models decreased slightly compared to the full models, this classifier performance was clinically acceptable and the features continue to provide independent valuable characterizations of the resistance mechanism. The integration of MALDI-TOF with ML-based clinical diagnostics is a powerful tool that can help clinicians accurately prescribe drugs and reduce the misuse of unnecessary medications. By accurately predicting multidrug resistance, our approach can help reduce the use of ineffective antibiotics, guide clinicians in selecting appropriate treatment options and ultimately improve patient outcomes.

Although our study has some limitations, such as the single-area design, it is noteworthy that the longitudinal study recruited patients from two major hospital centers in Taiwan over a period of 7 years and included >20 000 patients. The applicability and reproducibility of our results in other geographic areas is enhanced by this large-scale validation of our findings. However, the characteristics examined in our study were not sufficient to identify the specific genes or proteins responsible for drug resistance. This is because only ions were obtained from the MALDI-TOF MS spectra based on our experimental design. Therefore, further development is needed to detect specific genes or proteins using MS/MS spectra and promising m/z intervals. This would help us better understand the mechanisms underlying drug resistance and increase our ability to fight against multidrug-resistant bacteria. In our study, we utilized a consistent MALDI-TOF instrument model within each hospital to ensure uniformity and minimize potential variations in data collection and analysis. However, future investigations should explore the generalizability of our framework across different MALDI-TOF instrument models to further validate its robustness.

In this study, we present a state-of-the-art risk assessment framework for S. aureus that can accurately predict resistance to multiple antibiotics. Our approach involves a comprehensive analysis framework consisting of data preprocessing, feature selection and interpretation, XGBoost hyper-parameter tuning and model building. This framework enables us to construct a rapid platform for AST that leverages the advantages of both MALDI-TOF and ML.

The results of our study demonstrate the good performance of the MALDI-TOF and ML-based AST methodologies and suggest that they have the potential to revolutionize drug prescription decisions. A major contribution of our study is the development of a multidrug resistance score that can accurately predict resistance to multiple antibiotics. This approach represents a crucial step toward rapid AST for multiple drugs in clinical trials aimed at investigating early interventions for combating antibiotic resistance. Moreover, our ML-guided AST approaches can be implemented in clinical settings and extended to different pathogens and antibiotic classes. By doing so, we can reduce unnecessary medical costs and impede the looming epidemic of drug resistance. This breakthrough in fighting antibiotic resistance marks a significant advancement, and we firmly believe that our findings have the potential to make a real impact in the field.

It is anticipated that the more accurate and effective prediction of antimicrobial resistance will lead to a reduction in unnecessary antimicrobial usage, ultimately preventing the spread of drug-resistant infections. Our research holds significant potential, as ML-guided AST approaches can be seamlessly integrated into clinical practice and applied to various pathogens and antibiotic classes, thus minimizing the risks associated with inappropriate antibiotic use and the emergence of resistance.

Overall, our study provides a novel and robust framework for predicting multidrug resistance in S. aureus, contributing to the prevention of inappropriate antibiotic use and the mitigation of resistance risks.

Key Points
  • In this study, a risk assessment framework for S. aureus was developed, which can accurately predict resistance to multiple antibiotics using a comprehensive analytical framework. The framework constructs a rapid platform for antibiotic susceptibility testing (AST) that leverages the advantages of both matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry and machine learning (ML).

  • The ML-based AST approach demonstrated promising performance, with an area under the receiver operating characteristic curve of 0.94, 0.90, 0.86 and 0.91 and an area under the precision–recall curve of 0.93, 0.87, 0.87 and 0.81, respectively, for oxacillin, clindamycin, erythromycin and trimethoprim-sulfamethoxazole.

  • The multidrug resistance scoring function introduced in this study allows the risk of resistance to multiple antibiotics to be assessed simultaneously. This scoring function may provide clinicians with a more comprehensive understanding of the antibiotic susceptibility of S. aureus and enable more informed treatment decisions.

  • The study also identified potential patterns that may be responsible for antibiotic resistance in S. aureus, paving the way for further research into the mechanisms of antibiotic resistance and the development of new treatment strategies.

  • The ML-guided AST approach has the potential to reduce unnecessary medical costs and impede the drug resistance epidemic by accurately predicting antibiotic susceptibility and enabling more informed treatment decisions. This approach can also be extended to other pathogens and antibiotic classes, demonstrating its potential as a valuable clinical decision support tool.

FUNDING

This work was supported by the Guangdong Province Basic and Applied Basic Research Fund (2021A1515012447), National Natural Science Foundation of China (32070659), Chang Gung Memorial Hospital (CMRPG3F1721, CMRPG3F1722, CMRPD3I0011), Natural Science Foundation of Guangdong (2023A1515011861) and the Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China. This work was also financially supported by the Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B) from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project and Yushan Young Fellow Program (112C1N084C) by the Ministry of Education (MOE) and National Science and Technology Council (NSTC 112-2321-B-A49-016) in Taiwan.

AUTHOR CONTRIBUTIONS

Z.W., Y.X.P., C.-R.C., H.-Y.W. and T.-Y.L. conceived the project, designed and conducted the analyses, interpreted the results and wrote the manuscript and are listed in random order. H.-Y.W. and J.-J.L. collected the clinical samples and executed the microbiology experiments. Z.W. conducted the analyses and wrote the manuscript. Y.X.P. assisted with the machine learning analysis and visualization. C.-R.C., H.C., Y.-C.C. and J.-T.H. assisted with manuscript revision. J.-J.L. and T.-Y.L. supervised the study. All authors have read and approved the manuscript.

DATA AVAILABILITY

The complete datasets analyzed in this study are available in the following GitHub repository: https://github.com/xiaoxiaoxier/MRSA/tree/main/data.

CODE AVAILABILITY

All R scripts can be found at https://github.com/xiaoxiaoxier/MRSA.

Author Biographies

Zhuo Wang is now an associate researcher in the Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, China. Her research focuses on the analysis of genomic and proteomic data, integrating artificial intelligence to discover clinical significance.

Yuxuan Pang is in Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, PR China, and also in the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, PR China. His research primarily centers around bioinformatics and machine learning.

Chia-Ru Chung is now an assistant professor in the Department of Computer Science and Information Engineering, National Central University. Her research interests focus on the intersection of bioinformatics, genomics, and proteomics, harnessing the power of artificial intelligence and machine learning to push the frontiers of these fields.

Hsin-Yao Wang is a medical doctor in the Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan City, Taiwan, and a student in the PhD Program in Biomedical Engineering, Chang Gung University, Taoyuan City, Taiwan.

Haiyan Cui is now an assistant chief technician in the department of clinical laboratory, Longgang District People's Hospital of Shenzhen & The Second Affiliated Hospital of the Chinese University of Hong Kong, Shenzhen, China. Her research interests focus on rapid detection of pathogenic microorganisms.

Ying-Chih Chiang is currently an assistant professor at the Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China. Her research interests focus on applying biophysics, computational biology, and bioinformatics to fight against antimicrobial resistance.

Jorng-Tzong Horng is now a Distinguished Professor in the Department of Computer Science and Information Engineering, National Central University, Taiwan. His research interests include database system, bioinformatics, computational biology, big data analytics, data mining and machine learning.

Jang-Jih Lu is a Professor in the Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan City, Taiwan.

Tzong-Yi Lee is now a professor in the Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University. His research interests include bioinformatics, computational biology, systems biology, big data analytics, data mining and machine learning.

References

1. Harkins CP, Pichon B, Doumith M, et al. Methicillin-resistant Staphylococcus aureus emerged long before the introduction of methicillin into clinical practice. Genome Biol 2017;18(1):130.

2. Derde LPG, Cooper BS, Goossens H, et al. Interventions to reduce colonisation and transmission of antimicrobial-resistant bacteria in intensive care units: an interrupted time series study and cluster randomised trial. Lancet Infect Dis 2014;14(1):31–9.

3. Lowy FD. Staphylococcus aureus infections. N Engl J Med 1998;339(8):520–32.

4. Kreisel KM, Stine OC, Johnson JK, et al. USA300 methicillin-resistant Staphylococcus aureus bacteremia and the risk of severe sepsis: is USA300 methicillin-resistant Staphylococcus aureus associated with more severe infections? Diagn Microbiol Infect Dis 2011;70(3):285–90.

5. Fabijan AP, Lin RC, Ho J, et al. Safety of bacteriophage therapy in severe Staphylococcus aureus infection. Nat Microbiol 2020;5(3):465–72.

6. Chambers HF, Deleo FR. Waves of resistance: Staphylococcus aureus in the antibiotic era. Nat Rev Microbiol 2009;7(9):629–41.

7. Bhattacharyya RP, Bandyopadhyay N, Ma P, et al. Simultaneous detection of genotype and phenotype enables rapid and accurate antibiotic susceptibility determination. Nat Med 2019;25(12):1858–64.

8. Wang Z, Wang H-Y, Chung C-R, et al. Large-scale mass spectrometry data combined with demographics analysis rapidly predicts methicillin resistance in Staphylococcus aureus. Brief Bioinform 2020;22(4):bbaa293.

9. Abrok M, Lazar A, Szecsenyi M, et al. Combination of MALDI-TOF MS and PBP2' latex agglutination assay for rapid MRSA detection. J Microbiol Methods 2018;144:122–4.

10. Felten A, Grandry B, Lagrange PH, Casin I. Evaluation of three techniques for detection of low-level methicillin-resistant Staphylococcus aureus (MRSA): a disk diffusion method with cefoxitin and moxalactam, the Vitek 2 system, and the MRSA-screen latex agglutination test. J Clin Microbiol 2002;40(8):2766–71.

11. Thornsberry C, McDougal LK. Successful use of broth microdilution in susceptibility tests for methicillin-resistant (heteroresistant) staphylococci. J Clin Microbiol 1983;18(5):1084–91.

12. Dupieux C, Trouillet-Assant S, Tasse J, et al. Evaluation of a commercial immunochromatographic assay for rapid routine identification of PBP2a-positive Staphylococcus aureus and coagulase-negative staphylococci. Diagn Microbiol Infect Dis 2016;86(3):262–4.

13. Xu Z, Hou Y, Peters BM, et al. Chromogenic media for MRSA diagnostics. Mol Biol Rep 2016;43(11):1205–12.

14. Palavecino EL. Clinical, epidemiologic, and laboratory aspects of methicillin-resistant Staphylococcus aureus infections. In: Methicillin-Resistant Staphylococcus aureus (MRSA) Protocols, edited by Yinduo Ji. Springer: New York City, 2014, 1–24.

15. Perry JD. A decade of development of chromogenic culture media for clinical microbiology in an era of molecular diagnostics. Clin Microbiol Rev 2017;30(2):449–79.

16. van Belkum A, Rochas O. Laboratory-based and point-of-care testing for MSSA/MRSA detection in the age of whole genome sequencing. Front Microbiol 2018;9:1437.

17. Polisena J, Chen S, Cimon K, et al. Clinical effectiveness of rapid tests for methicillin resistant Staphylococcus aureus (MRSA) in hospitalized patients: a systematic review. BMC Infect Dis 2011;11(1):336.

18. Monecke S, Konig E, Earls MR, et al. An epidemic CC1-MRSA-IV clone yields false-negative test results in molecular MRSA identification assays: a note of caution, Austria, Germany, Ireland, 2020. Euro Surveill 2020;25(25):2000929.

19. Florio W, Tavanti A, Barnini S, et al. Recent advances and ongoing challenges in the diagnosis of microbial infections by MALDI-TOF mass spectrometry. Front Microbiol 2018;9:1097.

20. Vrioni G, Tsiamis C, Oikonomidis G, et al. MALDI-TOF mass spectrometry technology for detecting biomarkers of antimicrobial resistance: current achievements and future perspectives. Ann Transl Med 2018;6(12):240.

21. Wang H-Y, Chung C-R, Wang Z, et al. A large-scale investigation and identification of methicillin-resistant Staphylococcus aureus based on peaks binning of matrix-assisted laser desorption ionization-time of flight MS spectra. Brief Bioinform 2020;22(3):bbaa138.

22. Tang W, Ranganathan N, Shahrezaei V, Larrouy-Maumus G. MALDI-TOF mass spectrometry on intact bacteria combined with a refined analysis framework allows accurate classification of MSSA and MRSA. PloS One 2019;14(6):e0218951.

23. Chung C-R, Wang Z, Weng J-M, et al. MDRSA: a web based-tool for rapid identification of multidrug resistant Staphylococcus aureus based on matrix-assisted laser desorption ionization-time of flight mass spectrometry. Front Microbiol 2021;12:766206.

24. Zhang J, Wang Z, Wang H-Y, et al. Rapid antibiotic resistance serial prediction in Staphylococcus aureus based on large-scale MALDI-TOF data by applying XGBoost in multi-label learning. Front Microbiol 2022;13:853775.

25. Yu J, Tien N, Liu Y-C, et al. Rapid identification of methicillin-resistant Staphylococcus aureus using MALDI-TOF MS and machine learning from over 20,000 clinical isolates. Microbiol Spectr 2022;10(2):e00483-22.

26. Jeon K, Kim J-M, Rho K, et al. Performance of a machine learning-based methicillin resistance of Staphylococcus aureus identification system using MALDI-TOF MS and comparison of the accuracy according to SCC mec types. Microorganisms 2022;10(10):1903.

27. Gibb S, Strimmer K. MALDIquant: a versatile R package for the analysis of mass spectrometry data. Bioinformatics 2012;28(17):2270–1.

28. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016. Association for Computing Machinery: New York, NY, United States, 785–94.

29. Davis J, Goadrich M. The relationship between Precision-Recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, 2006. Association for Computing Machinery: New York, NY, United States, 233–40.

30. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems 2017;30:4765–74.

31. Abb J. In vitro activity of linezolid, quinupristin-dalfopristin, vancomycin, teicoplanin, moxifloxacin and mupirocin against methicillin-resistant Staphylococcus aureus: comparative evaluation by the E test and a broth microdilution method. Diagn Microbiol Infect Dis 2002;43(4):319–21.

Author notes

Zhuo Wang, Yuxuan Pang and Chia-Ru Chung contributed equally to this work.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights)