Abstract

Background

Intravenous (IV) fluid contamination is a common cause of preanalytical error that can delay or misguide treatment decisions, leading to patient harm. Current approaches for detecting contamination rely on delta checks, which require a prior result, or manual technologist intervention, which is inefficient and vulnerable to human error. Supervised machine learning may provide a means to detect contamination, but its implementation is hindered by its reliance on expert-labeled training data. An automated approach that is accurate, reproducible, and practical is needed.

Methods

A total of 25 747 291 basic metabolic panel (BMP) results from 312 721 patients were obtained from the laboratory information system (LIS). A Uniform Manifold Approximation and Projection (UMAP) model was trained and tested using a combination of real patient data and simulated IV fluid contamination. To provide an objective metric for classification, an “enrichment score” was derived and its performance assessed. Our current workflow was compared to UMAP predictions using expert chart review.

Results

UMAP embeddings from real patient results demonstrated outliers suspicious for IV fluid contamination when compared with the simulated contamination's embeddings. At a flag rate of 3 per 1000 results, the positive predictive value (PPV) was adjudicated to be 0.78 from 100 consecutive positive predictions. Of these, 58 were previously undetected by our current clinical workflows, with 49 BMPs displaying a total of 56 critical results.

Conclusions

Accurate and automatable detection of IV fluid contamination in BMP results is achievable without curating expertly labeled training data.

Introduction

Medical errors are a common occurrence in the delivery of healthcare worldwide (1). Diagnostic errors affect an estimated 12 million outpatient adults and nearly 1% of inpatient admissions in the United States each year (2–4). Incorrect laboratory results account for many such diagnostic errors, which in turn can lead to dangerous, misguided treatment decisions. However, distinguishing erroneous results from those reflective of an underlying disease process can be difficult. Many extreme, or “critical,” results require immediate medical intervention, which compounds the severity of this dilemma.

Most incorrect results arise from the “preanalytical” phase of the testing process, before the sample reaches the analyzer (5). One common cause of preanalytical error is contamination of the specimen by intravenous (IV) fluids infused through the same line from which the sample was drawn. Specimens contaminated by dextrose-containing fluids can display extreme elevations in glucose. However, patients experiencing life-threatening diabetic ketoacidosis (DKA) can show similarly extreme glucose results, especially at their initial presentation to care, when no prior result is available for a delta check. Expedient administration of insulin is a mainstay of DKA management but could result in life-threatening hypoglycemia if the extreme glucose result was an artifact of dextrose contamination. Clinical laboratories must therefore strike a careful balance between the accuracy and the turnaround time of critical results.

Current approaches to detecting preanalytical errors, such as delta checks (6, 7), leverage the observation that these errors often diverge significantly from a patient’s baseline, but return promptly to that baseline on subsequent draws. This “anomaly-with-resolution” pattern is effective for retrospective identification but requires both a prior result and a subsequent re-drawn specimen.

Advances in real-time detection of preanalytical errors have included multi-analyte delta checks (8, 9) and supervised machine learning algorithms (10, 11), both of which demonstrated superior performance compared to standard delta checks. However, all delta checks still require a prior result, and supervised machine learning requires often-impractical amounts of expert-labeled training data. The ideal solution would overcome both limitations.

Unsupervised learning methods aim to detect patterns of variation in unlabeled data. Common unsupervised methods include linear decompositions, such as principal component analysis, and nonlinear, nearest neighbor-based approaches, such as Uniform Manifold Approximation and Projection (UMAP). The UMAP algorithm aims to “flatten” a complex, high-dimensional set of data points onto a lower-dimensional surface, or manifold. It starts by grouping similar points into local neighborhoods and building a graph in which each point is a vertex and each edge is weighted by the probability that the 2 points it connects are true neighbors on the manifold. A cost function is then optimized so that a balance of the local and global structure of the data is preserved as well as possible. Finally, the data are projected onto the lower-dimensional manifold that best approximates the original graph structure.

We hypothesize that UMAP can be used to detect IV contamination while avoiding the drawbacks of the other approaches summarized above. Our core assumption is that preanalytical errors result in patterns of results that are fundamentally distinguishable from physiological or pathophysiological results. We expect that preanalytical errors will appear as outliers when compared against real patient samples in a dimensionally reduced space.

IV fluid contamination was chosen to demonstrate the principle of the approach due to its clinical impact and relative simplicity. Contamination by a known IV fluid can be easily modeled in silico to produce the labeled data for training and performance assessment.

In this work, we present a computational approach that distinguishes variation between biological and contaminated results. By comparing the density of real patient results and simulated errors within a dimensionally reduced space, we describe a novel “enrichment score” (ES) metric, which can be tuned to a desired alarm rate. We then estimated the performance of the resulting contamination predictions. We find that this novel approach identifies contaminated specimens that displayed critically abnormal results which were not flagged by our current, delta check-reliant workflow.

Materials and Methods

Data collection and processing

This study was approved by the Washington University Institutional Review Board (IRB #202202030). All basic metabolic panel (BMP) measurements from inpatients at a single hospital were extracted directly from the laboratory information system (Cerner). At our institution, this includes sodium, chloride, potassium, total carbon dioxide (CO2), blood urea nitrogen (BUN), creatinine, calcium, and glucose, plus a calculated anion gap. This encompassed 25 747 291 chemistry results measured on 2 567 403 specimens drawn from 312 721 patients. Data from all measurements and technologist interactions were included in the initial extraction, including verification status, timestamps, and interpretive comments. Specimens drawn from patients receiving cardiopulmonary resuscitation were excluded. Results reported as above or below a threshold value were replaced with that threshold (e.g., a sodium reported as “>180 mEq/L” was changed to 180 mEq/L). The anion gap was calculated as sodium minus (chloride plus total CO2).
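The threshold replacement and anion gap calculation described above can be sketched as follows. This is a minimal illustration in Python (the study's pipeline was written in R); the function names are hypothetical and not taken from the authors' code.

```python
def parse_result(reported: str) -> float:
    """Replace a result reported beyond a threshold (e.g., ">180 mEq/L")
    with the threshold value itself, per the preprocessing described above."""
    s = reported.strip().replace("mEq/L", "").strip()
    if s and s[0] in "<>":
        s = s[1:]
    return float(s)

def anion_gap(sodium: float, chloride: float, co2: float) -> float:
    """Anion gap = sodium - (chloride + total CO2)."""
    return sodium - (chloride + co2)

print(parse_result(">180mEq/L"))   # 180.0
print(anion_gap(140, 104, 24))     # 12 (within the typical reference interval)
```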

Partitioning training and testing sets

To avoid information leakage, the input data was grouped by unique patients, then split into an 80:20 partition such that no patient was present in both the training and test sets. The training set was then partitioned again, such that 80% was used to build the manifold, and the remaining 20% used to generate the ES grid and define a classification threshold, described in greater detail below. All performance assessments and figures were generated from the test set.
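A patient-grouped split like the one above can be sketched as follows. This is an illustrative Python sketch (the study used R), with a hypothetical record layout in which each specimen carries a "patient_id" key; the key point is that the partition is made over patients, not specimens, so no patient contributes to both sets.

```python
import random

def split_by_patient(specimens, train_frac=0.8, seed=42):
    """Split specimen records so that no patient appears in both partitions."""
    patients = sorted({s["patient_id"] for s in specimens})
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_train = int(len(patients) * train_frac)
    train_ids = set(patients[:n_train])
    train = [s for s in specimens if s["patient_id"] in train_ids]
    test = [s for s in specimens if s["patient_id"] not in train_ids]
    return train, test
```

The same function can be applied a second time to the training set to carve out the validation partition used for the ES grid and threshold selection.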

Simulating preanalytical errors

Specimens with IV fluid contamination were simulated by creating in silico mixtures of randomly selected results with the 10 most common fluids administered at our institution (online Supplemental Table 1). Mixture ratios ranged from 0.01 to 0.99—where a ratio of 0.50 indicates equal parts patient sample and IV fluid. Online Supplemental Fig. 1 presents a schematic overview of the approach. These simulated errors were validated with an in vitro mixing study identical to the one described in Choucair et al. (9), the results of which are presented in online Supplemental Fig. 2.
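The in silico mixing is a simple linear combination of the patient result and the fluid's composition. As an illustrative sketch (in Python; the fluid compositions below are standard nominal values, not values taken from the paper's Supplemental Table 1):

```python
def mix(patient: dict, fluid: dict, ratio: float) -> dict:
    """Linear in silico mixture: `ratio` is the fluid fraction
    (0.50 = equal parts patient sample and IV fluid)."""
    return {analyte: (1 - ratio) * value + ratio * fluid.get(analyte, 0.0)
            for analyte, value in patient.items()}

# Nominal compositions for two common fluids (assumed, not from the paper):
NS = {"sodium": 154, "chloride": 154}   # 0.9% (normal) saline, mEq/L
D5W = {"glucose": 5000}                 # 5% dextrose in water, mg/dL

patient = {"sodium": 140, "chloride": 104, "glucose": 95}
print(mix(patient, NS, 0.5))
# {'sodium': 147.0, 'chloride': 129.0, 'glucose': 47.5}
```

Note that analytes absent from the fluid are diluted toward zero, which is why contamination perturbs the pattern across analytes rather than a single value.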

Building unsupervised models

UMAP was applied using the {tidymodels} (12) and {uwot} (13) implementations in R 4.2.3 with the {targets} (14) pipeline framework. A total of 1 620 275 real patient results from the training set were combined with randomly selected results mixed in silico to represent IV fluid contamination. Contamination was simulated in 540 000 randomly selected results and added to the training and validation sets, yielding a final class balance of approximately 3:1 (real:contaminated). Cosine similarity on centered and scaled data was used as the distance metric for nearest neighbor calculation, to emphasize the relationships between analytes rather than the magnitude of any single analyte, and hyperparameters favoring a more global view of the full data set were chosen (online Supplemental Table 2).
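The motivation for the cosine metric can be made concrete with a small sketch (illustrative Python, not the authors' R code): after z-scoring each analyte, cosine distance is near zero for two results that share the same pattern across analytes, even when their magnitudes differ, whereas results with different patterns are far apart.

```python
import math

def center_scale(rows):
    """Z-score each analyte (column) across the data set."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]

def cosine_distance(a, b):
    """1 - cosine similarity: small when two vectors point the same way,
    regardless of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)

# Proportional patterns are "close" under cosine distance ...
print(round(cosine_distance([1, 2, 3], [2, 4, 6]), 6))   # 0.0
# ... while orthogonal patterns are maximally far apart.
print(cosine_distance([1, 0], [0, 1]))                   # 1.0
```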

Developing an objective metric for quantifying anomalies

After transforming the validation set and simulated contamination using the trained UMAP model, 2-dimensional kernel density estimation (KDE) was applied to each using the {KernSmooth} package. Coordinates were binned into a 1000 × 1000 grid. The mean KDE in each bin from the contamination embedding was divided by that of the patient data, producing an enrichment score (ES) that reflects the likelihood that a point at any given location on the manifold represents IV fluid contamination. A classification threshold was defined as the ES that would produce the same alarm rate in the validation set as our laboratory leadership’s estimate of the current workflow (3 per 1000 specimens), and all expert review was performed using this threshold. More permissive and more stringent thresholds were also assessed for comparison.
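The density-ratio idea can be sketched as follows. This illustrative Python version substitutes simple binned counts for the paper's kernel density estimates and uses a coarse grid; the structure (shared 2-D grid, contamination density divided by patient density, threshold chosen to hit a target alarm rate) mirrors the description above.

```python
def enrichment_scores(real_xy, contam_xy, bins=100, eps=1e-9):
    """ES per grid bin = contamination density / real-patient density,
    using binned counts as a stand-in for KDE."""
    xs = [p[0] for p in real_xy + contam_xy]
    ys = [p[1] for p in real_xy + contam_xy]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)

    def bin_of(p):
        i = min(int((p[0] - x0) / (x1 - x0 + eps) * bins), bins - 1)
        j = min(int((p[1] - y0) / (y1 - y0 + eps) * bins), bins - 1)
        return i, j

    def density(points):
        d, w = {}, 1.0 / len(points)
        for p in points:
            b = bin_of(p)
            d[b] = d.get(b, 0.0) + w
        return d

    d_real, d_cont = density(real_xy), density(contam_xy)
    return {b: d_cont.get(b, 0.0) / (d_real.get(b, 0.0) + eps)
            for b in set(d_real) | set(d_cont)}

def threshold_for_alarm_rate(scores, rate=0.003):
    """Pick the ES cutoff whose flag rate matches a target alarm rate
    (e.g., 3 per 1000) on a validation set of per-result scores."""
    ranked = sorted(scores)
    k = min(len(ranked) - 1, int(len(ranked) * (1 - rate)))
    return ranked[k]
```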

Assessing the performance of the approach

The held-out test set (n = 506 341) was embedded onto the manifold, assigned an ES, and classified as real or contaminated. These predictions were compared to the technologist interpretive comments, representing our current workflow. To adjudicate disagreement between the UMAP approach and the current workflow, 50 results from each quadrant of the confusion matrix (concordant positive, concordant negative, UMAP positive + current workflow negative, UMAP negative + current workflow positive) were randomly sampled and subjected to expert review.

Manual review of the electronic medical record was performed by 2 separate expert reviewers. Cases in which these reviewers’ assessments were discordant were reviewed by a third expert. Prior and subsequent laboratory results, clinical history and presentation, and whether the patient was receiving an intravenous crystalloid infusion when the flagged result was drawn were considered in the final adjudication. Reviewers were blind to the model’s predictions and to the relative proportion of results in their review set that were expected to be contaminated.

To estimate positive predictive value (PPV), 100 consecutive results that were predicted as contaminated by the UMAP approach, with both a prior and subsequent result collected within 48 h of the flagged result, were aggregated from the held-out test set.
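A percentile bootstrap over those 100 adjudicated predictions yields the PPV point estimate and confidence interval. The sketch below (illustrative Python; the number of replicates and seed are arbitrary choices, not taken from the paper) shows one standard way to compute such an interval.

```python
import random

def bootstrap_ppv(outcomes, n_boot=2000, alpha=0.05, seed=1):
    """Point estimate and percentile-bootstrap CI for the PPV.
    `outcomes` is a 0/1 list over consecutive predicted positives
    (1 = confirmed contaminated on expert review)."""
    rng = random.Random(seed)
    n = len(outcomes)
    point = sum(outcomes) / n
    reps = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot))
    lo = reps[int(n_boot * alpha / 2)]
    hi = reps[int(n_boot * (1 - alpha / 2)) - 1]
    return point, (lo, hi)

# With 78 of 100 predictions confirmed, the point estimate is 0.78.
ppv, ci = bootstrap_ppv([1] * 78 + [0] * 22)
print(ppv)   # 0.78
```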

Accessibility, reproducibility, and replicability

In the interest of reducing the barrier to reproduction and implementation of this approach, the code for this project is provided publicly on GitHub at https://github.com/nspies13/bmp_umap_paper (or in static form at https://zenodo.org/doi/10.5281/zenodo.10083584). An anonymized, version-controlled copy of the data required to replicate this analysis can be found on FigShare at https://doi.org/10.6084/m9.figshare.23805456.v1 (15), and the final embedding model can be found at https://doi.org/10.6084/m9.figshare.23805531.v1 (16). A public Docker container to run the analysis can also be pulled from DockerHub at https://hub.docker.com/r/nspies13/bmp_umap_paper.

The Code Walkthrough in the GitHub repository provides an overview of the implementation details and can be used as a guide to replicate the findings below or reproduce them on new data.

All figures were generated using Fabio Crameri’s perceptually uniform, universally readable color maps (https://www.fabiocrameri.ch/colourmaps/) (17).

Results

The manifold of population-level variation in basic metabolic panels

A UMAP model was trained on a data set consisting of 1 620 275 real patient results and 540 000 results from simulated contamination across a variety of compositions and mixture ratios. These patients were 54% female by sex, 63% White and 29% Black by self-reported race, and of median age 59 years (interquartile range: 41 to 71). Specimens were drawn from medical or subspecialty wards (57%), intensive care units (16%), surgical wards (15%), and the emergency department (8%).

Figure 1 presents the manifold produced when this model was applied to the previously held-out testing set consisting of 506 341 real patient results. Each pixel represents at least one result, colored by the density within that coordinate on the manifold by KDE. The majority of BMPs embedded in one dense region of the manifold (white). However, distinct outlier projections extend off the left and right of the central density.

Fig. 1.

The manifold of human variation in basic metabolic panel (BMP) results. Uniform Manifold Approximation and Projection (UMAP) was applied to BMP results to capture high-level patterns of population-level variation within the data, creating the 2-dimensional manifold above. Pixels are colored by the density of results that embed in close proximity. The majority of results embed within the center of the manifold, with outliers forming projections outwards.

Exploring analyte-level variation across the manifold

Figure 2 displays the same manifold as above, now colored by analyte concentrations. Blue and pink points represent BMPs where that analyte is 3 SD below and above the mean, respectively.

Fig. 2.

Uniform Manifold Approximation and Projection (UMAP) captures analyte-level variation in basic metabolic panels (BMP). Each facet presents the original patient manifold colored by the normalized value of that analyte in the BMP. Extreme results tend to cluster along the outlier projections, with each set of outliers displaying a unique signature.

The central density generally consists of measurements within 3 SD of the mean for each analyte. In contrast, the outlying projections exhibited distinct patterns of extreme analyte values. Sodium and chloride are both markedly elevated in the leftward outlier projection, while extreme elevations in glucose make up much of the bottom-right outliers. The upper-right projection is generally represented by extreme hypocalcemia. Potassium is markedly elevated in the bottom-most projection, and markedly decreased in a subset of the upper-right projection but is less uniformly concentrated than the aforementioned analytes. Bicarbonate (CO2) segregates along the UMAP1 axis, with extreme elevations concentrating along the left edge, and extreme decreases concentrating rightward. Creatinine and BUN are generally correlated, with extreme elevations concentrating in the bottom edge of the central density.

Embedding simulated IV Fluid contamination

Figure 3 illustrates the embeddings of the simulated IV fluid contamination onto the manifold. Unlike the patient results, which primarily embedded centrally, the simulated contamination tended to map onto the outlier projections.

Fig. 3.

Embedding simulated contamination provides insight into the outlier projections. Embeddings of simulated contamination for the common fluids. Abbreviations: NS, 0.9% (normal) saline; LR, lactated Ringer’s; D5, 5% dextrose; HalfNS, 0.45% saline; Hyper, 3% saline; +K 20 mEq/L potassium chloride. Facets are colored by (A) the contaminating fluid, (B) the mixture ratio of the simulated result to assess contamination severity, and (C) the novel enrichment score (ES) calculated by dividing the density in the contamination set by the density in the real patient set at each coordinate. (D) displays the enrichment scores for the real patient results.

When colored by the composition of the contaminating fluid (Fig. 3A), we observed that simulated 3% saline (hyperNS) contamination universally embedded within the leftward outlier projection of the manifold, where extreme elevations in sodium and chloride were observed in Fig. 2. Water, 0.45% saline (HalfNS), 0.9% (normal) saline (NS), and lactated Ringer’s (LR) all embedded within the bottom projection from the manifold, where extreme hypocalcemia was observed. Corresponding to the extreme elevations in glucose in Fig. 2, fluids containing 5% dextrose (D5)—D5 in water (D5W), D5 in 0.9% normal saline (D5NS), D5 in lactated Ringer’s (D5LR), and D5 in 0.45% saline with and without 20 mEq/L of potassium chloride added (D5halfNS, +K)—all embedded within the lower-right projection, before branching off into their respective fluids.

Figure 3B colors the embedding of simulated contamination by its severity (mixture ratio), with mild contamination in pale yellow and near-pure fluids in deep blue. In general, even relatively mild mixture ratios embedded outside of the central density and towards the outlier projections described above. For the left- and rightward projections of hypertonic saline and D5-containing fluids, mixture ratios as low as 0.05 routinely embedded within their respective projections. LR, however, represented a stark exception. Even results with a mixture ratio of greater than 0.50 occasionally embedded within the central density of the manifold.

Figure 3C presents the manifold of simulated contamination colored by the novel ES, calculated by dividing the density in the simulated contamination set by that of the patient results for each coordinate. In similar fashion to the contamination severity, results with higher ESs (deep red) embedded towards the outlier projections, while those with lower likelihoods (off-white) embedded within the central density.

Figure 3D applies the same ES approach to real patient data from the held-out test set. Points that embedded within the outlier projections demonstrated a higher ES than those within the center of the manifold.

Estimating performance and alarm rates

Figure 4A displays the precision-recall (PR) curve for the approach of flagging IV fluid contamination using ES above a classification threshold as a binary (real vs contaminated) and expert review as the ground truth. The area under the PR curve is 0.89, demonstrating a high discriminatory capability. At ES thresholds of 10, 30, and 100, the corresponding sensitivities are 0.85, 0.71, and 0.48, respectively, while the PPVs are 0.71, 0.78, and 0.95, respectively.

Fig. 4.

Assessing the performance of the Uniform Manifold Approximation and Projection (UMAP) approach. (A), Predictions were compared to a set of expert-curated labels and a precision-recall curve was generated. Performance at 3 enrichment score thresholds was highlighted and (B) visualized on the embedding; (C), The estimated alarm rate for each of these thresholds (white, tan, red) is compared to the current flag rate (black).

Figure 4B displays the embedding with each point colored by the threshold at which it would be flagged as contaminated. All 3 thresholds flag only results that embed within the outlier projections, with the upper-right projection showing a less distinct separation than the left and right projections. This observation is quantified in online Supplemental Fig. 3. Simulated contamination by hypertonic saline and fluids containing D5 were much more sensitively detected than were 0.9% saline, 0.45% saline, and sterile water. LR was the most difficult to detect.

Figure 4C displays the expected alarm rates at each threshold as a monthly rolling average, with the current flag rate (black) overlaid for comparison. The dashed lines represent the observed rolling averages for each threshold, with the shaded area depicting the standard errors. As the decision threshold increases, the alarm rate decreases. However, each threshold produces an alarm rate near or above the current flag rate.

Summary of expert review

Figure 5 presents the summary of the expert review experiments. To estimate PPV, 100 consecutive predicted positives were reviewed, as shown in Fig. 5A. Of these, a total of 78 were confirmed as contaminated, 58 of which had not been detected by our current workflow. The bootstrapped 95% confidence interval for the PPV is 0.68–0.85. Figure 5B presents the distribution of ESs for each prediction class. False-positive results demonstrated a significantly lower median ES than true positives, whether the latter were missed by the current workflow or detected by both methods (chi-squared test; P < 1e−12). Of the 58 BMPs that were correctly flagged by the UMAP approach but missed by our current workflow, 49 (85%) displayed a total of 56 abnormal results in the critical range.

Fig. 5.

Assessing the agreement of the approaches. (A), The positive predictive value (PPV) of the UMAP approach was assessed by expert review; (B), The distribution of enrichment scores for each scenario. Each dot is one basic metabolic panel (BMP), colored by increasing enrichment score. Boxplots are shown in gray at the base of the density plots, with the median represented by a circle and vertical line; (C), Results from the review set of 50 results per quadrant of the confusion matrix aimed to assess agreement between the approaches.

To better assess the agreement between the approaches, a random sample of 50 results from each quadrant of the confusion matrix was reviewed (Fig. 5C). When both approaches agreed, they were 100% accurate. The frequency of fluid compositions detected by both approaches was 21 (42%) NS variants, 15 (30%) D5-containing fluids, 6 (12%) total parenteral nutrition, 6 (12%) potassium phosphate or potassium chloride, and 2 (4%) LR. For results flagged by UMAP, but not by the current approach, 76% were contaminated. The frequency of fluid compositions correctly classified by UMAP but missed by the current approach was 28 (56%) normal saline variants, 6 (12%) dextrose-containing fluids, 3 (6%) total parenteral nutrition, and 1 (2%) LR. For results flagged by the current approach, but not by UMAP, 48% were contaminated. Of results confirmed as contaminated and flagged by the current approach but not by UMAP, 11 (46%) were wrong tube errors (K2EDTA), 5 (22%) contained a fluid not represented in the simulation or calculation of the ES (total parenteral nutrition, calcium chloride, potassium chloride), and 8 (32%) were true misses for fluid types that were included in the simulation set (5 NS, 3 LR).

Discussion

The detection of IV fluid contamination is a difficult and universal problem for clinical laboratories. Current protocols rely on delta checks or manual technologist intervention. Proposed solutions involving supervised learning algorithms have historically required a substantial investment to curate expertly labeled training data. Additionally, supervised classifiers can only detect anomalies for which they are specifically trained. For example, the supervised classifiers trained by Baron et al. (10) focused solely on dextrose contamination. However, fluids without dextrose are more common (18) and more challenging because they do not result in such extreme changes to a single analyte.

In this work, we applied UMAP to capture biological and nonbiological variation across BMP results and simulated IV fluid contamination. The model produced a manifold whose outlier regions were consistent with IV contamination by multiple fluid types and easily identified. An ES was calculated by dividing densities from simulated contamination by densities from real patient results to create an objective metric for implementation into our laboratory information system (LIS) rule set. Comparison to expert review found acceptable performance and highlighted a potential deficiency in our current workflow.

To our knowledge, this is the first study to identify preanalytical errors using machine learning that does not require upfront expert labeling. While we have focused on IV fluid contamination as a proof of principle, our approach should generalize to any rare event that differs from biological variation, a feature of many preanalytical errors. Future studies will address the generalizability of the approach.

We envision that patient results could be analyzed in real time using our method for the immediate flagging of a variety of errors and artifacts. Results with high likelihood scores for an error could have an interpretive comment added without masking the results, allowing frontline providers to exercise appropriate clinical judgment while reducing the operational impact that these errors exert on the clinical laboratory workflow. The auto-verification of results thought to be erroneous represents a stark departure from the current paradigm and would require extensive observation and validation prior to deployment. However, given recent advances in machine learning development, deployment, and LIS capabilities, combined with nearly universal laboratory staffing shortages and the fallibility of human adjudicators (19), the potential benefits to clinical operations and patient safety warrant further investigation.

Our work has several limitations. First, a more robust prospective validation that can assess performance across many patient subgroups is necessary. Second, a more robust optimization of some key decision points made in this work could be beneficial prior to implementation within clinical practice, including tuning of UMAP’s hyperparameters, the relative proportions and fluid compositions of the simulated errors in the training set, and the threshold at which clinical decisions will be made. This was not done in the current study due to the need for additional expert-labeled data. Third, we have not compared the performance of the ES against other possible methods for quantifying and detecting anomalies. Incorporating patient deltas, when available, into a similar approach may improve its performance. However, doing so extends beyond the current capabilities of most LIS software. A cost-benefit analysis between this approach and one that does not require such substantial investment in data infrastructure would be prudent.

Future studies extending this work should examine additional fluid types, as well as other common laboratory errors, such as mislabeled specimens or hemolysis, icterus, and lipemia. Incorporation of other common laboratory panels, such as complete blood count and liver function panel, should also be investigated for their ability to improve error detection.

Supplemental Material

Supplemental material is available at Clinical Chemistry online.

Nonstandard Abbreviations

IV, intravenous; BMP, basic metabolic panel; LIS, laboratory information system; UMAP, Uniform Manifold Approximation and Projection; PPV, positive predictive value; ES, enrichment score; NS, 0.9% normal saline; LR, lactated Ringer’s; D5, 5% dextrose.

Author Contributions

The corresponding author takes full responsibility that all authors on this publication have met the following required criteria of eligibility for authorship: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved. Nobody who qualifies for authorship has been omitted from the list.

Nicholas Spies (Conceptualization-Lead, Formal analysis-Lead, Investigation-Equal, Methodology-Equal, Writing—original draft-Equal, Writing—review & editing-Equal), Zita Hubler (Data curation-Equal, Investigation-Equal, Methodology-Equal, Validation-Equal, Visualization-Equal, Writing—review & editing-Equal), Vahid Azimi (Data curation-Equal, Investigation-Equal, Methodology-Equal, Validation-Equal, Writing—review & editing-Equal), Ray Zhang (Conceptualization-Equal, Investigation-Equal, Methodology-Equal, Validation-Equal, Writing—original draft-Equal, Writing—review & editing-Equal), Ronald Jackups Jr. (Conceptualization-Equal, Investigation-Equal, Methodology-Equal, Writing—original draft-Equal, Writing—review & editing-Equal), Ann Gronowski (Investigation-Equal, Methodology-Equal, Supervision-Equal, Visualization-Equal, Writing—original draft-Equal, Writing—review & editing-Equal), Christopher Farnsworth (Conceptualization-Equal, Formal analysis-Equal, Investigation-Equal, Methodology-Equal, Writing—original draft-Equal, Writing—review & editing-Equal), and Mark Zaydman (Conceptualization-Equal, Data curation-Lead, Investigation-Equal, Methodology-Equal, Supervision-Equal, Validation-Equal, Writing—original draft-Equal, Writing—review & editing-Equal).

Authors’ Disclosures or Potential Conflicts of Interest

Upon manuscript submission, all authors completed the author disclosure form.

Research Funding

None declared.

Disclosures

M.A. Zaydman has received research support from Biomerieux and honoraria from Siemens and Sebia. C.W. Farnsworth has received research support from Abbott, Roche, Siemens, Sebia, Beckman Coulter, Blue Jay Diagnostics, Biomerieux, Cepheid, and Qiagen; consulting fees from Bio-Rad, Roche, Abbott, CytoVale, and Werfen; and honoraria from Abbott, Roche, and ADLM. A.M. Gronowski and C.W. Farnsworth have served on the editorial board for Clinical Chemistry.

Role of Sponsor

No sponsor was declared.

Acknowledgments

We would like to thank Kimberly Zohner for her insight into the operational impact that these errors have on our laboratory, and for her valuable feedback regarding features that any proposed solution must have.

References

1. Institute of Medicine (US) Committee on Quality of Health Care in America. Kohn LT, Corrigan JM, Donaldson MS, editors. To err is human: building a safer health system. Washington (DC): National Academies Press (US); 2000.

2. Committee on Diagnostic Error in Health Care, Board on Health Care Services, Institute of Medicine, The National Academies of Sciences, Engineering, and Medicine. Balogh EP, Miller BT, Ball JR, editors. Improving diagnosis in health care. Washington (DC): National Academies Press (US); 2015.

3. Singh H, Schiff GD, Graber ML, Onakpoya I, Thompson MJ. The global burden of diagnostic errors in primary care. BMJ Qual Saf 2017;26:484–94.

4. Shen L, Levie A, Singh H, Murray K, Desai S. Harnessing event report data to identify diagnostic error during the COVID-19 pandemic. Jt Comm J Qual Patient Saf 2022;48:71–80.

5. Carraro P, Plebani M. Errors in a stat laboratory: types and frequencies 10 years later. Clin Chem 2007;53:1338–42.

6. Ladenson JH. Patients as their own controls: use of the computer to identify “laboratory error”. Clin Chem 1975;21:1648–53.

7. Plebani M, Sciacovelli L, Aita A, Pelloso M, Chiozza ML. Performance criteria and quality indicators for the pre-analytical phase. Clin Chem Lab Med 2015;53:943–8.

8. Patel DK, Naik RD, Boyer RB, Wikswo J, Vasilevskis EE. Methods to identify saline-contaminated electrolyte profiles. Clin Chem Lab Med 2015;53:1585–91.

9. Choucair I, Lee ES, Vera MA, Drongmebaro C, El-Khoury JM, Durant TJS. Contamination of clinical blood samples with crystalloid solutions: an experimental approach to derive multianalyte delta checks. Clin Chim Acta 2023;538:22–8.

10. Baron JM, Mermel CH, Lewandrowski KB, Dighe AS. Detection of preanalytic laboratory testing errors using a statistically guided protocol. Am J Clin Pathol 2012;138:406–13.

11. Rosenbaum MW, Baron JM. Using machine learning-based multianalyte delta checks to detect wrong blood in tube errors. Am J Clin Pathol 2018;150:555–66.

12. Kuhn M, Wickham H. Tidymodels: a collection of packages for modeling and machine learning using tidyverse principles. 2020. (Accessed July 2023).

13. Melville J. uwot: the uniform manifold approximation and projection (UMAP) method for dimensionality reduction. 2023. (Accessed July 2023).

14. Landau WM. The targets R package: a dynamic make-like function-oriented pipeline toolkit for reproducibility and high-performance computing. J Open Source Softw 2021;6:2959.

15. Spies N. 2,500,000 anonymized BMP results for the manuscript “Automating the detection of IV fluid contamination using unsupervised machine learning” [Dataset]. 2023. (Accessed November 2023).

16. Spies N. UMAP model for the paper, “Automating the detection of IV fluid contamination using unsupervised machine learning” [Software]. 2023. (Accessed November 2023).

17. Crameri F. Scientific colour maps [Internet]. (Accessed July 2023).

18. Myburgh JA, Mythen MG. Resuscitation fluids. N Engl J Med 2013;369:1243–51.

19. Spies NC, Hubler Z, Roper SM, Omosule CL, Senter-Zapata M, Roemmich BL, et al. GPT-4 underperforms experts in detecting IV fluid contamination. J Appl Lab Med 2023;8:1092–100.

Author notes

Previous Presentation: Platform presentation (Pathology Informatics Summit 2022).

This article is published and distributed under the terms of the Oxford University Press Standard Journals Publication Model (https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights).

Supplementary data