Table 3.

PEAE detection performance of each model, % (95% confidence interval).

| Model | Retrieval | DCSM | PPV | Sensitivity | Specificity | NPV | F1 score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Gemma | KW | Yes | 16.75 (11.85-21.92) | 87.50 (76.32-97.14) | 98.40 (98.16-98.63) | 99.95 (99.91-99.99) | 28.11 (20.91-35.48) |
| Gemma | KW | No | 11.14 (7.87-14.53) | 95.0 (86.96-100.00) | 97.22 (96.9-97.52) | 99.98 (99.95-100.0) | 19.95 (14.53-25.41) |
| Gemma | SS | Yes | 12.46 (8.86-16.16) | 95.00 (86.96-100.00) | 97.55 (97.25-97.83) | 99.98 (99.95-100.0) | 22.03 (16.28-27.83) |
| Gemma | SS | No | 11.08 (7.83-14.49) | 95.00 (87.18-100.0) | 97.20 (96.88-97.5) | 99.98 (99.95-100.0) | 19.84 (14.55-25.13) |
| Llama3 | KW | Yes | 14.34 (10.04-18.88) | 87.50 (76.19-97.22) | 98.08 (97.81-98.34) | 99.95 (99.91-99.99) | 24.65 (18.12-31.27) |
| Llama3 | KW | No | 8.85 (6.33-11.57) | 100.00 (100.0-100.0) | 96.21 (95.86-96.56) | 100.0 (100.0-100.0) | 16.26 (11.84-20.75) |
| Llama3 | SS | Yes | 11.34 (8.12-14.88) | 97.50 (91.49-100.0) | 97.2 (96.88-97.51) | 99.99 (99.97-100.0) | 20.31 (15.05-25.77) |
| Llama3 | SS | No | 8.52 (6.07-11.31) | 97.50 (91.67-100.0) | 96.15 (95.78-96.5) | 99.99 (99.97-100.0) | 15.66 (11.32-19.97) |
| Mistral | KW | Yes | 14.04 (10.08-18.21) | 100.0 (100.0-100.0) | 97.75 (97.46-98.02) | 100.0 (100.0-100.0) | 24.62 (18.36-30.94) |
| Mistral | KW | No | 11.14 (7.97-14.55) | 97.50 (91.67-100.0) | 97.14 (96.83-97.45) | 99.99 (99.97-100.0) | 20.0 (14.73-25.29) |
| Mistral | SS | Yes | 11.63 (8.31-15.15) | 100.0 (100.0-100.0) | 97.21 (96.9-97.52) | 100.0 (100.0-100.0) | 20.83 (15.38-26.25) |
| Mistral | SS | No | 11.08 (7.9-14.44) | 97.50 (91.67-100.0) | 97.12 (96.8-97.43) | 99.99 (99.97-100.0) | 19.9 (14.58-25.13) |
| Phi3 | KW | Yes | 15.27 (9.38-21.54) | 50.00 (34.21-65.71) | 98.98 (98.79-99.16) | 99.81 (99.73-99.89) | 23.39 (14.81-31.85) |
| Phi3 | KW | No | 12.02 (7.47-16.95) | 55.00 (39.47-70.59) | 98.52 (98.29-98.74) | 99.83 (99.75-99.91) | 19.73 (12.77-26.72) |
| Phi3 | SS | Yes | 9.32 (6.68-12.11) | 100.00 (100.0-100.0) | 96.42 (96.08-96.77) | 100.0 (100.0-100.0) | 17.06 (12.47-21.65) |
| Phi3 | SS | No | 6.76 (4.83-8.88) | 100.00 (100.0-100.0) | 94.93 (94.5-95.33) | 100.0 (100.0-100.0) | 12.66 (9.16-16.18) |
| ICD-based method (Baseline) | | | 37.5 (19.1-58.3) | 14.52 (6.5-23.7) | 99.85 (99.8-99.9) | 99.48 (99.3-99.6) | 20.93 (9.9-31.9) |

This table summarizes the performance metrics of the 4 models under each retrieval method, with and without discharge information. Results are based on 10 000 bootstrap resamples, used to estimate performance variability; each metric is reported as the mean with its 95% confidence interval. The baseline is a rule-based method that uses ICD codes from a Discharge Abstract Database.

Abbreviations used: DCSM, inclusion of discharge information; ICD, International Classification of Diseases; KW, keyword-based retrieval methods; NPV, negative predictive value; PPV, positive predictive value; SS, semantic similarity-based retrieval methods.
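
The table note says the metrics come from bootstrap resampling and are reported as means with 95% confidence intervals. The sketch below shows one common way to produce such numbers. It is illustrative only: it assumes a percentile bootstrap over per-case binary labels, which the note does not confirm. The function names (`metrics`, `bootstrap_ci`) and the toy data are hypothetical and do not come from the study.

```python
import random
from statistics import mean


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def metrics(y_true, y_pred):
    """Compute PPV, sensitivity, specificity, NPV and F1 from binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return {
        "ppv": safe_div(tp, tp + fp),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_ci(y_true, y_pred, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap (an assumption, not the study's stated method).

    Returns {metric: (mean, lower, upper)} over the resampled estimates.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = {}
    for _ in range(n_resamples):
        # Resample cases with replacement, keeping label pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        m = metrics([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for name, value in m.items():
            if value == value:  # skip NaN from resamples with an empty denominator
                samples.setdefault(name, []).append(value)
    result = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int((alpha / 2) * (len(values) - 1))]
        upper = values[int((1 - alpha / 2) * (len(values) - 1))]
        result[name] = (mean(values), lower, upper)
    return result


if __name__ == "__main__":
    # Toy data (hypothetical): 4 true events among 100 cases, 13 flagged.
    y_true = [1] * 4 + [0] * 96
    y_pred = [1, 1, 1, 0] + [1] * 10 + [0] * 86
    for name, (avg, lo, hi) in bootstrap_ci(y_true, y_pred, n_resamples=1000).items():
        print(f"{name}: {100 * avg:.2f} ({100 * lo:.2f}-{100 * hi:.2f})")
```

In the toy data, true events are rare and false positives outnumber true positives, so PPV is low even though specificity is high. The table shows the same pattern for the LLM-based rows.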
