PEAE detection performance of each model, % (95% confidence interval).
Model | Retrieval | DCSM | PPV | Sensitivity | Specificity | NPV | F1 score |
---|---|---|---|---|---|---|---|
Gemma | KW | Yes | 16.75 (11.85-21.92) | 87.50 (76.32-97.14) | 98.40 (98.16-98.63) | 99.95 (99.91-99.99) | 28.11 (20.91-35.48) |
Gemma | KW | No | 11.14 (7.87-14.53) | 95.0 (86.96-100.00) | 97.22 (96.9-97.52) | 99.98 (99.95-100.0) | 19.95 (14.53-25.41) |
Gemma | SS | Yes | 12.46 (8.86-16.16) | 95.00 (86.96-100.00) | 97.55 (97.25-97.83) | 99.98 (99.95-100.0) | 22.03 (16.28-27.83) |
Gemma | SS | No | 11.08 (7.83-14.49) | 95.00 (87.18-100.0) | 97.20 (96.88-97.5) | 99.98 (99.95-100.0) | 19.84 (14.55-25.13) |
Llama3 | KW | Yes | 14.34 (10.04-18.88) | 87.50 (76.19-97.22) | 98.08 (97.81-98.34) | 99.95 (99.91-99.99) | 24.65 (18.12-31.27) |
Llama3 | KW | No | 8.85 (6.33-11.57) | 100.00 (100.0-100.0) | 96.21 (95.86-96.56) | 100.0 (100.0-100.0) | 16.26 (11.84-20.75) |
Llama3 | SS | Yes | 11.34 (8.12-14.88) | 97.50 (91.49-100.0) | 97.2 (96.88-97.51) | 99.99 (99.97-100.0) | 20.31 (15.05-25.77) |
Llama3 | SS | No | 8.52 (6.07-11.31) | 97.50 (91.67-100.0) | 96.15 (95.78-96.5) | 99.99 (99.97-100.0) | 15.66 (11.32-19.97) |
Mistral | KW | Yes | 14.04 (10.08-18.21) | 100.0 (100.0-100.0) | 97.75 (97.46-98.02) | 100.0 (100.0-100.0) | 24.62 (18.36-30.94) |
Mistral | KW | No | 11.14 (7.97-14.55) | 97.50 (91.67-100.0) | 97.14 (96.83-97.45) | 99.99 (99.97-100.0) | 20.0 (14.73-25.29) |
Mistral | SS | Yes | 11.63 (8.31-15.15) | 100.0 (100.0-100.0) | 97.21 (96.9-97.52) | 100.0 (100.0-100.0) | 20.83 (15.38-26.25) |
Mistral | SS | No | 11.08 (7.9-14.44) | 97.50 (91.67-100.0) | 97.12 (96.8-97.43) | 99.99 (99.97-100.0) | 19.9 (14.58-25.13) |
Phi3 | KW | Yes | 15.27 (9.38-21.54) | 50.00 (34.21-65.71) | 98.98 (98.79-99.16) | 99.81 (99.73-99.89) | 23.39 (14.81-31.85) |
Phi3 | KW | No | 12.02 (7.47-16.95) | 55.00 (39.47-70.59) | 98.52 (98.29-98.74) | 99.83 (99.75-99.91) | 19.73 (12.77-26.72) |
Phi3 | SS | Yes | 9.32 (6.68-12.11) | 100.00 (100.0-100.0) | 96.42 (96.08-96.77) | 100.0 (100.0-100.0) | 17.06 (12.47-21.65) |
Phi3 | SS | No | 6.76 (4.83-8.88) | 100.00 (100.0-100.0) | 94.93 (94.5-95.33) | 100.0 (100.0-100.0) | 12.66 (9.16-16.18) |
ICD-based method (Baseline) | | | 37.5 (19.1-58.3) | 14.52 (6.5-23.7) | 99.85 (99.8-99.9) | 99.48 (99.3-99.6) | 20.93 (9.9-31.9) |
This table summarizes the performance metrics of the 4 models under each combination of retrieval method (KW or SS) and inclusion of discharge information (DCSM). Performance variability was estimated from 10 000 bootstrap resamples; each metric is reported as the mean with its 95% confidence interval. The baseline is a rule-based method that uses ICD codes from a Discharge Abstract Database.
Abbreviations used: DCSM, inclusion of discharge information; ICD, International Classification of Diseases; KW, keyword-based retrieval methods; NPV, negative predictive value; PPV, positive predictive value; SS, semantic similarity-based retrieval methods.
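The bootstrap procedure mentioned in the table note can be illustrated with a minimal sketch. It is not the authors' code: it assumes that records are resampled with replacement at the record level and that the interval is the percentile interval of the resampled metrics, neither of which is stated in the note. The function names `confusion_metrics` and `bootstrap_ci` are hypothetical.

```python
import numpy as np


def confusion_metrics(y_true, y_pred):
    """Return PPV, sensitivity, specificity, NPV and F1 as percentages."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    tn = np.sum(~y_true & ~y_pred)

    def pct(num, den):
        # A metric is undefined (NaN) when its denominator is zero.
        return 100.0 * num / den if den > 0 else np.nan

    return {
        "ppv": pct(tp, tp + fp),
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "npv": pct(tn, tn + fn),
        "f1": pct(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_ci(y_true, y_pred, n_resamples=10_000, alpha=0.05, seed=0):
    """Return {metric: (mean, lower, upper)} from a percentile bootstrap.

    Assumption: records are resampled with replacement, and the interval
    is taken from the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    n = len(y_true)
    samples = {name: [] for name in confusion_metrics(y_true, y_pred)}
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # one resample of record indices
        for name, value in confusion_metrics(y_true[idx], y_pred[idx]).items():
            samples[name].append(value)
    result = {}
    for name, values in samples.items():
        values = np.asarray(values, dtype=float)
        lower, upper = np.nanpercentile(
            values, [100 * alpha / 2, 100 * (1 - alpha / 2)]
        )
        result[name] = (np.nanmean(values), lower, upper)
    return result
```

Calling `bootstrap_ci(labels, predictions)` on one model's record-level labels and predictions would give one row of values in the same "mean (lower-upper)" form as the table. Resamples in which a metric is undefined, for example a resample with no positive cases, are skipped for that metric in this sketch; the source does not say how such cases were handled.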