Xinyan Cai, Mark H Ebell, Garth Russo, Kevin K Dobbin, Jose F Cordero, Development and internal validation of risk scores to diagnose infectious mononucleosis among college students, Family Practice, Volume 40, Issue 2, April 2023, Pages 261–267, https://doi.org/10.1093/fampra/cmac105
Abstract
Individual symptoms and signs of infectious mononucleosis (IM) are of limited value for diagnosis.
To develop and validate risk scores based on signs and symptoms with and without haematologic parameters for the diagnosis of IM.
Data were extracted from electronic health records of a university health centre and were divided into derivation (9/1/2015–10/31/2017) and a prospective temporal internal validation (11/1/2017–1/31/2019) cohort.
Independent predictors for the diagnosis of IM were identified in univariate analysis using the derivation cohort. Logistic regression models were used to develop 2 risk scores: 1 with only symptoms and signs (IM-NoLab) and 1 adding haematologic parameters (IM-Lab). Point scores were created based on the regression coefficients, and patients were grouped into risk groups. Primary outcomes were area under the receiver operating characteristic curve (AUROCC) and classification accuracy.
The IM-NoLab model had 4 predictors and identified a low-risk group (7.9% with IM) and a high-risk group (22.2%) in the validation cohort. The AUROCC was 0.75 in the derivation cohort and 0.69 in the validation cohort. The IM-Lab model had 3 predictors and identified a low-risk group (3.6%), a moderate-risk group (12.5%), and a high-risk group (87.6%). The AUROCC was 0.97 in the derivation cohort and 0.93 in the validation cohort.
We derived and internally validated the IM-NoLab and IM-Lab risk scores. The IM-Lab score in particular had very good discrimination and has the potential to reduce the need for diagnostic testing for IM.
- common cold
- communicable diseases
- diagnostic techniques and procedures
- infectious mononucleosis
- laboratory
- primary health care
- roc curve
- bronchitis
- diagnosis
- hematology
- upper respiratory infections
- electronic medical records
- Adolescent Medicine
- univariate analysis
- college students
- primary outcome measure
There are no risk scores for the diagnosis of infectious mononucleosis.
We developed and temporally validated 2 risk scores.
The IM-NoLab risk score uses symptoms and signs only and had an AUC of 0.69.
The IM-Lab score added the WBC and had an AUC of 0.93.
These risk scores can increase the efficiency of diagnosis of IM.
Introduction
Infectious mononucleosis (IM) is a common disease among young adults, especially among college students, with the incidence rate ranging from 11 to 48 cases/1,000 persons/year.1 Previous studies have shown that the most common symptoms and signs of IM include sore throat, lymphadenopathy (including posterior cervical and axillary), fever, tonsillar enlargement, pharyngeal inflammation, transient palatal petechiae, splenomegaly, and rash.2 However, a recent systematic review by the authors concluded that individual symptoms and signs are of limited value for the diagnosis of IM.3 The presence of splenomegaly (positive likelihood ratio [LR+], 2.4), posterior cervical lymphadenopathy (LR+ 3.2), and axillary or inguinal cervical lymphadenopathy (LR+ 3.1) were the most helpful signs but were still only moderately useful for ruling in IM when present. The same systematic review and other studies have found that leukocytosis and atypical leukocytosis were very helpful when present for diagnosing IM.3–5
Laboratory tests such as the heterophile antibody tests and Epstein–Barr virus specific antibody tests are available but have important limitations. While heterophile antibodies can be detected within 1 week after the onset of IM, false negatives are common early in the course of the illness, especially among children.6 The VCA-IgM, VCA-IgG, and EBNA tests are more accurate than heterophile antibody tests, but may not be available for point of care decision-making and are expensive.7 Moreover, it is neither practical nor cost-effective to order these tests in all adolescents and young adults presenting with sore throat. A way to identify persons most likely to benefit from these tests is needed.
Clinical prediction rules (CPRs), often in the form of a simple risk score or algorithm, can integrate elements of the medical history, physical examination, and/or basic laboratory results to improve diagnosis.8,9 CPRs have been developed and validated to aid in the diagnosis of Group A beta-haemolytic streptococcal sore throat and influenza, and for the prognosis of community-acquired pneumonia.10–14 However, to date there has been no published study attempting to develop and validate a CPR for the diagnosis of IM. Therefore, the primary goal of the current study is to develop and internally validate 2 CPRs that identify young adults as being at low or high risk for IM, 1 using signs and symptoms only and 1 adding haematologic parameters.
Methods
Study setting and patient population
The investigators obtained deidentified data from the University Health Center (UHC) at the University of Georgia. The UHC provides primary care, specialty health care, education, and prevention-focussed services to approximately 35,000 students enrolled at the university each year. The UHC has 4 primary care clinics with approximately 20 primary care clinicians available during regular business hours. We included all patients in whom IM was clinically suspected between 1 September 2015 and 31 January 2019, as indicated by the ordering of a heterophile antibody test for IM.
The UHC uses an electronic health record (EHR) system to record and maintain the patients’ symptoms, signs, and laboratory test results for each clinical visit. The UGA health centre staff, who were not study team members, were responsible for linking the clinical and laboratory data and for removing any identifier/personal information, including name, age in years, birthdate, address, contact information, and student ID number. Each patient was assigned a random ID number only known by the staff to maintain confidentiality. The deidentified data were securely transferred from the UHC to the study investigators for analysis. The study investigator then merged the dataset for clinical presentation and laboratory parameters by each patient’s ID number, as created by health center staff, and the date of visit. The final study population included all students 18 years and older with a test for IM during the study period for whom complete data were available regarding predictor and outcome variables.
Dependent and independent variables
The independent (predictor) variables in this study included self-reported symptoms, signs, and laboratory findings. The symptoms were self-reported by patients in the portal when they registered for their visit using a checklist, and the clinical signs were evaluated by physicians during the physical examination. The EHR stores signs and symptoms in discrete data fields. A checked field indicated that the sign or symptom was present. If a field was left blank by the student or clinician, the investigators assumed that the symptom or sign was absent.
Symptoms and signs available in the EHR included fever, diarrhoea, vomiting, fatigue, headache, joint pain, myalgia, nausea, rash, sore throat, swollen lymph nodes, cough, anterior cervical lymphadenopathy, posterior cervical lymphadenopathy, pharyngeal erythema, tonsillar erythema, exudative pharyngitis, tonsillar enlargement, and/or tonsillar exudate. Laboratory findings included the absolute number and percentage of white blood cells, lymphocytes, neutrophils, monocytes, and atypical lymphocytes.
The dependent variable for the models was the results of the Monogen heterophile antibody test in a latex agglutination form. The results were not known to clinicians at the time they examined their patient, and laboratory personnel performing the Monogen test did not have any clinical data regarding signs and symptoms, other than patient age and sex. The reported sensitivity and specificity are 94.2% and 91.3% when using a hemagglutination test as the reference standard, although the sensitivity is likely lower earlier in the course of the illness and in children.6,15
Derivation and validation cohort
The data collected from consecutive patients visiting the UGA UHC from 1 September 2015 to 31 October 2017 were used as the derivation cohort, which we used to build the model and develop the point scores. We then used the data collected from 1 November 2017 to 31 January 2019 as a validation cohort to evaluate the accuracy of the risk scores. This kind of prospective temporal validation is more robust than a simple internal validation using a random split of the data.16 Approximately two-thirds of patients were in the derivation cohort and one-third in the validation cohort.
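The temporal split described above can be sketched in a few lines. This is only an illustration: the function name `assign_cohort` is hypothetical, and only the date windows come from the text.

```python
from datetime import date

# Cohort windows as stated in the text.
DERIVATION = (date(2015, 9, 1), date(2017, 10, 31))
VALIDATION = (date(2017, 11, 1), date(2019, 1, 31))


def assign_cohort(visit_date):
    """Return 'derivation', 'validation', or None for visits outside the study period."""
    if DERIVATION[0] <= visit_date <= DERIVATION[1]:
        return "derivation"
    if VALIDATION[0] <= visit_date <= VALIDATION[1]:
        return "validation"
    return None
```

Because assignment depends only on the visit date, no patient seen after the cutoff can influence the derived model, which is the property that distinguishes a temporal split from a random one.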
Statistical analysis
To identify predictors associated with a diagnosis of IM, we used Pearson’s χ2 test for categorical variables. Variables associated with a positive test for IM at P < 0.1 were selected for inclusion in the multivariable analysis; fatigue was also added because of its known association with IM. For the haematologic parameters that were reported as continuous variables, we created categorical variables based on values commonly reported in the literature such as greater than 20%, 30%, and 40% lymphocytes and greater than 5% and 10% atypical lymphocytes. This was done to facilitate creation of a simple risk score.
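As a rough illustration of the univariate screening step, the sketch below computes Pearson's χ2 for a single 2×2 table using the subjective fever counts reported in Table 1. It assumes no continuity correction (the text does not say whether one was applied), and the function name is hypothetical. The study itself used Stata, not Python.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Subjective fever: 98 of 226 patients with IM, 393 of 1,115 without IM (Table 1).
chi2, p = pearson_chi2_2x2(98, 226 - 98, 393, 1115 - 393)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the P value comes out close to the 0.021 reported for subjective fever, which falls below the P < 0.1 screening threshold.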
Development of the CPRs
We developed 2 models to predict the likelihood of IM, 1 using signs and symptoms only (IM-NoLab risk score) and 1 adding haematologic parameters (IM-Lab risk score). The cutoffs were decided based on the inspection of histograms, as well as a previous systematic review.16 We used best subsets variable selection to identify the optimal and most parsimonious group of predictors for each model, using the gvselect function in Stata. The Bayesian information criterion was used to identify the optimal set of predictors.17 We then used the logit function to create logistic regression models with diagnosis of IM as the dependent variable.
Simple risk scores were created by dividing each beta coefficient by the smallest beta coefficient and then rounding to the nearest integer or 0.5. For each risk score, we determined the percentage of patients with IM at each point value and used that information to choose cutoffs to define risk groups. We prespecified a test threshold of 10% based on a survey of 136 clinicians, who were asked to identify the threshold at which they would order a point of care test for IM (unpublished, data available on request).
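The point-assignment rule can be checked against the coefficients reported in Table 2. The sketch below is an illustration only: the function name, the dictionary keys, and the `step` parameter are hypothetical. It assumes rounding to the nearest integer for the IM-NoLab score and to the nearest 0.5 for the IM-Lab score, since that choice reproduces the published points.

```python
def points_from_betas(betas, step=1.0):
    """Divide each coefficient by the smallest one and round to the nearest multiple of `step`."""
    smallest = min(betas.values())
    return {name: round(beta / smallest / step) * step for name, beta in betas.items()}


# Regression coefficients from Table 2.
im_nolab_betas = {
    "tonsillar_exudate": 0.8584,
    "lymphadenopathy_not_anterior_cervical": 1.4758,
    "absence_of_myalgias": 0.8524,
    "swollen_glands_reported": 0.6571,
}
im_lab_betas = {
    "atypical_lymphocytes_gt5_to_10pct": 2.0443,
    "atypical_lymphocytes_gt10pct": 4.2779,
    "lymphocytes_ge40pct": 1.3642,
    "absolute_lymphocyte_count_ge4": 2.3806,
}

im_nolab_points = points_from_betas(im_nolab_betas, step=1.0)
im_lab_points = points_from_betas(im_lab_betas, step=0.5)
```

With these inputs the IM-NoLab predictors receive 1, 2, 1, and 1 points and the IM-Lab predictors receive 1.5, 3, 1, and 1.5 points, in the order listed.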
The discrimination of the final full IM-NoLab and IM-Lab models was evaluated in the derivation cohort using the area under the receiver operating characteristic curve (AUROCC), calculated using the lroc postestimation command in Stata. We also determined the AUROCC separately for the simple risk scores in both the derivation and validation cohorts using the rocfit and rocplot commands in Stata. The calibration of the models in the derivation cohort was evaluated using a calibration belt plot, which measures how well the predicted outcome matched the observed outcome.18 We also determined the percentage of patients with IM in each risk group defined by the risk scores in both the derivation and validation cohort. All the analysis in our study was performed with Stata version 17.0 (College Station, TX).
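For readers less familiar with the AUROCC, it equals the probability that a randomly chosen patient with IM has a higher score than a randomly chosen patient without IM, with ties counted as one half. The sketch below computes it that way. The scores are made-up toy values, not study data, and the function is a hypothetical stand-in for the Stata commands named above.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve as the proportion of concordant pairs (ties count 0.5)."""
    concordant = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                concordant += 1.0
            elif sp == sn:
                concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))


# Hypothetical point scores for illustration only.
toy_im = [3, 2, 2, 1]
toy_not_im = [0, 1, 1, 2, 0]
toy_auc = auroc(toy_im, toy_not_im)  # 17 concordant-equivalent pairs out of 20
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without IM.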
Ethical considerations
The University of Georgia’s Institutional Review Board (IRB) approved this project. It was deemed to be exempt research because the data in this study were deidentified, previously collected, and extracted retrospectively from the EHR system.
Results
Characteristics of the study population
The derivation cohort for the IM-NoLab score included 1,341 patients, of whom 226 (16.8%) had a positive heterophile antibody test for IM, while the validation cohort included 820 patients, of whom 130 (15.9%) had a positive test. The derivation cohort for the IM-Lab score included 1,291 patients, of whom 216 (16.7%) had a positive test for IM, while the validation cohort for the IM-Lab score included 804 patients, of whom 125 (15.5%) had a positive test. The distribution of signs, symptoms, and haematologic parameters was similar between the derivation and validation groups (Appendix Table 1). There were significant differences between the early and late cohorts, particularly with regard to symptoms and signs, which may reflect changes in clinicians and student populations or behaviour over time.
Results of the bivariate analysis of the association between potential predictor variables and a positive test for IM are shown in Table 1. Symptoms significantly associated with IM (P < 0.05) included the presence of subjective fever, joint pain, rash, and patient reported “swollen glands,” and the absence of cough, myalgias, nasal congestion or discharge, and sinus pressure. On examination, lymphadenopathy other than anterior cervical was associated with IM, as was the presence of moderate or severe tonsillar exudate. Lymphocytosis, atypical lymphocytosis, and decreases in neutrophils were all strongly associated with IM.
Table 1. Association between clinical characteristics of patients in the derivation cohort and infection with IM.
Characteristic | IM (N, %) | Not IM (N, %) | P |
---|---|---|---|
Total patients | 226 | 1,115 | |
Patient reported symptoms (N, %) | |||
Diarrhoea | 5 (2.2) | 44 (4.0) | 0.205 |
Nausea | 25 (11.1) | 142 (12.7) | 0.487 |
Vomiting | 11 (4.9) | 42 (3.8) | 0.439 |
Fatigue | 105 (46.5) | 454 (40.7) | 0.110 |
Fever (subjective) | 98 (43.4) | 393 (35.3) | 0.021 |
Chills | 86 (38.0) | 382 (34.3) | 0.275 |
Headache | 76 (33.6) | 437 (39.2) | 0.117 |
Joint pain | 1 (0.44) | 37 (3.3) | 0.018 |
No myalgias | 192 (85.0) | 814 (73.0) | <0.001 |
Nasal discharge | 57 (25.2) | 421 (37.8) | <0.001 |
Nasal congestion | 85 (37.6) | 491 (44.0) | 0.075 |
No nasal discharge or congestion | 137 (60.6) | 592 (53.1) | 0.038 |
Postnasal drip sensation | 69 (30.5) | 359 (32.2) | 0.624 |
Rash | 7 (3.1) | 13 (1.2) | 0.029 |
Sneezing | 7 (3.1) | 51 (4.6) | 0.320 |
Sore throat | 150 (66.4) | 705 (63.2) | 0.370 |
Swollen glands | 136 (60.2) | 411 (36.9) | <0.001 |
Hoarseness | 12 (5.3) | 67 (6.0) | 0.684 |
Wheezing | 4 (1.8) | 27 (2.4) | 0.552 |
No cough | 159 (70.4) | 654 (58.7) | 0.001 |
Sinus pressure | 36 (15.9) | 258 (23.1) | 0.017 |
Clinician documented signs (N, %) | |||
Lymphadenopathy | |||
Anterior cervical | 183 (81.0) | 850 (76.2) | 0.122 |
Posterior cervical | 113 (50.0) | 179 (16.1) | <0.001 |
Occipital | 7 (3.1) | 3 (0.27) | <0.001 |
Posterior auricular | 5 (2.2) | 5 (0.45) | 0.005 |
Preauricular | 9 (4.0) | 13 (1.2) | 0.002 |
Supraclavicular | 2 (0.89) | 1 (0.09) | 0.021 |
Sublingual | 5 (2.2) | 6 (0.54) | 0.011 |
Submental | 5 (2.2) | 9 (0.81) | 0.058 |
Any lymphadenopathy other than anterior cervical | 123 (54.4) | 201 (18.0) | <0.001 |
Pharyngeal erythema | 7 (3.1) | 46 (4.1) | 0.469 |
Tonsillar erythema | 86 (38.1) | 374 (33.5) | 0.193 |
Tonsillar enlargement | 123 (54.4) | 597 (53.5) | 0.808 |
Tonsillar exudate | 82 (36.3) | 159 (14.3) | <0.001 |
Fever (temperature ≥100.04°F) | 5 (2.2) | 58 (5.2) | 0.053 |
Haematologic parameters: n/total (%) | |||
Lymphocyte count | |||
≥3.0 × 10⁹/L | 173/216 (80.1) | 69/1,075 (6.4) | <0.001 |
≥4.0 × 10⁹/L | 148/216 (68.5) | 13/1,075 (1.2) | <0.001 |
Lymphocyte percent | |||
>20% | 189/217 (87.1) | 524/1,076 (48.7) | <0.001 |
>30% | 136/217 (62.7) | 203/1,076 (18.9) | <0.001 |
>40% | 72/217 (33.2) | 41/1,076 (3.8) | <0.001 |
Atypical lymphocyte percent | |||
≥5% | 196/217 (90.3) | 107/1,077 (9.9) | <0.001 |
≥10% | 174/217 (80.2) | 33/1,077 (3.1) | <0.001 |
≥20% | 107/217 (49.3) | 4/1,077 (0.37) | <0.001 |
Neutrophils count <8.0 × 10⁹/L | 209/216 (96.8) | 744/1,075 (69.2) | <0.001 |
Neutrophils <60% | 196/216 (90.7) | 247/1,075 (23.0) | <0.001 |
Development of the risk scores
Table 2 summarizes the 2 multivariate models developed using the derivation cohort to predict the likelihood of IM. Calibration of both models was excellent based on the calibration belt plots shown in Fig. 1.
Predictor | β regression coefficient | Points assigned |
---|---|---|
IM-NoLab model and risk score | ||
Tonsillar exudate | 0.8584 | 1 |
Lymphadenopathy excluding anterior cervical | 1.4758 | 2 |
Absence of myalgias | 0.8524 | 1 |
Patient report of “swollen glands” | 0.6571 | 1 |
Constant | −3.3063 | |
IM-Lab model and risk score | ||
Atypical lymphocytes | ||
>5% to 10% | 2.0443 | 1.5 |
>10% | 4.2779 | 3 |
≥40% lymphocytes | 1.3642 | 1 |
Absolute lymphocyte count ≥4.0 | 2.3806 | 1.5 |
Constant | −4.0209 |

Fig. 1. Calibration belt plots of observed versus expected outcomes for the full multivariate models in the derivation cohort are shown for the model used to develop the IM-NoLab risk score (a) and the IM-Lab risk score (b).
In the IM-NoLab model with only signs and symptoms, 4 predictors were included (tonsillar exudate, lymphadenopathy in a location other than the anterior cervical region, absence of myalgias, and patient report of “swollen glands”). In the derivation cohort, the AUROCC of the full model was 0.751 and of the risk score was 0.759.
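To show how the full IM-NoLab logistic model turns findings into a predicted probability, the sketch below applies the standard inverse-logit transformation to the coefficients and constant reported in Table 2. The function and key names are hypothetical. Findings that are not supplied are treated as absent, mirroring the convention used for blank EHR fields.

```python
import math

# Coefficients and constant from Table 2 (IM-NoLab model).
IM_NOLAB_CONSTANT = -3.3063
IM_NOLAB_BETAS = {
    "tonsillar_exudate": 0.8584,
    "lymphadenopathy_not_anterior_cervical": 1.4758,
    "absence_of_myalgias": 0.8524,
    "swollen_glands_reported": 0.6571,
}


def im_nolab_probability(findings):
    """Predicted probability of IM; `findings` maps predictor name -> bool (missing means absent)."""
    logit = IM_NOLAB_CONSTANT + sum(
        beta for name, beta in IM_NOLAB_BETAS.items() if findings.get(name, False)
    )
    return 1.0 / (1.0 + math.exp(-logit))


p_none = im_nolab_probability({})
p_all = im_nolab_probability({name: True for name in IM_NOLAB_BETAS})
```

Under this model a patient with none of the 4 predictors has a predicted probability of roughly 3.5%, while a patient with all 4 has a predicted probability of roughly 63%.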
After haematologic parameters were added as candidate predictors, none of the signs and symptoms were selected into the model. The independent variables in the final model were >5% to 10% atypical lymphocytes, >10% atypical lymphocytes, ≥40% lymphocytes, and absolute lymphocyte count ≥4.0. The AUROCC was 0.951 for the laboratory-based multivariate model and 0.974 for the IM-Lab risk score.
Validation of the risk scores
The classification accuracy of both risk scores in both derivation and validation groups is summarized in Table 3. The IM-NoLab score classified approximately half of patients in the low-risk group and performed similarly in the derivation and prospective temporal validation cohorts. The IM-Lab score classified almost three-quarters of patients in a low-risk group and also performed similarly between derivation and prospective temporal validation cohorts.
Cohort | Risk group (points) | # with IM/total in risk group (%) | Likelihood ratio |
---|---|---|---|
IM-NoLab risk score | |||
Derivation cohort | Low risk (0–1 points) | 51/677 (7.6%) | 0.40 |
Derivation cohort | High risk (2+ points) | 175/664 (27.0%) | 1.77 |
Validation cohort | Low risk (0–1 points) | 29/365 (7.9%) | 0.46 |
Validation cohort | High risk (2+ points) | 101/455 (22.2%) | 1.51 |
IM-Lab risk score | |||
Derivation cohort | Low risk (0 points) | 14/954 (1.5%) | 0.07 |
Derivation cohort | Moderate risk (1–2.5 points) | 20/118 (16.9%) | 1.02 |
Derivation cohort | High risk (3+ points) | 182/219 (83.1%) | 24.5 |
Validation cohort | Low risk (0 points) | 22/611 (3.6%) | 0.19 |
Validation cohort | Moderate risk (1–2.5 points) | 11/88 (12.5%) | 0.71 |
Validation cohort | High risk (3+ points) | 92/105 (87.6%) | 35.2 |
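The likelihood ratio column can be read as a stratum-specific likelihood ratio: the proportion of all patients with IM who fall in a risk group divided by the proportion of all patients without IM who fall in that group. The sketch below applies that standard definition to the IM-NoLab validation counts from Table 3. The function name is hypothetical, and the article does not spell out the formula it used, so treat this as an assumption.

```python
def stratum_likelihood_ratios(groups):
    """`groups` maps risk-group name -> (number with IM, total in group)."""
    total_im = sum(im for im, _ in groups.values())
    total_not_im = sum(total - im for im, total in groups.values())
    return {
        name: (im / total_im) / ((total - im) / total_not_im)
        for name, (im, total) in groups.items()
    }


# IM-NoLab risk score, validation cohort (Table 3).
im_nolab_validation = {"low": (29, 365), "high": (101, 455)}
lr = stratum_likelihood_ratios(im_nolab_validation)
```

For these counts the calculation gives values that round to 0.46 for the low-risk group and 1.51 for the high-risk group, in line with the table.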
The AUROCC for the IM-NoLab risk score was slightly lower at 0.690 in the validation cohort than in the derivation cohort, while that for the IM-Lab risk score remained high at 0.934 in the validation group. Receiver operator characteristic curves for both risk scores in the derivation and validation groups are shown in Fig. 2.

Receiver operating characteristic curves are shown for the IM-NoLab score in the derivation cohort (a), the IM-NoLab score in the validation cohort (b), the IM-Lab score in the derivation cohort (c), and the IM-Lab score in the validation cohort (d).
Use of the rules in series
The CPRs could be used in series. For example, a complete blood count to calculate the IM-Lab score could be ordered only for those classified as high risk by the IM-NoLab score. In the validation cohort, among patients classified as high risk by the IM-NoLab score, the probability of IM was 4.7% in those who were low risk by the IM-Lab score, 15% in those who were moderate risk, and 89% in those who were high risk.
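A minimal sketch of this serial strategy follows, using the point values from Table 2 and the risk-group cutoffs from Table 3. All function names are hypothetical, and the "needs CBC" label is an illustrative placeholder, not part of the published rules.

```python
def im_nolab_points(exudate, nonanterior_lymphadenopathy, no_myalgias, swollen_glands):
    """IM-NoLab score: each argument is True/False for the corresponding finding."""
    return 1 * exudate + 2 * nonanterior_lymphadenopathy + 1 * no_myalgias + 1 * swollen_glands


def im_lab_points(atypical_pct, lymphocyte_pct, abs_lymphocyte_count):
    """IM-Lab score from atypical lymphocyte %, lymphocyte %, and absolute lymphocyte count."""
    points = 0.0
    if atypical_pct > 10:
        points += 3
    elif atypical_pct > 5:
        points += 1.5
    if lymphocyte_pct >= 40:
        points += 1
    if abs_lymphocyte_count >= 4.0:
        points += 1.5
    return points


def nolab_group(points):
    return "low" if points <= 1 else "high"


def lab_group(points):
    if points == 0:
        return "low"
    return "moderate" if points < 3 else "high"


def serial_risk_group(nolab_pts, lab_pts=None):
    """Apply IM-NoLab first; use IM-Lab only for patients who are high risk on IM-NoLab."""
    if nolab_group(nolab_pts) == "low":
        return "low"  # stop here: no complete blood count is ordered
    if lab_pts is None:
        return "needs CBC"  # high risk on IM-NoLab, IM-Lab not yet available
    return lab_group(lab_pts)
```

For example, a patient with tonsillar exudate and posterior cervical lymphadenopathy scores 3 points on IM-NoLab and would proceed to a complete blood count; 12% atypical lymphocytes alone would then place that patient in the IM-Lab high-risk group.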
Discussion
To the best of our knowledge, this is the first study to develop and internally validate CPRs for the diagnosis of IM. We developed 2 risk scores: 1 that requires only clinical symptoms and signs (IM-NoLab) and another that requires haematologic parameters commonly available in most clinical settings (IM-Lab). The IM-Lab risk score was more accurate than the IM-NoLab score (AUROCC 0.934 for the IM-Lab score and 0.69 for the IM-NoLab score), although external validation using populations from other locations or with different age groups would be desirable.
The IM-NoLab risk score placed approximately half of patients in the low-risk group, while the IM-Lab risk score placed 74% in the low-risk group. The percentage with IM in the low-risk group was 3.6% for the IM-Lab score and 7.9% for the IM-NoLab risk score in the prospective temporal validation group. This prevalence of disease is well below our prespecified 10% test threshold. These patients are at low enough risk for IM that they would not require a heterophile antibody test at the initial visit unless they had a known exposure to someone with IM, or for another reason were felt to be at increased risk for IM or for its consequences (e.g. an athlete). If these low-risk patients had persistent symptoms, they could be retested at a follow-up visit.
Patients in the moderate- and high-risk groups have a high enough risk for IM that they should receive heterophile antibody testing at the initial visit. If the result is negative, then given their high baseline risk they could either be retested in 5–7 days or have a VCA-IgM test. Using this strategy would significantly reduce the need for heterophile antibody testing at the initial visit, saving money and improving efficiency in the clinic. Further study is needed to evaluate the impact of care guided by the IM-Lab and/or IM-NoLab risk scores on patient-oriented outcomes and cost.
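The testing strategy described above amounts to comparing the prevalence of IM in each risk group with the 10% test threshold. A minimal sketch follows, assuming that a prevalence at or above the threshold triggers testing (the handling of the exact boundary is our assumption) and using hypothetical function names. Prevalences are taken from the validation cohort figures reported in this article.

```python
TEST_THRESHOLD = 0.10  # prespecified test threshold stated in the text


def initial_visit_action(prevalence, increased_risk=False,
                         threshold=TEST_THRESHOLD):
    """Suggest an action at the initial visit for a given risk group.

    increased_risk covers a known exposure to someone with IM or another
    reason for concern (for example, an athlete).
    """
    if prevalence >= threshold or increased_risk:
        return "heterophile antibody test now"
    return "no test now; retest at follow-up if symptoms persist"


def after_negative_test():
    # Options given in the text for moderate- and high-risk patients.
    return "retest in 5 to 7 days or order a VCA-IgM test"


# Validation cohort prevalence of IM by score and risk group.
validation_prevalence = {
    ("IM-NoLab", "low"): 0.079,
    ("IM-NoLab", "high"): 101 / 455,
    ("IM-Lab", "low"): 22 / 611,
    ("IM-Lab", "moderate"): 11 / 88,
    ("IM-Lab", "high"): 92 / 105,
}

plans = {group: initial_visit_action(p)
         for group, p in validation_prevalence.items()}
for (score, group), plan in plans.items():
    print(f"{score} {group}: {plan}")
```

Only the two low-risk groups fall below the threshold, so in this sketch they are the only patients for whom initial testing is deferred.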
As noted earlier, the clinical rules could be used in series: first the IM-NoLab score, then the IM-Lab score. This would reduce the need for laboratory testing and improve clinical efficiency.
Strengths and limitations
Strengths of our study include the large sample size and a university setting that is a common site for the diagnosis of IM. Both risk scores had good to excellent discrimination and calibration, and both risk scores classified a large number of patients as low risk, with the potential to reduce cost and improve efficiency. The prospective temporal validation, albeit in the same health centre, adds strength over a random split-sample validation, especially in light of the observed changes in the prevalence of signs and symptoms between derivation and validation cohorts (Appendix Table 1).
A potential limitation is that the study population was limited to college students, with most being aged 18–23 years. On the other hand, this is the age group in which IM is most common. The study population was also limited to persons whose treating clinician had a high enough index of suspicion for IM to order a heterophile antibody test. While a broader population of all persons with sore throat might have been preferable, the risk of IM would be quite low in a population where clinicians experienced in the care of young adults felt it was unlikely. Finally, the symptoms were based on self-report via a portal for the EHR, and the signs were based on a physician checking a box for that sign in the EHR.
Conclusion
The IM-NoLab and IM-Lab risk scores provide potentially useful tools for clinicians to improve the diagnosis of IM. Both scores performed well in the prospective internal temporal validation and have the potential to reduce the need for diagnostic testing and improve diagnostic accuracy during the early phase of the disease, when the sensitivity of heterophile antibody tests is lower.
Funding
This study was not externally funded.
Ethical approval
The University of Georgia’s Institutional Review Board (IRB) approved this project on 12 September 2019 (#PROJECT00001026). It was deemed to be exempt research as the data were deidentified and had been previously collected and extracted retrospectively from the electronic health record system.
Conflict of interest
None declared.
Data availability
The data underlying this article will be shared on reasonable request to the corresponding author (ebell@uga.edu).