Abstract

Aims

This study aimed to develop and apply natural language processing (NLP) algorithms to identify recurrent atrial fibrillation (AF) episodes following rhythm control therapy initiation using electronic health records (EHRs).

Methods and results

We included adults with new-onset AF who initiated rhythm control therapies (ablation, cardioversion, or antiarrhythmic medication) within two US integrated healthcare delivery systems. A code-based algorithm identified potential AF recurrence using diagnosis and procedure codes. An automated NLP algorithm was developed and validated to capture AF recurrence from electrocardiograms, cardiac monitor reports, and clinical notes. Compared with the reference standard cases confirmed by physicians’ adjudication, the F-scores, sensitivity, and specificity were all above 0.90 for the NLP algorithms at both sites. We applied the NLP and code-based algorithms to patients with incident AF (n = 22 970) during the 12 months after initiating rhythm control therapy. Applying the NLP algorithms, the percentages of patients with AF recurrence for sites 1 and 2 were 60.7% and 69.9% (ablation), 64.5% and 73.7% (cardioversion), and 49.6% and 55.5% (antiarrhythmic medication), respectively. In comparison, the percentages of patients with code-identified AF recurrence for sites 1 and 2 were 20.2% and 23.7% for ablation, 25.6% and 28.4% for cardioversion, and 20.0% and 27.5% for antiarrhythmic medication, respectively.

Conclusion

When compared with a code-based approach alone, this study's high-performing automated NLP method identified significantly more patients with recurrent AF. The NLP algorithms could enable efficient evaluation of treatment effectiveness of AF therapies in large populations and help develop tailored interventions.

Development and application of NLP algorithms to identify AF recurrences in EHRs.
Graphical Abstract


Introduction

Atrial fibrillation (AF) is the most common clinically significant arrhythmia and is associated with excess morbidity and mortality.1–3 Standard AF rhythm control treatment options include antiarrhythmic medications and procedures such as cardioversion and catheter ablation.1–3 However, these therapies have shown varied success rates in randomized controlled trials, with rates of documented recurrent AF within 12 months remaining high.4,5 Furthermore, there is a need to efficiently compare the effectiveness of AF rhythm control treatment options in large real-world populations.6–8

Identifying recurrent AF episodes after starting rhythm control therapies using electronic health records (EHRs) has been challenging. For instance, most prospective cohort studies used protocol-driven electrocardiograms (ECGs) and Holter monitors to assess for recurrent AF at predefined intervals (e.g. at 3, 6, and 12 months).4,9,10 To obtain similar information from retrospective studies using real-world data would traditionally require manual chart reviews of a large number of cases, which is not usually feasible. Instead of manual chart reviews, other commonly used methods to identify recurrent AF cases using structured data from EHRs include using a combination of medication, diagnosis, and procedure codes; however, the sensitivity and specificity of such code-based methods are limited by the lack of specific codes for recurrent AF.11,12

Natural language processing (NLP), a subfield of artificial intelligence, has been increasingly used to ascertain information from unstructured data in the EHR without resource-intensive chart reviews. Recently, Shah et al. used NLP to identify patients with a history of AF by extracting clinical text data.13 The NLP algorithm showed limited performance, with an F-score of 0.8, a sensitivity of 0.97, and a specificity of 0.63. Moreover, the algorithm was limited to determining whether or not individuals had any evidence of AF.

The goal of the current study was to derive and validate an NLP algorithm to identify episodes of recurrent AF or atrial flutter within 12 months after the initiation of a rhythm control strategy. We developed and applied code-based, NLP-based, and combined (both code-based and NLP-based) algorithms and compared the frequency of AF recurrence using these different algorithms.

Methods

Setting

This retrospective cohort study was conducted at Kaiser Permanente Southern California (KPSC, site 1) and Northern California (KPNC, site 2), two integrated healthcare delivery systems that provide comprehensive care to >9 million racially, ethnically, and socioeconomically diverse members at their 36 hospitals and 493 medical offices throughout California. The prepaid health plans provide strong incentives for members to use services at Kaiser Permanente facilities. Kaiser Permanente HealthConnect®, a comprehensive EHR system based on Epic®, stores nearly all aspects of the care provided to members, including medical encounters, diagnoses, procedures, laboratory tests, pharmacy utilization, membership history, and billing claims. This study used structured data and free-text clinical notes from the EHR.

The KPNC and KPSC institutional review boards approved this study. Given the observational nature of the study, patient informed consent was waived.

Study population

We identified members aged 21 years or older with newly diagnosed AF between 1 January 2010 and 31 December 2017 who initiated rhythm control therapy between 1 January 2010 and 31 December 2018. AF diagnoses were based on International Classification of Diseases, Ninth and Tenth Revision (ICD-9 and ICD-10) codes (ICD-9 427.31, and ICD-10 I48.0, I48.1, I48.2, and I48.91 in any position of the primary diagnosis) from inpatient stays, emergency department encounters, and outpatient visits. To identify incident AF, we excluded patients with a diagnosis code of AF up to 5 years before enrollment into the study. This method was validated in both sites and was found to have a positive predictive value of 96%.14 The index date was the date of the first AF diagnosis.
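The incident AF rule described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the input layout (a list of date and code tuples per patient), the function name, and the approximation of the 5-year lookback as 5 × 365 days are all assumptions made for illustration. Only the diagnosis codes are taken from the text.

```python
from datetime import date, timedelta

# AF diagnosis codes listed in the text (ICD-9 427.31; ICD-10 I48.0, I48.1, I48.2, I48.91).
AF_CODES = {"427.31", "I48.0", "I48.1", "I48.2", "I48.91"}

# Assumption: the 5-year lookback is approximated as 5 * 365 days.
LOOKBACK = timedelta(days=5 * 365)


def incident_af_index_date(diagnoses, window_start, window_end):
    """Return the index date of incident AF, or None if the patient does not qualify.

    `diagnoses` is a hypothetical list of (diagnosis_date, code) tuples for one patient.
    """
    af_dates = sorted(d for d, code in diagnoses if code in AF_CODES)
    in_window = [d for d in af_dates if window_start <= d <= window_end]
    if not in_window:
        return None
    index_date = in_window[0]
    # Exclude patients with any AF code in the lookback period before the index date.
    if any(index_date - LOOKBACK <= d < index_date for d in af_dates):
        return None
    return index_date


window = (date(2010, 1, 1), date(2017, 12, 31))
new_case = [(date(2012, 3, 1), "I48.91")]
prior_case = [(date(2009, 6, 1), "427.31"), (date(2012, 3, 1), "I48.91")]
print(incident_af_index_date(new_case, *window))
print(incident_af_index_date(prior_case, *window))
```

In this sketch, the second patient is excluded because an AF code appears within the lookback period before the first in-window diagnosis.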

For the present analyses, we included patients receiving antiarrhythmic drugs (AADs) or rhythm control procedures after the first AF diagnosis. We identified patients who filled prescriptions for AADs (i.e. amiodarone, sotalol, disopyramide, flecainide, propafenone, dronedarone, dofetilide, quinidine, and ibutilide) or patients who underwent targeted AF procedures (i.e. catheter or surgical ablation, cardioversion, and pacemaker implantation + nodal ablation for rhythm control) using ICD-9/10, Current Procedural Terminology (CPT) codes, or health plan registry data. We selected patients who received an AF procedure (with/without AADs) or had AADs without an AF procedure. Patients were followed for 12 months following the first AF treatment (a procedure or medication dispensing) date. Detailed exclusion criteria are available in Supplementary material online, Table S1.

Demographic and clinical variables

We obtained demographic information including age, self-reported sex, self-reported race, and self-reported ethnicity. Socioeconomic status (educational attainment and annual household income) was obtained by linking the patient's residence and geocoded Census data. The most proximal body mass index, laboratory, and vital signs before the patient's presumed incident AF index date were collected from ambulatory encounter data. Baseline medical history was ascertained using 5 years of diagnosis and procedure data before each patient's index AF diagnosis date. Baseline medication use was ascertained using 12 months of pharmacy dispensing data before the index AF diagnosis date. We used procedure codes, diagnosis codes, and registry data to define code-based recurrent AF.

Free text data sources for NLP development and validation

We identified and extracted three types of free text data that were likely to contain AF recurrence information. First, 12-lead ECG reports were extracted from the GE Healthcare MUSE Cardiology Information System at site 1. In site 2, we searched ECG-related keywords in EHR data sources to retrieve presumptive ECG reports (see Supplementary material online, Text S1). Second, we collected cardiac monitoring reports from implanted and wearable cardiac rhythm monitors. These included reports from Holter, patch recorder, cardiac event monitor, pacemaker, loop recorder, implantable cardioverter defibrillator, and cardiac telemetry. These cardiac rhythm monitor reports were not consistently collected and stored in the EHR. In site 1, cardiac monitor reports were stored as procedure reports, or as telephone encounter notes. In site 2, cardiac monitor reports were stored with other general clinical notes (see Supplementary material online, Text S1). Lastly, we included clinical notes from inpatient and outpatient encounters based on the visiting department (e.g. cardiology, family practice), authorizing provider (e.g. physician, physician assistant), and note type (e.g. progress note, discharge summary) that were likely to include assessments of the patient's AF status (details are available in Supplementary material online, Table S2).

We defined AF recurrence as a newly detected AF event documented in an ECG or cardiac monitor report, or a current or recently occurred AF event as documented in the clinical note. In clinical notes, the occurrence of an AF event was based on the objective findings documented in the Physical Exam section (e.g. irregular rhythm) or based on the assessment of contextual information such as temporal references surrounding the AF documentation.

Overall natural language processing algorithm development process

NLP algorithms were first built using training data from site 1 (Figure 1). We developed separate algorithms for the three types of clinical documents (ECG reports, cardiac monitoring reports, and clinical notes). Each document was categorized by the NLP algorithms as having a positive or negative AF recurrence. After building the initial algorithms, the algorithms were tested using development data. We refined the algorithms until we reached prespecified performance levels for positive predictive value (PPV) and negative predictive value (NPV) >90%. Once algorithm performance targets were reached in site 1, the algorithms were tested using development data from site 2. The algorithms were further refined to meet the prespecified performance thresholds at both sites. The final algorithms were validated at both sites using the validation datasets.

Natural language processing algorithm development process.
Figure 1


Training, development, and validation datasets

For site 1, all the ECGs, cardiac monitoring reports, and clinical notes during the 12 months following the first AF treatment date were included as the training, development, and validation datasets. We first created validation datasets (n = 100) at each site using random stratified sampling, dividing them equally between those who started antiarrhythmic drugs without an AF procedure and those who underwent an AF procedure (Figure 1). From the remaining cases, we created the development datasets (n = 1000) at both sites, which were reserved to test and refine the algorithms. The remaining cases from site 1 (n = 9368) served as the training dataset to build the initial NLP algorithms at site 1.
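The partitioning described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the patient tuples, the stratum labels `aad_only` and `procedure`, the fixed random seed, and the use of a simple random sample for the development set are all hypothetical choices.

```python
import random


def split_datasets(patients, n_validation=100, n_development=1000, seed=0):
    """Split patients into validation, development, and training sets.

    `patients` is a hypothetical list of unique (patient_id, stratum) tuples,
    where stratum is "aad_only" or "procedure".
    """
    rng = random.Random(seed)
    per_stratum = n_validation // 2
    validation = []
    # Stratified sampling: draw the same number of patients from each stratum.
    for stratum in ("aad_only", "procedure"):
        members = [p for p in patients if p[1] == stratum]
        validation.extend(rng.sample(members, per_stratum))
    chosen = set(validation)
    remaining = [p for p in patients if p not in chosen]
    # Assumption: the development set is a simple random sample of the remainder.
    development = rng.sample(remaining, n_development)
    dev_set = set(development)
    training = [p for p in remaining if p not in dev_set]
    return validation, development, training


# Synthetic cohort with the site 1 size reported in Table 1 (n = 10 468).
cohort = [(i, "aad_only" if i % 2 else "procedure") for i in range(10468)]
validation, development, training = split_datasets(cohort)
print(len(validation), len(development), len(training))
```

With a cohort of 10 468 patients, removing 100 validation and 1000 development cases leaves 9368 for training, which matches the site 1 training dataset size given above.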

NLP algorithm development

We developed the initial NLP algorithms at site 1.15–20 The steps for pre-processing text and developing terminology are described in the Supplementary material online, Text S2. We created the rule-based NLP algorithms with the Linguamatics I2E software (Linguamatics, an IQVIA company, Cambridge, United Kingdom), which allows for the rapid extraction of unstructured data via knowledge integration and parallel indexing and searching via rules-based search algorithms. The NLP algorithms were developed to search each indexed note at different levels: section (e.g. ‘Physical Exam’, ‘Assessment/Plan’), intra-sentence, and cross-sentence. A distance-based relationship algorithm was applied to identify related terms based on the number of words or sentences between them. The relationship search identified the words or phrases (e.g. negated, uncertain, and hypothetical statements) that modify the ascertainment of the concepts of interest (e.g. AF, irregular heart rhythm). These terms were based on the ‘Context Terms’ ontology provided by the I2E software. Examples of the relationship search are provided in the Supplementary material online, Text S2. The temporal relationship algorithm identified past AF recurrence events (e.g. ‘3 months ago’, ‘history of’, and ‘after 12/19/14 admission for atrial fibrillation’). During the algorithm development process, we tested and updated the algorithms based on medical record adjudication performed by four physicians on our study team.
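To make the idea of distance-based context rules concrete, the following Python sketch uses regular expressions to flag negated and historical AF mentions. It does not use or reproduce the I2E software or its ‘Context Terms’ ontology; the cue lists, the five-word window, and the function names are hypothetical and far simpler than a production rule set.

```python
import re

# Hypothetical, deliberately small term lists for illustration only.
AF_PATTERN = re.compile(
    r"\b(?:atrial fibrillation|atrial flutter|a-?fib|af)\b", re.IGNORECASE
)
NEGATION_CUES = re.compile(r"\b(?:no|not|denies|without|negative for)\b", re.IGNORECASE)
HISTORY_CUES = re.compile(r"\b(?:history of|h/o|ago|prior|previous)\b", re.IGNORECASE)

WINDOW_WORDS = 5  # assumed distance threshold, in words, before the AF mention


def classify_sentence(sentence):
    """Label the first AF mention in a sentence as current, negated, or historical."""
    match = AF_PATTERN.search(sentence)
    if match is None:
        return "no_mention"
    # Distance-based rule: look for negation cues in the words just before the mention.
    preceding_words = sentence[: match.start()].split()[-WINDOW_WORDS:]
    if NEGATION_CUES.search(" ".join(preceding_words)):
        return "negated"
    # Temporal rule: a historical cue anywhere in the sentence marks a past event.
    if HISTORY_CUES.search(sentence):
        return "historical"
    return "current"


def document_has_current_af(text):
    """Return True if any sentence contains a current, non-negated AF mention."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return any(classify_sentence(s) == "current" for s in sentences)


note = "History of AF 3 months ago. No evidence of atrial fibrillation today."
print(document_has_current_af(note))
print(classify_sentence("ECG shows atrial fibrillation with rapid ventricular response."))
```

A real rule set would also need section-level rules, cross-sentence relationships, and handling of uncertain or hypothetical statements, none of which this sketch attempts.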

NLP algorithm validation

The same physicians who manually adjudicated the initial NLP results during the algorithm development also independently reviewed the free-text data in the validation datasets. Two cardiologists (M.S.L., C.C.) independently reviewed 451 reports from site 1 to determine true recurrent AF cases. The results of the two cardiologists’ reviews were compared, and conflicts were resolved through discussion and consensus. At site 2, two internists (B.L., N.B.) reviewed 826 reports using the same process as at site 1. All adjudicators were blinded to the NLP results. The adjudicated results served as the reference standard against which the NLP algorithms were compared. We evaluated the performance of the final NLP algorithms against the reference standard cases (n = 100 for each site) created by physician reviewers.

Definition of recurrent AF for the patient-level analysis

We applied the final NLP-based algorithms along with code-based algorithms to the entire cohort of patients with incident AF initiating rhythm control therapy and identified definite and probable AF recurrence during the subsequent 12 months. In the code-based algorithm, a definite AF recurrence was defined by CPT codes or health plan registry data for ablation or cardioversion, which was considered a repeat procedure after the index procedure. A probable AF recurrence in the code-based algorithm was defined by hospitalization with a primary diagnosis of AF or receipt of pacemaker implantation in combination with nodal ablation. In the NLP-based algorithm, a definite AF recurrence was defined as an AF event identified from ECG or cardiac monitor reports. A probable AF recurrence was defined as an AF event identified only from physician clinical notes.
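The patient-level definitions above map naturally onto a small set of rules, sketched here in Python. The source and event labels are hypothetical names chosen for illustration, and the way the two algorithms are combined (taking the stronger of the two classifications) is an assumption, since the text does not spell out the combination rule.

```python
# Ordering used to compare classifications; "definite" outranks "probable".
RANK = {"none": 0, "probable": 1, "definite": 2}


def nlp_recurrence_status(positive_sources):
    """Classify NLP-based recurrence from the set of document types with an AF event.

    `positive_sources` uses hypothetical labels: "ecg", "cardiac_monitor", "clinical_note".
    """
    if positive_sources & {"ecg", "cardiac_monitor"}:
        return "definite"
    if "clinical_note" in positive_sources:
        return "probable"
    return "none"


def code_recurrence_status(events):
    """Classify code-based recurrence from the set of coded events after the index treatment."""
    if events & {"repeat_ablation", "repeat_cardioversion"}:
        return "definite"
    if events & {"af_hospitalization", "pacemaker_with_nodal_ablation"}:
        return "probable"
    return "none"


def combined_status(nlp_status, code_status):
    """Assumption: the combined algorithm takes the stronger of the two classifications."""
    return max(nlp_status, code_status, key=RANK.get)


nlp = nlp_recurrence_status({"clinical_note"})
code = code_recurrence_status({"repeat_cardioversion"})
print(nlp, code, combined_status(nlp, code))
```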

Statistical analysis

We used descriptive statistics for patient demographic and clinical characteristics using means (SD) for continuous variables, and frequencies (percentages) for categorical variables.

To calculate the NLP algorithm performance measures, we first determined the counts of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) identified by each NLP algorithm per type of clinical document and site. Then, we calculated the sensitivity, specificity, PPV, NPV, and F-score21 (also called the F1 score when β is 1, which is the harmonic mean of sensitivity and PPV) and 95% confidence intervals. The F-score ranges from 0 to 1, with higher values indicating better predictions and a value of 1 indicating perfect classification.
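For readers who want to reproduce the point estimates, the measures can be computed from the four counts as in the Python sketch below. The function name and dictionary layout are illustrative choices, not part of the study's SAS analysis; confidence intervals are omitted because the interval method is not stated here, and zero denominators are not handled. The counts in the usage example are those reported for ECG reports at site 2 in Table 2.

```python
def performance_measures(tp, fp, fn, tn):
    """Compute document-level accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # F1 score: harmonic mean of sensitivity (recall) and PPV (precision).
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score,
    }


# ECG reports at site 2 (Table 2): TP = 92, FP = 1, FN = 8, TN = 293.
measures = performance_measures(tp=92, fp=1, fn=8, tn=293)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these counts give a sensitivity of 0.920, specificity of 0.997, PPV of 0.989, NPV of 0.973, and F-score of 0.953, in agreement with the values reported in Table 2.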

We reported the numbers (percentages) of patients with visit reports and AF recurrence by applying code-based, NLP-based, and combined (both code-based and NLP-based) algorithms for each rhythm control treatment type (ablation, antiarrhythmic medication, cardioversion, and pacemaker) and per site. The primary analysis evaluated AF recurrence during the first 12-month post-treatment period. In a sensitivity analysis, we evaluated AF recurrence during the first 3 months, and months 4–12.22 SAS version 9.4 (SAS Institute, Cary, NC, USA) was used for data analysis.

Results

Study population

We identified 22 970 eligible patients with incident AF. The mean ± SD age of the cohort was 70.9 ± 11.0 years, 42.2% were women, 80.4% were White, 6.1% were Black, 8.8% were Asian or Pacific Islander, and 11.9% were of Hispanic ethnicity (Table 1). A sizable proportion of patients had comorbidities of dyslipidemia (80.7%), hypertension (78.2%), diabetes mellitus (33.5%), chronic heart failure (21.8%), and ischemic stroke or transient ischemic attack (5.1%). Both sites had similar patient demographic and clinical characteristics (see Table 1 and Supplementary material online, Table S3).

Table 1

Characteristics of patients with incident atrial fibrillation who received rhythm control therapy

| Characteristics | Overall (n = 22 970) | Site 1 (n = 10 468) | Site 2 (n = 12 502) |
|---|---|---|---|
| Mean (SD) age, years | 70.9 (11.0) | 70.8 (11.2) | 71.0 (10.8) |
| Age group, years, n (%) | | | |
|   <65 | 6011 (26.2) | 2788 (26.6) | 3223 (25.8) |
|   65–74 | 7924 (34.5) | 3577 (34.2) | 4347 (34.8) |
|   75–84 | 6736 (29.3) | 3036 (29.0) | 3700 (29.6) |
|   ≥85 | 2299 (10.0) | 1067 (10.2) | 1232 (9.9) |
| Women, n (%) | 9702 (42.2) | 4352 (41.6) | 5350 (42.8) |
| Race, n (%) | | | |
|   White | 18 477 (80.4) | 8360 (79.9) | 10 117 (80.9) |
|   Black | 1392 (6.1) | 786 (7.5) | 606 (4.8) |
|   Asian or Pacific Islander | 2030 (8.8) | 773 (7.4) | 1257 (10.1) |
|   Other | 204 (0.9) | 124 (1.2) | 80 (0.6) |
|   Unknown | 867 (3.8) | 425 (4.1) | 442 (3.5) |
| Hispanic ethnicity, n (%) | 2724 (11.9) | 1593 (15.2) | 1131 (9.0) |
| Low educational attainment, n (%) | 3535 (15.4) | 2062 (19.7) | 1473 (11.8) |
| Annual household income <$50 000, n (%) | 5180 (22.6) | 2740 (26.2) | 2440 (19.5) |
| Index year | | | |
|   2010–11 | 5364 (23.4) | 2394 (22.9) | 2970 (23.8) |
|   2012–13 | 5963 (26.0) | 2685 (25.6) | 3278 (26.2) |
|   2014–15 | 5875 (25.6) | 2681 (25.6) | 3194 (25.5) |
|   2016–17 | 5768 (25.1) | 2708 (25.9) | 3060 (24.5) |
| Body mass index, kg/m2 | | | |
|   <18.5 | 273 (1.2) | 128 (1.2) | 145 (1.2) |
|   18.5–25.0 | 5353 (23.3) | 2389 (22.8) | 2964 (23.7) |
|   25.0–29.9 | 7879 (34.3) | 3621 (34.6) | 4258 (34.1) |
|   30.0–39.9 | 7581 (33.0) | 3487 (33.3) | 4094 (32.7) |
|   ≥40 | 1835 (8.0) | 828 (7.9) | 1007 (8.1) |
|   Unknown | 49 (0.2) | 15 (0.1) | 34 (0.3) |
| Medical history, n (%) | | | |
|   Dyslipidemia | 18 528 (80.7) | 8711 (83.2) | 9817 (78.5) |
|   Hypertension | 17 961 (78.2) | 8260 (78.9) | 9701 (77.6) |
|   Current or former smoker | 12 081 (52.6) | 5468 (52.2) | 6613 (52.9) |
|   Chronic lung disease | 8097 (35.3) | 3827 (36.6) | 4270 (34.2) |
|   Diabetes mellitus | 7689 (33.5) | 3720 (35.5) | 3969 (31.7) |
|   Chronic heart failure | 5004 (21.8) | 2504 (23.9) | 2500 (20.0) |
|   Diagnosed depression | 4265 (18.6) | 2263 (21.6) | 2002 (16.0) |
|   Mitral and/or aortic valvular disease | 4017 (17.5) | 2063 (19.7) | 1954 (15.6) |
|   Cancer | 3850 (16.8) | 1785 (17.1) | 2065 (16.5) |
|   Percutaneous coronary intervention | 3122 (13.6) | 595 (5.7) | 2527 (20.2) |
|   Obstructive sleep apnoea | 2763 (12.0) | 1307 (12.5) | 1456 (11.6) |
|   Peripheral artery disease | 1873 (8.2) | 609 (5.8) | 1264 (10.1) |
|   Acute myocardial infarction | 1700 (7.4) | 885 (8.5) | 815 (6.5) |
|   Chronic liver disease | 1202 (5.2) | 689 (6.6) | 513 (4.1) |
|   Ischaemic stroke or transient ischemic attack | 1174 (5.1) | 586 (5.6) | 588 (4.7) |
|     Ischemic stroke | 640 (2.8) | 366 (3.5) | 274 (2.2) |
|     Transient ischemic attack | 676 (2.9) | 320 (3.1) | 356 (2.8) |
|   Coronary artery bypass surgery | 1046 (4.6) | 457 (4.4) | 589 (4.7) |
|   Venous thromboembolism | 956 (4.2) | 511 (4.9) | 445 (3.6) |
|   Diagnosed dementia | 630 (2.7) | 305 (2.9) | 325 (2.6) |
| Baseline medication use, n (%) | | | |
|   β-Blocker | 12 521 (54.5) | 5458 (52.1) | 7063 (56.5) |
|   Calcium channel blocker | 6135 (26.7) | 2725 (26.0) | 3410 (27.3) |
|   Digoxin | 654 (2.8) | 317 (3.0) | 337 (2.7) |
|   Diuretic | 10 483 (45.6) | 4668 (44.6) | 5815 (46.5) |
|   Angiotensin-converting enzyme inhibitor | 9197 (40.0) | 4290 (41.0) | 4907 (39.2) |
|   Angiotensin II receptor blocker | 4349 (18.9) | 1908 (18.2) | 2441 (19.5) |
|   Aldosterone receptor antagonist | 852 (3.7) | 452 (4.3) | 400 (3.2) |
|   Statin | 13 983 (60.9) | 6465 (61.8) | 7518 (60.1) |
|   Non-statin lipid-lowering drug | 1169 (5.1) | 574 (5.5) | 595 (4.8) |
|   Warfarin | 2655 (11.6) | 1003 (9.6) | 1652 (13.2) |
|   Antiplatelet agent | 2103 (9.2) | 1036 (9.9) | 1067 (8.5) |
|   Direct oral anticoagulant | 244 (1.1) | 69 (0.7) | 175 (1.4) |

Low education attainment: education under the high school level is >25%.

Antiarrhythmic agents include amiodarone, disopyramide, dofetilide, dronedarone, flecainide, propafenone, and sotalol.

Antiplatelet agents include aspirin, clopidogrel, prasugrel, and ticagrelor.

β-Blockers include atenolol, bisoprolol, carvedilol, metoprolol tartrate, metoprolol succinate, nadolol, and propranolol.

Calcium channel blockers include diltiazem and verapamil.


NLP algorithm validation

Compared with the physician-adjudicated reference cases of recurrent AF, F-scores were 0.995 and 0.953 (sites 1 and 2) for the NLP algorithms applied to ECG reports, 0.928 and 0.870 for the algorithms applied to cardiac monitoring reports, and 0.928 and 0.911 for the algorithms applied to clinical notes (Table 2). The sensitivity, specificity, PPVs, and NPVs were all above 0.90 for the NLP algorithms across the three data sources at both sites, with one exception: the algorithm applied to the cardiac monitoring reports at site 2 had a sensitivity of 0.833. The specificity was higher than the sensitivity on most of the validation datasets. The algorithm was found to consistently perform well in both female and male groups.

Table 2

Natural language processing algorithm for recurrent AF validation results

| Data source | Site | n* | Positive rate (%) | TP | FP | FN | TN | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) | F-score (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ECG | Site 1 | 141 | 78.0% | 110 | 1 | 0 | 30 | 1 (0.967–1) | 0.968 (0.833–0.999) | 0.991 (0.941–0.999) | 1 (NA) | 0.995 (0.986–1) |
| ECG | Site 2 | 394 | 25.4% | 92 | 1 | 8 | 293 | 0.920 (0.848–0.965) | 0.997 (0.981–1) | 0.989 (0.929–0.999) | 0.973 (0.950–0.986) | 0.953 (0.920–0.980) |
| Cardiac monitor | Site 1 | 92 | 37.0% | 32 | 3 | 2 | 55 | 0.941 (0.803–0.993) | 0.948 (0.856–0.989) | 0.914 (0.779–0.970) | 0.965 (0.877–0.991) | 0.928 (0.862–0.984) |
| Cardiac monitor | Site 2 | 228 | 10.5% | 20 | 2 | 4 | 202 | 0.833 (0.626–0.953) | 0.990 (0.965–0.999) | 0.909 (0.713–0.976) | 0.981 (0.954–0.992) | 0.870 (0.743–0.961) |
| Clinical note | Site 1 | 218 | 28.9% | 58 | 4 | 5 | 151 | 0.921 (0.824–0.974) | 0.974 (0.935–0.993) | 0.936 (0.846–0.975) | 0.968 (0.929–0.986) | 0.928 (0.877–0.972) |
| Clinical note | Site 2 | 204 | 22.1% | 41 | 4 | 4 | 155 | 0.911 (0.788–0.975) | 0.975 (0.937–0.993) | 0.911 (0.795–0.964) | 0.975 (0.938–0.99) | 0.911 (0.841–0.965) |

CI = confidence interval; FN = false negative; FP = false positive; TN = true negative; TP = true positive; NPV = negative predictive value; PPV = positive predictive value; and ECG = Electrocardiogram.

Note: The validation dataset has 100 patients for each site. Each patient might have more than one document in each of the three data sources.

* n = number of documents in the validation dataset.
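The accuracy measurements in Table 2 follow from the four confusion-matrix counts in each row. As a minimal sketch, assuming the F-score is the balanced F1 (the harmonic mean of PPV and sensitivity), the point estimates for the site 2 ECG row can be reproduced as follows. The class and function names are illustrative and are not taken from the study's code; confidence intervals are not computed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Document-level counts against the physician-adjudicated reference."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy_measures(c: ConfusionCounts) -> dict[str, float]:
    """Return point estimates only; zero denominators are not handled."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # Assumption: balanced F1, the harmonic mean of PPV and sensitivity.
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score,
    }


# Counts from the ECG, site 2 row of Table 2.
ecg_site2 = ConfusionCounts(tp=92, fp=1, fn=8, tn=293)
measures = accuracy_measures(ecg_site2)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these values agree with the point estimates reported for that row.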

Identification of recurrent AF

The final cohort included 987 patients who underwent ablation, 19 841 patients treated with AADs, 7828 patients who underwent cardioversion, and 76 patients with combined pacemaker implantation and nodal ablation. The percentages of patients who had at least one clinical document potentially addressing recurrent AF during the first 12 months are shown in Table 3 for each type of clinical data source (ECG, cardiac monitor, and clinical notes). Over 90% had at least one eligible clinical data source during the first 12 months post-treatment regardless of the type of rhythm control therapy. Clinical documents were more prevalent in patients undergoing cardiac procedures than in those taking AADs. The variation was even greater for ECG or cardiac monitor reports. For example, although 94.5% and 98.9% of patients with ablation had ECG or cardiac monitor reports, only 79.2% and 79.3% of AAD patients had ECG or cardiac monitor reports. In addition, not all selected clinical free-text data commented specifically on the presence or absence of AF (see Supplementary material online, Table S4).

Table 3

Number and percentage of patients with specific visit reports/notes during the first 12 months after initiation of rhythm control therapy

| Therapy | Site | ECG, n (%) | Cardiac monitor, n (%) | ECG or cardiac monitor, n (%) | Clinical note, n (%) | Total, n (%) |
|---|---|---|---|---|---|---|
| Ablation | site 1 (n = 366) | 307 (83.9%) | 274 (74.9%) | 346 (94.5%) | 365 (99.7%) | 365 (99.7%) |
| Ablation | site 2 (n = 621) | 579 (93.2%) | 543 (87.4%) | 614 (98.9%) | 621 (100%) | 621 (100%) |
| Antiarrhythmic medication | site 1 (n = 8979) | 6898 (76.8%) | 2487 (27.7%) | 7111 (79.2%) | 8343 (92.9%) | 8360 (93.1%) |
| Antiarrhythmic medication | site 2 (n = 10 862) | 8244 (75.9%) | 2826 (26.0%) | 8611 (79.3%) | 9814 (90.4%) | 9921 (91.03%) |
| Cardioversion | site 1 (n = 3264) | 2761 (84.6%) | 995 (30.5%) | 2795 (85.6%) | 3191 (97.8%) | 3193 (97.8%) |
| Cardioversion | site 2 (n = 4564) | 4098 (89.8%) | 1298 (28.4%) | 4178 (91.5%) | 4465 (97.8%) | 4516 (98.9%) |
| Pacemaker combined with nodal ablation | site 1 (n = 63) | 44 (69.8%) | 35 (55.6%) | 49 (77.8%) | 62 (98.4%) | 62 (98.4%) |
| Pacemaker combined with nodal ablation | site 2 (n = 13) | 8 (61.5%) | 1 (7.7%) | 8 (61.5%) | 13 (100%) | 13 (100%) |

ECG = electrocardiogram.

The numbers and percentages of patients who experienced AF recurrence in the first 12 months after AF therapies are shown in Table 4 (see Supplementary material online, Table S5). For sites 1 and 2, the percentages of definite or probable AF recurrence identified by the NLP-based algorithms during the first 12 months were 60.7 and 69.9% (ablation), 64.5 and 73.7% (cardioversion), and 49.6 and 55.5% (AAD), respectively. The numbers of patients with AF recurrence found by NLP were two to three times higher than those found by codes (Table 4). The percentages of patients with code-identified AF recurrence for sites 1 and 2 were 20.2 and 23.7% for ablation, 25.6 and 28.4% for cardioversion, and 20.0 and 27.5% for antiarrhythmic medication, respectively. The great majority of recurrent cases found by either the code-based or the NLP algorithm were identified by the NLP algorithm. The findings were consistent when we examined recurrence in months 1–3 and months 4–12 (see Supplementary material online, Tables S5–10). The percentages of AF recurrence were comparable between the two study sites (see Figure 2 and Supplementary material online, Figure S1).

Figure 2

Percentage of patients with definite AF recurrence (A) and definite or probable AF recurrence (B) during the first 12 months after initiation of rhythm control therapy.

Table 4

Atrial fibrillation (AF) recurrence during the first 12 months after initiation of rhythm control therapy (comparing NLP-based algorithm with code-based algorithm)

| Therapy | Site | NLP* definite: ECG | NLP* definite: cardiac monitor | NLP* definite: total | NLP* probable: clinical note | NLP* total | Code definite | Code probable | Code total |
|---|---|---|---|---|---|---|---|---|---|
| Ablation | site 1 (n = 366) | 98 (26.8%) | 131 (35.8%) | 172 (47.0%) | 160 (43.7%) | 222 (60.7%) | 49 (13.1%) | 54 (14.8%) | 74 (20.2%) |
| Ablation | site 2 (n = 621) | 160 (25.8%) | 261 (42.0%) | 339 (54.6%) | 311 (50.1%) | 434 (69.9%) | 108 (17.4%) | 60 (9.7%) | 147 (23.7%) |
| Antiarrhythmic medication | site 1 (n = 8979) | 2337 (26.0%) | 943 (10.5%) | 2770 (30.8%) | 3967 (44.2%) | 4457 (49.6%) | 1051 (11.7%) | 1153 (12.8%) | 1800 (20.0%) |
| Antiarrhythmic medication | site 2 (n = 10 862) | 3503 (32.3%) | 1004 (9.2%) | 4025 (37.1%) | 4916 (45.3%) | 6024 (55.5%) | 2017 (18.6%) | 1258 (11.6%) | 2987 (27.5%) |
| Cardioversion | site 1 (n = 3264) | 1320 (40.4%) | 504 (15.4%) | 1481 (45.4%) | 1902 (58.3%) | 2104 (64.5%) | 571 (17.5%) | 527 (16.1%) | 835 (25.6%) |
| Cardioversion | site 2 (n = 4564) | 2622 (57.4%) | 542 (11.9%) | 2777 (60.8%) | 2606 (57.1%) | 3365 (73.7%) | 988 (21.6%) | 458 (10.0%) | 1296 (28.4%) |
| Pacemaker combined with nodal ablation | site 1 (n = 63) | 8 (12.7%) | 11 (17.5%) | 15 (23.8%) | 33 (52.4%) | 36 (57.1%) | 2 (3.2%) | 8 (12.7%) | 10 (15.9%) |
| Pacemaker combined with nodal ablation | site 2 (n = 13) | 1 (7.7%) | 0 | 1 (7.7%) | 5 (38.5%) | 5 (38.5%) | 0 | 10 (76.9%) | 10 (76.9%) |

ECG = electrocardiogram; NLP = natural language processing.

Note: Patients could receive other types of rhythm control therapies during the 12-month follow-up. The same patient could have overlaps in different therapy groups.

* In the NLP algorithm, definite AF recurrence was defined as an AF event identified from ECG or cardiac monitor reports; probable AF recurrence was defined as an AF event identified from physician clinical notes.

In the code-based algorithm, definite AF recurrence was identified by CPT codes for catheter ablation, atrioventricular (AV) nodal ablation, or cardioversion; probable AF recurrence was identified by hospitalization with a primary diagnosis of AF or pacemaker implantation.
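The NLP definitions in the footnotes above can be expressed as a small patient-level rule. The sketch below is illustrative only: the `Source` enum and the `recurrence_flags` function are hypothetical names, and the input is assumed to be the set of data sources in which the NLP algorithm flagged an AF event for one patient during follow-up. Because the definite and probable columns of Table 4 sum to more than the NLP total, a patient is assumed to be able to carry both flags, with the total counting each patient once.

```python
from enum import Enum
from typing import Iterable


class Source(Enum):
    ECG = "ecg"
    CARDIAC_MONITOR = "cardiac_monitor"
    CLINICAL_NOTE = "clinical_note"


# Sources that define "definite" recurrence in the NLP algorithm.
DEFINITE_SOURCES = {Source.ECG, Source.CARDIAC_MONITOR}


def recurrence_flags(positive_sources: Iterable[Source]) -> dict[str, bool]:
    """Map the sources with an NLP-detected AF event to patient-level flags."""
    found = set(positive_sources)
    definite = bool(found & DEFINITE_SOURCES)    # ECG or cardiac monitor report
    probable = Source.CLINICAL_NOTE in found     # physician clinical note
    return {
        "definite": definite,
        "probable": probable,
        "definite_or_probable": definite or probable,
    }


# A patient with AF documented only in a clinical note.
note_only = recurrence_flags([Source.CLINICAL_NOTE])
# A patient with AF documented on an ECG and in a clinical note.
ecg_and_note = recurrence_flags([Source.ECG, Source.CLINICAL_NOTE])
# A patient with no NLP-detected AF event.
no_event = recurrence_flags([])
```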

Discussion

We developed and validated NLP algorithms to identify valid recurrent AF episodes using various clinical data sources from EHRs. Compared with the physician-adjudicated reference standard, the NLP algorithms showed high accuracy, and consistently performed well at two large integrated healthcare delivery systems across different clinical data sources. Compared with code-based algorithms commonly used in retrospective observational studies, the NLP algorithms identified far more recurrent AF cases. These results not only verify the NLP method but also suggest that code-based algorithms may significantly underestimate rates of recurrent AF.

Previous studies identified AF recurrence using ECG and cardiac monitors, which were used in this study to detect NLP-definite AF recurrence. The percentages of patients with NLP-definite AF recurrences were comparable to other studies. The proportions of ablation patients with 1-year NLP-definite AF recurrence (47.0% site 1, 54.6% site 2) were similar to the percentage (45.9%) reported from a prospective multicenter real-world study (n = 3679) based on the German national ablation registry.5 Patients in that study were evaluated every three months at each ablation clinic, which included a personal phone interview, ECG, and 24-hour Holter monitor. The percentages of patients with NLP-definite AF recurrence during the first three months following ablation (27.3% site 1, 17.6% site 2) in our study were also consistent with the percentage (24.1%) reported in a study of 3120 Korean patients who had their first ablation.23 Patients in that cohort study had ECG during their 2 weeks’ outpatient visits following discharge and were then recommended to use portable cardiac monitors if they developed symptoms suggestive of recurrence. Following the 3-month blanking period, the percentages of ablation patients with NLP-definite AF recurrence during the first year (33.3% site 1, 48.6% site 2) were similar to the percentage (36.4%) reported in the CABANA study (n = 611).24 In that multicenter randomized controlled trial, AF recurrences were captured using symptom-activated recording and a monthly 24-hour cardiac monitor.

In real-world settings, many patients do not receive repeated ECG or cardiac monitoring as frequently as patients enrolled in studies that use protocol-driven cardiac monitoring at specific follow-up times.5,10,23,24 Our study makes a novel contribution by detecting AF recurrence using EHR-based outpatient and inpatient clinical notes, which could improve the identification of AF recurrence in real-world clinical settings or pragmatic studies. For example, after incorporating clinical notes, the NLP-detected recurrence rates for the ablation group between months 4 and 12 increased from 33.3 and 48.6% (definite AF) to 45.1 and 58.6% (definite or probable AF) for sites 1 and 2, respectively. The differences were more notable for the other treatment groups. The variation across therapy types in the number of additional probable AF recurrences could be attributed to patient characteristics and care practices. For instance, patients who underwent AF procedures are more likely to have more follow-up clinic visits and receive greater surveillance monitoring than patients who were only treated with AADs.

Even though the two sites shared many similarities, we discovered variations in how the EHR systems collected and stored data. The growing availability of cardiac monitoring technologies allows more sensitive detection of AF recurrence. However, various types of monitoring devices present a barrier to obtaining their results from complex EHR systems. Care providers, for example, could obtain results from many different sources, such as telemetry, or be notified by monitor device manufacturers. These results were routed and recorded in various locations in the EHR systems depending on data sources, institutions, hospitals, departments, and providers. With the growing adoption of new consumer-based wearable devices with irregular rhythm notification capabilities, integrating wearable data into EHRs requires careful planning to minimize data silos, ensure clinical accuracy, and enhance meaningful use.25

The evolution of EHRs has also made it easier to create lengthy and bloated notes.26–28 Cardiology is one of the specialties with the longest notes and the greatest note redundancy (73%, measured as the amount of text identical to the patient's last note), with 37 and 40% of contents being copied or created from templates, respectively.28 In this study, we observed that when managing AF patients, cardiologists frequently copied and pasted historical AF visit information, which is likely used to support recall and clinical reasoning for disease management. These note texts may lack the detailed temporal information necessary for an NLP algorithm to exclude previous AF episodes. Previous NLP efforts to identify possibly copied note text used sequence alignment to find identical sequences of text across notes.28,29 In this study, we used methods such as excluding sections and statements that may have been copied forward as historical information. The most effective approach for detecting copied text may be for EHRs to employ markup tags that explicitly annotate copied content, which can be evaluated in future studies.
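To illustrate the sequence-alignment idea cited above, the following sketch estimates how much of a note is identical to the patient's previous note. It is not the method used in this study, which relied on excluding sections and statements. The function name `copied_fraction` and the `min_words` threshold are assumptions made for the example, and Python's standard `difflib.SequenceMatcher` stands in for a dedicated alignment tool.

```python
from difflib import SequenceMatcher


def copied_fraction(previous_note: str, current_note: str, min_words: int = 5) -> float:
    """Share of words in current_note lying in runs shared with previous_note.

    Only matching runs of at least `min_words` consecutive words are counted,
    so that short coincidental overlaps are ignored.
    """
    prev_tokens = previous_note.split()
    curr_tokens = current_note.split()
    if not curr_tokens:
        return 0.0
    # autojunk=False disables the heuristic that ignores very frequent tokens.
    matcher = SequenceMatcher(a=prev_tokens, b=curr_tokens, autojunk=False)
    copied_words = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_words
    )
    return copied_words / len(curr_tokens)


# Invented example text, not taken from any patient record.
previous = "History of paroxysmal atrial fibrillation status post cardioversion in 2019"
current = previous + " Today ECG shows sinus rhythm"
print(f"{copied_fraction(previous, current):.2f}")
```

A high fraction would suggest that AF mentions in the matching runs are carried forward from an earlier visit and may not describe a new episode.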

One of the primary goals of AF rhythm control therapy is to reduce AF recurrence and any associated symptoms. While identifying AF recurrence is clinically important, frequent ECG monitoring employed in large prospective research studies is not usually feasible due to cost and logistical reasons. Furthermore, reliance on administrative diagnosis code-based identification using EHRs may significantly underestimate or overestimate AF recurrence. Moreover, due to the time and resources required, population-based retrospective studies to evaluate AF recurrence by manual chart review would be extremely difficult to conduct.

Using NLP algorithms, we could gain a more complete understanding of AF recurrence, facilitating clinical research and improving system-level patient care strategies. In research studies, our NLP algorithms can aid in assessing the efficacy of various rhythm control therapies, identifying risk factors for AF recurrence, evaluating long-term outcomes, and directing future research.3,8 The NLP programme, for example, could identify the timing and frequency of AF occurrences before and after AF treatments, which can evaluate effectiveness within and between AF treatments. Automatic detection of AF recurrence using NLP could benefit not only retrospective studies but also prospective studies and clinical trials including patient recruitment.30 The NLP algorithms could also help clinicians monitor disease progression, optimize patient management, and predict future outcomes in clinical practice.9 For instance, early AF recurrence was found to be a predictor of later AF recurrence after ablation,23 which can lead to future treatment strategies to prevent early AF recurrence and improve long-term treatment success.

Study strengths and limitations

This study was performed within two large, well-characterized, diverse community-based populations within two large integrated healthcare delivery systems with comprehensive EHRs. The included health plans provide strong incentives for members to use services at our owned facilities, so clinical documentation is likely to be more comprehensive than in EHRs representing patients whose care is more fragmented. We developed separate NLP algorithms to identify AF recurrence from multiple free-text data sources within EHRs, including clinical notes, which provide a wealth of information but also have significant variability in terms of structure, content, and accuracy. The algorithms were highly accurate, as independently validated through physician adjudication of relevant EHR information. Our rule-based NLP programmes require far fewer computational resources than deep neural network (DNN)-based programmes. Unlike DNN-based NLP, which requires large amounts of annotated data for training, rule-based NLP can be implemented with relatively small amounts of data. In addition, DNN models can be difficult to interpret due to their black-box nature. Rule-based systems, on the other hand, are intended to be more transparent and explainable. Having these advantages makes rule-based systems more practical for clinical use.
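The transparency of a rule-based approach can be illustrated with a toy example. The patterns below are hypothetical and far simpler than the study's algorithms, which are not reproduced here; they only show how each decision can be traced to an explicit rule (an AF mention, a preceding negation cue, or a preceding history cue).

```python
import re

# Hypothetical patterns for illustration; not the rules used in the study.
AF_PATTERN = re.compile(r"\b(?:atrial fibrillation|a[- ]?fib)\b", re.IGNORECASE)
NEGATION_CUE = re.compile(r"\b(?:no|not|without|negative for|denies)\b", re.IGNORECASE)
HISTORY_CUE = re.compile(r"\b(?:history of|h/o|prior|previous)\b", re.IGNORECASE)


def mentions_current_af(sentence: str) -> bool:
    """Return True if the sentence asserts AF without a preceding negation or history cue."""
    match = AF_PATTERN.search(sentence)
    if match is None:
        return False  # rule 1: no AF mention
    preceding = sentence[: match.start()]
    if NEGATION_CUE.search(preceding):
        return False  # rule 2: AF mention is negated
    if HISTORY_CUE.search(preceding):
        return False  # rule 3: AF mention refers to history
    return True


examples = [
    "ECG today shows atrial fibrillation with rapid ventricular response.",
    "No evidence of atrial fibrillation.",
    "History of atrial fibrillation, now in sinus rhythm.",
]
results = [mentions_current_af(s) for s in examples]
```

Each returned value can be explained by pointing to the rule that fired, which is harder to do with a neural network model.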

This study has several limitations. We did not include more patients in the validation data due to time and resource constraints, which resulted in wide CIs for some NLP performance measurements. Due to the small validation data sample size, we were unable to evaluate algorithm performance by race and ethnicity. We found minor performance degradation when the algorithms were evaluated on data from site 2. Performance variances may be related to differences in the training data or data sources. Additionally, reporting language and style can differ between institutions and physicians. Our NLP method may perform differently in other test datasets. Another limitation was that AF recurrence detection can be affected by the frequency of AF-related clinical encounters and testing, which varies among healthcare systems, clinicians, and patients.31 For instance, site 2 had a much larger proportion of patients identified by cardiac monitor data than site 1 (39.9% vs. 27.3%). The intensity of visits and monitoring may also be influenced by patients’ symptoms or type of therapy. For instance, patients with symptomatic AF are more likely to receive more frequent consultations, monitoring, and rhythm control therapies.7 As a result, patients in this study may have more intense monitoring than individuals receiving just rate control therapy. Compared with studies where participants were monitored by cardiac monitors at preset intervals (e.g. every 3 or 6 months),10 our study may detect more recurrences with temporal dispersion of the testing.31 On the other hand, our method may identify fewer recurrences for patients who had fewer clinical encounters. Moreover, AF recurrence from EHR data relied on clinicians’ interpretation, which may not always adhere to a standardized criterion (e.g. documented 30 s of AF) for establishing a recurrent AF episode.24,32,33 Our study was also unable to assess the AF burden because not all patients were consistently monitored for a sufficient amount of time with relevant technologies.33 Finally, the accuracy of the code-based methods was not validated in this work. Although our results show that the code-based algorithms appear to have low sensitivity, we were unable to analyse their specificity and PPV.

Conclusions

We developed and validated a robust automated NLP method for identifying AF recurrence. Compared with the code-based methods, the NLP algorithms identified and classified significantly more patients with recurrent AF. We demonstrated how a computational tool might be used to support population-based care or a study using real-world EHR data. The automated identification of AF recurrence may aid future studies on the effectiveness of AF treatment and help develop tailored interventions.

Funding

This work was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health (R01HL142834).

Conflicts of Interests: D.E.S. has received research support from Bristol Myers Squibb and served as a consultant to Bristol Myers Squibb, Fitbit (now Google), Medtronic, and Pfizer. A.S.G. has received research funding through his institution from iRhythm Technologies, the Bristol Myers Squibb/Pfizer alliance, and Janssen Research and Development. The remaining authors have no conflicts of interest to report.

Data availability

The data underlying this article cannot be shared publicly due to HIPAA and institutional restrictions. Requests for the NLP algorithms should be made to the corresponding author of this paper. They will be forwarded and considered on an individual basis by the Kaiser Permanente research and legal departments to verify whether the request is subject to any intellectual property or confidentiality obligations.

References

1. January CT, Wann LS, Alpert JS, Calkins H, Cigarroa JE, Cleveland JC Jr et al. 2014 AHA/ACC/HRS guideline for the management of patients with atrial fibrillation: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines and the Heart Rhythm Society. Circulation 2014;130:e199–e267.

2. January CT, Wann LS, Calkins H, Chen LY, Cigarroa JE, Cleveland JC Jr et al. 2019 AHA/ACC/HRS focused update of the 2014 AHA/ACC/HRS guideline for the management of patients with atrial fibrillation: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society in Collaboration with the Society of Thoracic Surgeons. Circulation 2019;140:e125–e151.

3. Hindricks G, Potpara T, Dagres N, Arbelo E, Bax JJ, Blomström-Lundqvist C et al. 2020 ESC guidelines for the diagnosis and management of atrial fibrillation developed in collaboration with the European Association of Cardio-Thoracic Surgery (EACTS). Eur Heart J 2020;42:373–498.

4. Steven D, Sultan A, Reddy V, Luker J, Altenburg M, Hoffmann B et al. Benefit of pulmonary vein isolation guided by loss of pace capture on the ablation line: results from a prospective 2-center randomized trial. J Am Coll Cardiol 2013;62:44–50.

5. Sultan A, Luker J, Andresen D, Kuck KH, Hoffmann E, Brachmann J et al. Predictors of atrial fibrillation recurrence after catheter ablation: data from the German Ablation Registry. Sci Rep 2017;7:16678.

6. Freeman JV, Shrader P, Pieper KS, Allen LA, Chan PS, Fonarow GC et al. Outcomes and anticoagulation use after catheter ablation for atrial fibrillation. Circ Arrhythm Electrophysiol 2019;12:e007612.

7. Kirchhof P, Camm AJ, Goette A, Brandes A, Eckardt L, Elvan A et al. Early rhythm-control therapy in patients with atrial fibrillation. N Engl J Med 2020;383:1305–1316.

8. Camm AJ, Naccarelli GV, Mittal S, Crijns HJGM, Hohnloser SH, Ma CS et al. The increasing role of rhythm control in patients with atrial fibrillation: JACC state-of-the-art review. J Am Coll Cardiol 2022;79:1932–1948.

9. Dretzke J, Chuchu N, Agarwal R, Herd C, Chua W, Fabritz L et al. Predicting recurrent atrial fibrillation after catheter ablation: a systematic review of prognostic models. Europace 2020;22:748–760.

10. Turagam MK, Musikantow D, Whang W, Koruth JS, Miller MA, Langan MN et al. Assessment of catheter ablation or antiarrhythmic drugs for first-line therapy of atrial fibrillation: a meta-analysis of randomized clinical trials. JAMA Cardiol 2021;6:697–705.

11. Pallisgaard JL, Gislason GH, Hansen J, Johannessen A, Torp-Pedersen C, Rasmussen PV et al. Temporal trends in atrial fibrillation recurrence rates after ablation between 2005 and 2014: a nationwide Danish cohort study. Eur Heart J 2018;39:442–449.

12. Taha A, Nielsen SJ, Bergfeldt L, Ahlsson A, Friberg L, Björck S et al. New-onset atrial fibrillation after coronary artery bypass grafting and long-term outcome: a population-based nationwide study from the SWEDEHEART registry. J Am Heart Assoc 2021;10:e017966.

13. Shah RU, Mukherjee R, Zhang Y, Jones AE, Springer J, Hackett I et al. Impact of different electronic cohort definitions to identify patients with atrial fibrillation from the electronic medical record. J Am Heart Assoc 2020;9:e014527.

14. Shen AY, Yao JF, Brar SS, Jorgensen MB, Wang X, Chen W. Racial/ethnic differences in ischemic stroke rates and the efficacy of warfarin among patients with atrial fibrillation. Stroke 2008;39:2736–2743.

15. Zheng C, Rashid N, Wu YL, Koblick R, Lin AT, Levy GD et al. Using natural language processing and machine learning to identify gout flares from electronic clinical notes. Arthritis Care Res (Hoboken) 2014;66:1740–1748.

16. Zheng C, Luo Y, Mercado C, Sy L, Jacobsen SJ, Ackerson B et al. Using natural language processing for identification of herpes zoster ophthalmicus cases to support population-based study. Clin Exp Ophthalmol 2018;47:7–14.

17. Zheng C, Sun BC, Wu YL, Lee M-S, Shen E, Redberg RF et al. Automated identification and extraction of exercise treadmill test results. J Am Heart Assoc 2020;9:e014940.

18. Zheng C, Yu W, Xie F, Chen W, Mercado C, Sy LS et al. The use of natural language processing to identify Tdap-related local reactions at five health care systems in the Vaccine Safety Datalink. Int J Med Informatics 2019;127:27–34.

19. Zheng C, Duffy J, Liu IA, Sy LS, Navarro RA, Kim SS et al. Identifying cases of shoulder injury related to vaccine administration (SIRVA) in the United States: development and validation of a natural language processing method. JMIR Public Health Surveill 2022;8:e30426.

20. Zheng C, Rashid N, Koblick R, An J. Medication extraction from electronic clinical notes in an integrated health system: a study on Aspirin use in patients with nonvalvular atrial fibrillation. Clin Ther 2015;37:2048–2058.

21. Derczynski L. Complementarity, F-score, and NLP evaluation. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). Portorož, Slovenia, European Language Resources Association (ELRA), Paris, France, 2016, pp 261–266.

22. Willems S, Khairy P, Andrade JG, Hoffmann BA, Levesque S, Verma A et al. Redefining the blanking period after catheter ablation for paroxysmal atrial fibrillation: insights from the ADVICE (Adenosine Following Pulmonary Vein Isolation to Target Dormant Conduction Elimination) trial. Circ Arrhythm Electrophysiol 2016;9:e003909.

23. Kim YG, Boo KY, Choi JI, Choi YY, Choi HY, Roh SY et al. Early recurrence is reliable predictor of late recurrence after radiofrequency catheter ablation of atrial fibrillation. JACC Clin Electrophysiol 2021;7:343–351.

24. Poole JE, Bahnson TD, Monahan KH, Johnson G, Rostami H, Silverstein AP et al. Recurrence of atrial fibrillation after catheter ablation or antiarrhythmic drug therapy in the CABANA trial. J Am Coll Cardiol 2020;75:3105–3118.

25. Bayoumy K, Gaber M, Elshafeey A, Mhaimeed O, Dineen EH, Marvel FA et al. Smart wearable devices in cardiovascular care: where we are and how to move forward. Nat Rev Cardiol 2021;18:581–599.

26. Kuhn T, Basch P, Barr M, Yackel T, Medical Informatics Committee of the American College of P. Clinical documentation in the 21st century: executive summary of a policy position paper from the American College of Physicians. Ann Intern Med 2015;162:301–303.

27. Downing NL, Bates DW, Longhurst CA. Physician burnout in the electronic health record era: are we ignoring the real cause? Ann Intern Med 2018;169:50–51.

28. Rule A, Bedrick S, Chiang MF, Hribar MR. Length and redundancy of outpatient progress notes across a decade at an academic medical center. JAMA Netw Open 2021;4:e2115334.

29. Wrenn JO, Stein DM, Bakken S, Stetson PD. Quantifying clinical narrative redundancy in an electronic health record. J Am Med Inform Assoc 2010;17:49–53.

30. Danforth KN, Smith AE, Loo RK, Jacobsen SJ, Mittman BS, Kanter MH. Electronic clinical surveillance to improve outpatient care: diverse applications within an integrated delivery system. EGEMS (Wash DC) 2014;2:1056.

31. Diederichsen SZ, Haugan KJ, Kronborg C, Graff C, Højberg S, Køber L et al. Comprehensive evaluation of rhythm monitoring strategies in screening for atrial fibrillation: insights from patients at risk monitored long term with an implantable loop recorder. Circulation 2020;141:1510–1522.

32. Calkins H, Hindricks G, Cappato R, Kim YH, Saad EB, Aguinaga L et al. 2017 HRS/EHRA/ECAS/APHRS/SOLAECE expert consensus statement on catheter and surgical ablation of atrial fibrillation. Europace 2018;20:e1–e160.

33. Marchlinski FE, Walsh K, Guandalini GS. Reporting AF recurrence after catheter ablation: the burden is on us to get it right. J Am Coll Cardiol 2020;75:3119–3121.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data