Natural language processing systems for extracting information from electronic health records about activities of daily living. A systematic review

Table 1.

Characteristics of the study and EHR data.

Study (year); country	Source(s) of routinely recorded EHRs on ADL	Study population	Type of ADL included
Anzaldi et al (2017)³⁷; United States	Clinical notes from a nonprofit medical group (hospital, emergency department, and nursing home)	Patients aged over 65	Ambulating (walking difficulty) Continence (absence of fecal control and severe urinary control issues)
Kharrazi et al (2018)²^a; United States	Same as³⁷	Same as³⁷	Same as³⁷
Kan et al (2018)³⁸^a; United States	Same as³⁷	Same as³⁷	Same as³⁷
Hernandez-Boussard et al (2017)³⁹; United States	Clinical notes from a single, large, academic medical center	Prostate cancer patients	Continence (UI^a)
Humbert-Droz et al (2022)⁴⁰; United States	Rheumatology notes from the RISE^b registry	Rheumatology patients	Ambulating Feeding Dressing Personal hygiene Toileting
Alves et al (2022)⁴¹; United States	Clinical notes from neurology practices included in the OM1 MS Registry	Patients with MS^c	Ambulating (mobility impairments)
Chen et al (2019)⁴²; United States	Clinical notes from a large group practice	Patients aged over 65	Ambulating (walking difficulty) Continence (absence of fecal control and severe urinary control issues)
Banerjee et al (2019)⁴³; United States	Clinical notes from a tertiary care academic medical center included in a research database	Prostate cancer patients	Continence (UI^a and fecal incontinence)
Meskers et al (2022)⁴⁴; The Netherlands	Clinical notes from a large teaching hospital	Hospitalized COVID-19 patients	Ambulating (mobility activities)
Rivera et al (2022)⁴⁵; United States	Provider documentation, discharge notes, and PT and occupational therapy documentation from a large stroke referral center	Ischemic stroke patients	Ambulating (Modified Rankin Scale)
Chen et al (2019)⁴⁶; United States	Unstructured free-text from a large multispecialty medical group	Patients aged over 65	Ambulating (walking difficulty) Continence (absence of fecal control and severe urinary control issues)
Gori et al (2019)⁴⁷; United States	Clinical notes from a tertiary academic medical center included in a research data warehouse	Prostate cancer patients	Continence (UI^a)
Newman-Griffis et al (2018)⁴⁸; United States	PT^d notes from a Clinical Center	Patients with PT^d notes	Ambulating (mobility)
Bozkurt et al (2020)⁴⁹; United States	Clinical notes from an Academic Medical Centre included in a prostate cancer clinical data warehouse	Patients diagnosed with prostate cancer	Continence (UI^a)
Doing-Harris et al (2019)⁵⁰; United States	Clinical notes from a Veteran Health Administration Repository (VINCI)	Veterans diagnosed with cardiac disease	Ambulating (able to mobilize, bed-ridden, wheelchair-bound) Dressing
Goudarzvand et al (2019)⁵¹; United States	Clinical notes and current visit information from the Mayo Clinic Biobank	Physician-diagnosed CI^e and CU^f patients aged 65 years and older	Ambulating (transferring) Feeding Dressing Personal hygiene (bathing) Toileting
Greve et al (2022)⁵²; United States	Clinical notes from a single tertiary medical center	Patients with cerebral palsy	Ambulating
Thieu et al (2021)⁵³; United States	PT^d notes from a Clinical Center	Patients with PT^d notes	Ambulating (mobility domain of the ICF^g)
Newman-Griffis et al (2021)⁵⁴; United States	Notes from the Rehabilitation and Medicine Department at the NIH Clinical Center using databases of the NIH Biomedical Translational Research Information System	Patients receiving PT^d	Ambulating (mobility activities)
Newman-Griffis et al (2021)⁵⁵; United States	Claims database	Patients receiving disability benefits primarily related to musculo-skeletal, neurological, or mental impairments	Ambulating (mobility) Personal hygiene (self-care)
Sung et al (2021)⁵⁶; Taiwan (English written records)	Clinical notes from 2 hospital stroke registries	Patients hospitalized for acute ischemic stroke	Ambulating (Modified Rankin Scale)
Yang et al (2022)⁵⁷; Canada	Clinical notes from a clinical database from a large MS^c clinic	Patients with MS^c	Ambulating (mobility)

UI: urinary incontinence.

RISE: American College of Rheumatology’s Rheumatology Informatics System for Effectiveness.

MS: Multiple Sclerosis.

PT: Physical Therapy.

CI: cognitively impaired.

CU: cognitively unimpaired.

ICF: International Classification of Functioning, Disability and Health.

Three studies were found to use the same dataset and NLP system.²^,³⁷^,³⁸ As 2 of them²^,³⁸ used the NLP system developed in the study by Anzaldi et al,³⁷ we only refer to the study of Anzaldi et al³⁷ in the remainder of this review. Thus, the total number of NLP systems we report on is 20.

Data source

Of the 20 studies, 1 study used clinical notes written in Dutch,⁴⁴ while the remaining 19 studies used English clinical notes³⁷^,^39–43^,^45–57 (Table 1). Most clinical notes used were retrieved directly from an EHR system (n = 11).³⁷^,³⁹^,^42–46^,⁴⁸^,⁵²^,⁵³^,⁵⁷ In the other studies, clinical notes were first transferred from an EHR system to a research database, registry, or claims database (n = 9).⁴⁰^,⁴¹^,⁴⁷^,^49–51^,^54–56 In such a database or registry, EHR data may be cleaned or combined with data from other sources before the data are transferred to the researchers.⁵⁸^,⁵⁹

Study population

The studies included in this review focused on a variety of diagnoses or patient groups (Table 1). The most frequently studied patient groups were patients aged over 65 (n = 4),³⁷^,⁴²^,⁴⁶^,⁵¹ patients with prostate cancer (n = 4),³⁹^,⁴³^,⁴⁷^,⁴⁹ patients suffering from a chronic disease (n = 4),⁴⁰^,⁴¹^,⁵²^,⁵⁷ and patients receiving physical therapy (n = 3).⁴⁸^,⁵³^,⁵⁴

Activities of daily living

Each of the 6 ADL is covered in at least 1 study (Table 1). However, none of the studies covered all 6 ADL. The majority of studies focused on 1 activity (n = 13),³⁹^,⁴¹^,^43–45^,^47–49^,^52–54^,⁵⁶^,⁵⁷ while others covered 2 (n = 5)³⁷^,⁴²^,⁴⁶^,⁵⁰^,⁵⁵ or 5 (n = 2)⁴⁰^,⁵¹ activities. The most frequently studied activities were ambulating (n = 16)^40–42^,^44–46^,⁴⁸^,^50–57^,⁶⁰ and continence (n = 7).³⁷^,³⁹^,⁴²^,⁴³^,⁴⁶^,⁴⁷^,⁴⁹

Purpose of using NLP

In 70% of the studies, NLP was used for classification purposes (n = 14/20),³⁷^,^42–46^,⁴⁹^,⁵⁰^,⁵²^,^54–57 for example, classifying patients as frail or not,⁴⁶^,⁵⁰ determining the presence and severity of urinary incontinence,⁴³^,⁴⁹ and assigning ICF categories.⁵⁴^,⁵⁵ The remaining studies used NLP for information extraction (n = 3),^39–41 for named entity recognition (n = 2),⁴⁸^,⁵³ or for topic modeling (n = 1),⁵¹ as shown in Table S2.

Type of NLP

The rule-based approach, the oldest and simplest NLP approach, was used as the sole method in 3 studies,³⁷^,³⁹^,⁴⁰ while another study combined a rule-based approach with deep learning (Table S2).⁴⁹
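As an illustration of what a rule-based approach can look like, the following minimal Python sketch flags notes that mention walking difficulty. The keyword patterns, the negation cues, and the function name `has_walking_difficulty` are hypothetical and chosen for illustration only; they are not taken from any of the included studies.

```python
import re

# Hypothetical keyword patterns for one ADL concept (walking difficulty).
# Illustrative only; not the rules of any included study.
WALKING_DIFFICULTY = re.compile(
    r"\b(difficulty (?:with )?walking|unsteady gait|uses? a (?:walker|cane)"
    r"|wheelchair[- ]bound)\b",
    re.IGNORECASE,
)

# Simple negation cues, checked in a short window before the matched phrase.
NEGATION = re.compile(r"\b(no|not|denies|without)\b[^.]{0,30}$", re.IGNORECASE)


def has_walking_difficulty(note: str) -> bool:
    """Return True if any sentence contains a non-negated mention."""
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for match in WALKING_DIFFICULTY.finditer(sentence):
            preceding = sentence[: match.start()]
            if not NEGATION.search(preceding):
                return True
    return False


if __name__ == "__main__":
    print(has_walking_difficulty("Patient reports difficulty walking since the fall."))
    print(has_walking_difficulty("Patient denies difficulty walking."))
```

A window-based negation check like this one is deliberately crude: it will miss some negations and suppress some true mentions, which is one reason rule sets of this kind tend to need tuning per dataset.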

More than half of the NLP systems relied on machine learning (n = 12); all of these studies were published in 2019 or later.^41–45^,^50–56 Eight of these studies applied a combination of machine learning and deep learning.⁴⁴^,^50–56^,⁶⁰ Various machine-learning algorithms were used, with Support Vector Machines (SVMs) being the most prevalent (n = 5).⁴⁴^,⁵⁰^,⁵²^,⁵⁴^,⁵⁵

Thirteen studies applied deep learning; all of them were published in 2018 or later.⁴⁴^,^46–57 Word2Vec was used in 7 studies.^46–49^,⁵²^,⁵⁴^,⁵⁷ Among the 13 deep-learning NLP systems, 2 studies were based on ClinicalBERT⁵⁵^,⁵⁶ and 1 study used BERTje.⁴⁴ These 3 studies were published in 2021 or later.

Pre-processing

Table 2 shows the variety of pre-processing steps applied to prepare unstructured notes; tokenization was the most frequently used technique (n = 10).^39–42^,⁴⁶^,⁴⁹^,^52–55

Table 2.

Pre-processing steps used in the included studies.

Pre-processing step	Number of studies	References
Tokenization	10	^39–42^,⁴⁶^,⁴⁹^,^52–55
Stop-word removal	6	⁴¹^,⁴³^,⁴⁹^,⁵¹^,⁵⁶^,⁵⁷
None	4	³⁷^,⁴⁴^,⁴⁵^,⁴⁸
Normalization	3	⁴³^,⁵⁵^,⁵⁶
Removal of redundant information	3	⁴⁰^,⁴⁹^,⁵⁷
Sentence splitting	2	³⁹^,⁴⁹
Lemmatization	2	⁴¹^,⁵²
Stemming	2	⁴³^,⁵¹
Sentence segmentation	2	⁴²^,⁵⁴
Lowercase	2	⁵⁵^,⁵⁶
Removal of identifying information	2	⁴⁰^,⁵⁷
Standard tool for text-cleaning methodologies, not further defined	1	⁴⁷
Manual	1	⁵⁰
Removal of formatting	1	⁴⁰

In general, deep learning requires less pre-processing than rule-based and machine-learning models. In the 3 studies that employed deep learning only, little to no pre-processing was performed. One of the 3 studies⁴⁶ only used sentence segmentation, the second study reported no pre-processing steps,⁴⁸ and the third study used a standard tool for text-cleaning methodologies.⁴⁷ However, as the precise methodologies applied to the dataset were not specified in this last study, the extent of data pre-processing remains unclear.

Pre-processing details were not reported in 3 other studies. Anzaldi et al used a rule-based approach for identifying geriatric syndromes in EHR free-text notes as well as the explicit mention of “frailty” in the notes.³⁷ In such an approach, pre-processing is not always a necessity. In the study of Rivera et al, although pre-processing was not reported, we cannot conclude that no pre-processing was applied, as machine learning usually involves pre-processing.⁴⁵ Lastly, Meskers et al used BERTje to create vectors instead of pre-processing methods.⁴⁴
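To make some of the steps listed in Table 2 concrete, the sketch below chains lowercasing, tokenization, stop-word removal, and stemming in plain Python. The stop-word list and the suffix-stripping `crude_stem` function are simplified stand-ins introduced here as assumptions; the included studies used their own tools and settings, which are not reproduced here.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption); real pipelines use
# larger, curated lists.
STOP_WORDS = {"the", "a", "an", "is", "to", "with", "and", "of", "in"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token: str) -> str:
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(note: str) -> List[str]:
    """Apply tokenization, stop-word removal, and stemming in sequence."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(note))]


if __name__ == "__main__":
    print(preprocess("The patient is walking with a cane."))
```

Which of these steps helps depends on the model: aggressive normalization can benefit keyword rules and bag-of-words classifiers, while contextual models such as BERT variants typically rely on their own tokenizers.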

Software

The studies used different software (Table S2). Most studies used Python (n = 11).⁴⁰^,⁴¹^,⁴³^,⁴⁴^,⁴⁹^,⁵⁰^,^53–57 For 4 studies it is unclear which software was used.³⁷^,⁴⁷^,⁴⁸^,⁵² The remaining studies used JavaScript or a tool developed for NLP applications, such as cTAKES, GATE, MedTagger, or CRFsuite.³⁹^,⁴²^,⁴⁵^,⁴⁶^,⁵¹

Methods used to evaluate NLP system performance

Almost all studies used cross-validation or train-/test datasets to evaluate the NLP system’s performance (Table S2). Six studies evaluated their NLP system using both cross-validation and train-/test datasets.⁴³^,⁴⁴^,⁴⁹^,⁵²^,⁵⁵^,⁵⁷ In 4 studies, an expert manually evaluated the performance of the NLP system. Five other studies used only train-/test datasets, and 4 others used only cross-validation.⁵⁰^,⁵³^,⁵⁴^,⁵⁶
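The difference between the two evaluation designs can be sketched as follows. This is a minimal, library-free illustration; the function names, the 20% test fraction, and the choice of 5 folds are assumptions made for the example and do not describe the setup of any particular study.

```python
import random
from typing import List, Tuple


def train_test_split(
    n_items: int, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle indices once and hold out a fixed fraction for testing."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_items * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(
    n_items: int, k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Partition indices into k folds; each fold is the test set exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test_fold))
    return splits


if __name__ == "__main__":
    train, test = train_test_split(10)
    print(len(train), len(test))
    for train_idx, test_idx in k_fold_splits(10, k=5):
        print(sorted(test_idx))
```

With a single hold-out split, every note is used for either training or testing; with cross-validation, every note is tested once, which makes better use of small datasets at the cost of training the model k times. For clinical notes, splitting by patient rather than by note is often advisable so that notes from one patient do not appear on both sides of a split.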

The study by Goudarzvand et al used recent publications to evaluate their NLP system,⁵¹ which was used for topic modeling (the only study in our review with this purpose). As topic modeling lacks a gold standard against which the model’s output can be compared, they validated the results against recent publications to verify that the outcomes were meaningful.

The 3 most frequently reported evaluation metrics for NLP performance on ADL were the F1 score (the harmonic mean of precision and sensitivity; n = 12), precision (or positive predictive value; n = 8), and sensitivity (or recall; n = 7). Other primary evaluation metrics reported were accuracy (n = 4), area under the curve (AUC) (n = 4), Inter-Annotator Agreement (n = 3), specificity (n = 1), negative predictive value (n = 1), false positive rate (n = 1), and root-mean-square error (n = 1).
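For binary classification, these metrics follow directly from the counts of true and false positives and negatives. The sketch below computes them from gold-standard and predicted labels; the function name and the example labels are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[str, float]:
    """Compute common binary metrics from gold labels and predictions (1 = present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0    # positive predictive value
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if precision + sensitivity
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = ADL limitation documented, 0 = not documented.
    gold = [1, 1, 1, 1, 0, 0]
    predicted = [1, 1, 0, 0, 1, 0]
    print(classification_metrics(gold, predicted))
```

Because the F1 score ignores true negatives, it is less affected than accuracy by the class imbalance that is common when a limitation is documented in only a small share of notes.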

Outcomes of the performance evaluation

More than half of the studies reported relatively high scores for the evaluation metrics (n = 12),³⁷^,³⁹^,⁴¹^,⁴⁴^,^47–50^,^52–54^,⁵⁶ indicating good performance by the NLP systems for that dataset and the purpose of the study (Table S2). This was particularly the case for systems extracting information on ambulating.

Other studies reported mixed performance outcomes (n = 7).⁴⁰^,⁴²^,⁴³^,⁴⁵^,⁴⁶^,⁵⁵^,⁵⁷ Some of the studies showed different outcomes when the results were stratified by type of ADL. For instance, Chen et al used a machine-learning model for classifying geriatric syndrome constructs. High scores were obtained for fecal control (F1 = 0.857) and walking difficulty (F1 = 0.758), but scores were low for severe urinary control issues (F1 = 0.532).⁴² Another study by Chen et al applied a deep-learning model for the same constructs and obtained comparably mixed scores.⁴⁶ Humbert-Droz et al found that scores varied depending on the method used to evaluate the NLP system. They evaluated NLP performance by comparing the outcome of the NLP tool with (1) a manual review, (2) structured EHR data, and (3) an external database. The highest scores for sensitivity, positive predictive value, and F1 were observed for the manual review, while the lowest scores were found in the comparison with structured EHR data. Humbert-Droz et al pointed out that this does not necessarily reflect the NLP system’s performance. They encountered several issues with structured EHR data that limit their use as the gold standard in an evaluation.⁴⁰ Furthermore, mixed results were found when different approaches were compared. For example, Yang et al showed higher scores for a combined rule-based and deep-learning model than for either approach individually. They noted that this hybrid approach was better at leveraging the strengths of each approach and tackling challenges with regard to the dataset, including imbalanced data.⁵⁷

Authors of the studies in this systematic review identified several limitations concerning the NLP systems they developed. Generalizability to other healthcare sectors, practices, languages, patient groups, or data sources emerged as a significant challenge,¹³^,³⁷^,^39–41^,⁴³^,⁴⁷^,⁴⁹^,^52–57 as the NLP systems were trained on datasets with specific characteristics. Another major challenge relates to the dataset on which the NLP system is trained and tested. Authors reported issues with small datasets due to factors such as restricted access to relevant EHR data, a small number of notes per patient due to a short hospital stay, or few patients in the study sample.³⁷^,³⁹^,⁴⁹^,⁵⁴ In addition, inadequate documentation and lack of granularity were mentioned.⁴¹^,^43–45^,⁴⁹^,⁵¹^,⁵²^,⁵⁴^,⁵⁷

Discussion

Principal findings

This systematic review provides a comprehensive overview of current research employing NLP to extract information on ADL from unstructured free-text notes in EHRs. Adequate information on ADL is important for care provision and research to ensure that individuals receive the necessary daily support. As information on ADL is often recorded in unstructured free-text EHR notes, NLP could be valuable for deriving this information. We explored 20 NLP systems described in 22 studies. Most studies (65%) utilized NLP for classifying unstructured EHR data on 1 or 2 ADL. Our findings show that a variety of NLP methods, algorithms, and pre-processing steps were used. There was a notable prevalence of deep-learning approaches. The majority of studies using deep learning also applied rule-based methods or machine learning. Evaluation of the NLP system’s performance predominantly involved train-/test datasets and cross-validation. The studies included in this review used a wide range of evaluation metrics, including F1, precision, and sensitivity. Despite the variety of NLP approaches and evaluation metrics, most studies reported relatively high overall scores on the evaluation metrics, indicating that the characteristics of the best-performing NLP system depend on study-specific factors.

The variability in models, approaches, and reporting complicates the direct comparison between the NLP systems and the quest for the best possible method. However, overall, the results of this review indicate that NLP systems are promising for research using unstructured EHR data on ADL for the following reasons.

First, the field of NLP is developing rapidly. It has evolved from rule-based methods to machine learning and deep learning. Compared to previous systematic reviews on the use of NLP for unstructured EHR notes, we included relatively more deep-learning approaches.³²^,³⁴^,³⁵ This shows that relatively new deep-learning algorithms, including transformers such as BERT, are being studied for NLP systems to extract information from unstructured clinical notes on ADL.

To improve the performance of the NLP system, multiple approaches are often compared or combined. Most studies adopted a hybrid approach by combining deep learning with rule-based or machine-learning algorithms in their final model. The possible benefits of hybrid approaches are also recognized by systematic reviews that focused on the application of NLP in other healthcare domains, including radiology,⁶¹^,⁶² clinical information in general,³¹^,³⁴ and chronic diseases.³⁵ Hybrid approaches may be better able to address challenges related to the dataset, such as small or imbalanced datasets. Some of the studies included in this review encountered challenges with the datasets arising from how the information was recorded during healthcare provision, such as inadequate recordings or a low level of granularity, or because they did not have access to all relevant EHR data. These challenges are not unique to unstructured data but are also mentioned in the broader literature discussing data quality challenges in the use of EHR data for research (eg,⁵⁹^,⁶³).

Second, the characteristics of the best-performing NLP system depend on the context in which the dataset is generated, such as different EHR systems and different healthcare organizations. The studies included in this review that retrieved the data directly from an EHR system, rather than from a research database or registry, had access to data from a single organization or from organizations belonging to one medical group. It is expected that the NLP system will perform differently on datasets with other characteristics. NLP systems trained on datasets from multiple sources with different characteristics will have a higher external validity.

Third, a variety of metrics were used to evaluate the performance of the NLP systems. However, most studies evaluated the performance with train-/test datasets and cross-validation and reported F1 scores. Although the most appropriate evaluation metrics depend on the research aim, F1 scores are informative in many cases, especially for classification, which was the most prevalent purpose of the NLP systems in this review. Almost all F1 scores exceeded 0.7. This indicates that the methodologies used in developing the NLP systems, considering the characteristics of the specific dataset and research question of the study, are promising for generating information on ADL from unstructured EHR data.

Strengths, limitations, and recommendations for further research

To the best of our knowledge, this is the first systematic review exploring NLP systems for extracting information on ADL from unstructured EHR data. A strength is that we used a broad search strategy in 5 different literature databases. However, the following limitations should be kept in mind. First, while ambulating and continence were covered by most studies, some ADL were included in only a few NLP systems. More research on NLP systems covering all 6 ADL is recommended. Second, some studies provided limited information on the algorithms, for example with few details on the pre-processing. Future NLP studies should prioritize adequate reporting, as is emphasized in other systematic reviews as well.⁶¹,⁶⁴–⁶⁶ Third, the field of NLP is developing rapidly. To keep up with the developments, it is recommended to conduct the search again in the near future.

Conclusion

The results of this systematic review indicate that NLP is a promising method for deriving information on ADL from unstructured EHR notes. Various NLP systems are already used in research and show overall good evaluation outcomes. Which NLP system performs best depends on the characteristics of the dataset, the research question, and the type of ADL studied. Since there is no one-size-fits-all method, our findings suggest that research on ADL could benefit from an iterative process in which different NLP approaches are compared or combined based on the performance evaluation outcomes. Future developments in NLP for ADL extraction should focus on addressing generalizability issues and refining evaluation methodologies.

Author contributions

Mariska G. Oosterveld-Vlug, Robert A. Verheij, Anneke L. Francke, and Yvonne Wieland-Jorna conceptualized and designed this review. Yvonne Wieland-Jorna, Daan van Kooten, Yvonne de Man, and Mariska G. Oosterveld-Vlug screened the literature. Data extraction was performed by Yvonne Wieland-Jorna and Daan van Kooten and checked by Yvonne Wieland-Jorna, Daan van Kooten, Robert A. Verheij, Anneke L. Francke, or Mariska G. Oosterveld-Vlug. Yvonne Wieland-Jorna and Daan van Kooten analyzed the data. Anneke L. Francke and Mariska G. Oosterveld-Vlug supervised and guided the project throughout the process. Yvonne Wieland-Jorna wrote the initial draft of the manuscript. Daan van Kooten contributed to and advised on the NLP aspects of the manuscript. Daan van Kooten, Robert A. Verheij, Anneke L. Francke, and Mariska G. Oosterveld-Vlug reviewed and edited the manuscript. All authors have approved the final version of the manuscript for publication.

Supplementary material

Supplementary material is available at JAMIA Open online.

Funding

The research described in this paper is part of the “Learning from Data” program, funded by the Ministry of Health, Welfare, and Sports in the Netherlands.

Conflicts of interest

None declared.

Data availability

Data are available on request.

References

Arslan, Damen, de Wilde, et al. Incidence and prevalence of knee osteoarthritis using codified and narrative data from electronic health records: a population-based study. Arthritis Care Res (Hoboken). 2022:937-944.

Kharrazi, Anzaldi, Hernandez, et al. The value of unstructured electronic health record data in geriatric syndrome case identification. J Am Geriatr Soc. 2018:1499-1507. https://doi.org/10.1111/jgs.15411

Scheurwegs, Luyckx, Luyten, Daelemans, Van den Bulcke. Data integration of structured and unstructured sources for assigning clinical codes to patient stays. J Am Med Inform Assoc. 2016:e11-e19. https://doi.org/10.1093/jamia/ocv115

Seinen, Kors, van Mulligen, Fridgeirsson, Rijnbeek PR. The added value of text from Dutch general practitioner notes in predictive modeling. J Am Med Inform Assoc. 2023:1973-1984. https://doi.org/10.1093/jamia/ocad160

Murdoch, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309:1351-1352. https://doi.org/10.1001/jama.2013.393

Afrizal, Hidayanto, Handayani, Budiharsana, Eryando. Narrative review for exploring barriers to readiness of electronic health record implementation in primary health care. Healthc Inform Res. 2019:141-152.

Rahal, Mercer, Kuziemsky, Yaya. Factors affecting the mature use of electronic medical records by primary care physicians: a systematic review. BMC Med Inform Decis Mak. 2021.

Skube, Lindemann, Arsoniadis, Akre, Wick, Melton GB. Characterizing functional health status of surgical patients in clinical notes. AMIA Jt Summits Transl Sci Proc. 2018;2017:379-388.

Schiltz, Foradori, Reimer, Plow, Dolansky MA. Availability of information on functional limitations in structured electronic health records data. J Am Geriatr Soc. 2022:2161-2163. https://doi.org/10.1111/jgs.17776

Iezzoni LI. Multiple chronic conditions and disabilities: implications for health services research and data demands. Health Serv Res. 2010;(5 Pt 2):1523-1540. https://doi.org/10.1111/j.1475-6773.2010.01145.x

Edemekong, Bomgaars, Sukumaran, Schoo. Activities of Daily Living. StatPearls; 2023.

Hartigan. A comparative review of the Katz ADL and the Barthel Index in assessing the activities of daily living of older people. Int J Older People Nurs. 2007:204-212. https://doi.org/10.1111/j.1748-3743.2007.00074.x

World Health Organization. Towards a Common Language for Functioning, Disability and Health ICF. Geneva: World Health Organization; 2002.

Kurtzke JF. Rating neurologic impairment in multiple sclerosis: an expanded disability status scale (EDSS). Neurology. 1983:1444-1452. https://doi.org/10.1212/wnl.33.11.1444

Schag, Heinrich, Ganz PA. Karnofsky performance status revisited: reliability, validity, and guidelines. J Clin Oncol. 1984:187-193. https://doi.org/10.1200/jco.1984.2.3.187

Mehta, Pandit. Concurrence of big data analytics and healthcare: a systematic review. Int J Med Inform. 2018;114. https://doi.org/10.1016/j.ijmedinf.2018.03.013

Savova, Pestian, Connolly, Miller, Dexheimer JW. Natural language processing: applications in pediatric research. In: Pediatric Biomedical Informatics: Computer Applications in Pediatric Research. Translational Bioinformatics, Vol. 10. Singapore: Springer; 2016:231-250. https://doi.org/10.1007/978-981-10-1104-7_12

Bohr, Memarzadeh. The rise of artificial intelligence in healthcare applications. In: Artificial Intelligence in Healthcare. 2020. https://doi.org/10.1016/B978-0-12-818438-7.00002-2

Sun, Cai, Liu, Fang, Wang. Data processing and text mining technologies on electronic medical records: a review. J Healthc Eng. 2018;2018:4302425. https://doi.org/10.1155/2018/4302425

Kannan, Gurusamy, Vijayarani, et al. Preprocessing techniques for text mining. Int J Comput Sci Commun Netw. 2014.

Haddi, Liu, Shi. The role of text pre-processing in sentiment analysis. Procedia Comput Sci. 2013. https://doi.org/10.1016/j.procs.2013.05.005

Symeonidis, Effrosynidis, Arampatzis. A comparative evaluation of pre-processing techniques and their interactions for Twitter sentiment analysis. Expert Syst Appl. 2018;110:298-310. https://doi.org/10.1016/j.eswa.2018.06.022

Johnson, Murty, Navakanth. A detailed review on word embedding techniques with emphasis on word2vec. Multimed Tools Appl. 2023:37979-38007.

Yang, Bian, Hogan. Clinical concept extraction using transformers. J Am Med Inform Assoc. 2020:1935-1942.

Zhou, Duan, Liu, Shum H-Y. Progress in neural NLP: modeling, learning, and reasoning. Engineering. 2020:275-290. https://doi.org/10.1016/j.eng.2019.12.014

Huang, Altosaar, Ranganath. 2019. ClinicalBERT: modeling clinical notes and predicting hospital readmission. arXiv, arXiv:1904.05342, preprint: not peer reviewed.

Roberts, Datta, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc. 2020:457-470.

Velupillai, Suominen, Liakata, et al. Using clinical natural language processing for health outcomes research: overview and actionable suggestions for future advances. J Biomed Inform. 2018. https://doi.org/10.1016/j.jbi.2018.10.005

Ghojogh, Crowley. 2019. The theory behind overfitting, cross validation, regularization, bagging, and boosting: tutorial. arXiv, arXiv:1905.12787, preprint: not peer reviewed.

Salman, Liu. 2019. Overfitting mechanism and avoidance in deep neural networks. arXiv, arXiv:1901.06566, preprint: not peer reviewed.

Pan, Goldwasser, et al. Neural natural language processing for unstructured data in electronic health records: a review. Comput Sci Rev. 2022:100511. https://doi.org/10.1016/j.cosrev.2022.100511

Koleck, Dreisbach, Bourne, Bakken. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc. 2019:364-379. https://doi.org/10.1093/jamia/ocy173

Datta, Bernstam, Roberts. A frame semantic overview of NLP-based information extraction for cancer-related EHR notes. J Biomed Inform. 2019;100:103301. https://doi.org/10.1016/j.jbi.2019.103301

Kreimeyer, Foster, Pandey, et al. Natural language processing systems for capturing and standardizing unstructured clinical information: a systematic review. J Biomed Inform. 2017. https://doi.org/10.1016/j.jbi.2017.07.012

Sheikhalishahi, Miotto, Dudley, Lavelli, Rinaldi, Osmani. Natural language processing of clinical notes on chronic diseases: systematic review. JMIR Med Inform. 2019:e12239. https://doi.org/10.2196/12239

Moher, Liberati, Tetzlaff, Altman; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Int Med. 2009;151:264-269, W64.

Anzaldi, Davison, Boyd, Leff, Kharrazi. Comparing clinician descriptions of frailty and geriatric syndromes using electronic health records: a retrospective cohort study. BMC Geriatr. 2017:248. https://doi.org/10.1186/s12877-017-0645-7

Kan, Kharrazi, Leff, et al. Defining and assessing geriatric risk factors and associated health care utilization among older adults using claims and electronic health records. Med Care. 2018:233-239. https://doi.org/10.1097/MLR.0000000000000865

Hernandez-Boussard, Kourdis, Seto, et al. Mining electronic health records to extract patient-centered outcomes following prostate cancer treatment. AMIA Annu Symp Proc. 2017;2017:876-882.

Humbert-Droz, Izadi, Schmajuk, et al. Development of a natural language processing system for extracting rheumatoid arthritis outcomes from clinical notes using the national rheumatology informatics system for effectiveness registry. Arthritis Care Res (Hoboken). 2022:608-615. https://doi.org/10.1002/acr.24869

Alves, Green, Leavy, et al. Validation of a machine learning approach to estimate expanded disability status scale scores for multiple sclerosis. Mult Scler J Exp Transl Clin. 2022:20552173221108635. https://doi.org/10.1177/20552173221108635

Chen, Dredze, Weiner, Hernandez, Kimura, Kharrazi. Extraction of geriatric syndromes from electronic health record clinical notes: assessment of statistical natural language processing methods. JMIR Med Inform. 2019:e13039. https://doi.org/10.2196/13039

Banerjee, Seneviratne, et al. Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. JAMIA Open. 2019:150-159. https://doi.org/10.1093/jamiaopen/ooy057

Meskers CGM, van der Veen, Kim, et al. Automated recognition of functioning, activity and participation in COVID-19 from electronic patient records by natural language processing: a proof-of-concept. Ann Med. 2022:235-243. https://doi.org/10.1080/07853890.2021.2025418

Rivera, Burton, Hayson, et al. Neurologic outcomes of carotid and other emergent interventions for ischemic stroke over 6 years with dataset enhanced by machine learning. J Vasc Surg. 2022:1280-1288.e2. https://doi.org/10.1016/j.jvs.2022.06.020

Chen, Dredze, Weiner, Kharrazi. Identifying vulnerable older adult populations by contextualizing geriatric syndrome information in clinical notes of electronic health records. J Am Med Inform Assoc. 2019;(8-9):787-795. https://doi.org/10.1093/jamia/ocz093

Gori, Banerjee, Chung, et al. Extracting patient-centered outcomes from clinical notes in electronic health records: assessment of urinary incontinence after radical prostatectomy. EGEMS (Wash DC). 2019. https://doi.org/10.5334/egems.297

Newman-Griffis D, Zirikly A. Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility. Association for Computational Linguistics (ACL); 2018. https://doi.org/10.18653/v1/w18-2301

Bozkurt, Paul, Coquet, et al. Phenotyping severity of patient-centered outcomes using clinical notes: a prostate cancer use case. Learn Health Syst. 2020:e10237. https://doi.org/10.1002/lrh2.10237

Doing-Harris, Bray, Thackeray, et al. Development of a cardiac-centered frailty ontology. J Biomed Semantics. 2019. https://doi.org/10.1186/s13326-019-0195-3

Goudarzvand, St Sauver, Mielke, Takahashi, Lee, Sohn. Early temporal characteristics of elderly patient cognitive impairment in electronic health records. BMC Med Inform Decis Mak. 2019;(Suppl 4):149. https://doi.org/10.1186/s12911-019-0858-0

Greve, Bailes, et al. Gross motor function prediction using natural language processing in cerebral palsy. Dev Med Child Neurol. 2022:100-106. https://doi.org/10.1111/dmcn.15301

Thieu, Maldonado, et al. A comprehensive study of mobility functioning information in clinical notes: entity hierarchy, corpus annotation, and sequence labeling. Int J Med Inform. 2021;147:104351. https://doi.org/10.1016/j.ijmedinf.2020.104351

Newman-Griffis, Fosler-Lussier. Automated coding of under-studied medical concept domains: linking physical activity reports to the international classification of functioning, disability, and health. Front Digit Health. 2021. https://doi.org/10.3389/fdgth.2021.620828

Newman-Griffis, Maldonado, et al. Linking free text documentation of functioning and disability to the ICF with natural language processing. Front Rehabil Sci. 2021. https://doi.org/10.3389/fresc.2021.742702