Abstract

Objective

To compare changes in neuropsychological test scores between remote and in-person follow-up assessments over a 1-year period using standardized regression–based (SRB) change indices.

Method

Participants were from the Wake Forest Alzheimer’s Disease Research Center (ADRC; N = 230) [mean age: 68.6 (7.8) years; education: 16.3 (2.3) years; 71% female; 86% White] and cognitively normal (as defined by a CDR of 0) at baseline and follow-up [mean interval: 420.03 (48.53) days]. Follow-up testing with the Uniform Data Set v3 Cognitive Battery was completed in person (n = 121) or remotely (n = 109) via telephone (n = 61) or video (n = 48). SRB change scores were calculated using published formulas. Chi-square analyses compared the frequency of scores falling outside an SRB cut-point of +/−1.645 at follow-up, and mean SRB change scores were compared across modalities.

Results

There were no significant differences in the frequency of SRB change scores for in-person versus remote follow-up assessments at the SRB cut-point. Similarly, one-way ANOVAs comparing mean SRB change scores revealed no significant differences between in-person, telephone, and video follow-up means for any of the tests.

Conclusions

Telephone and video cognitive assessments performed similarly to in-person assessment and offer a valuable tool for research and clinical applications.

INTRODUCTION

In-person administration of neuropsychological tests has long been the gold standard for the evaluation of cognitive functioning for both clinical and research purposes. Remote administration (i.e., via telephone or video) has become increasingly common in recent years, due in part to the COVID-19 pandemic (Sperling et al., 2024). Particularly for older adults, for whom COVID-19-related health concerns were often greatest, modification of neuropsychological tests for remote administration became an immediate and urgent need to allow for continuity of clinical care and research evaluations.

There is evidence that telephone (Rapp et al., 2012) and video (Alegret et al., 2021; Barton et al., 2011; Loh et al., 2007; Wadsworth et al., 2016) administered cognitive assessments are able to identify cognitive impairment and are well received by older adults. Further, a pilot study found that a remotely administered Uniform Data Set Version 3 (UDSv3) battery was well tolerated by older adults and resulted in similar adjudication outcomes compared to the in-person version (Sachs et al., 2024). A review of tele-neuropsychology literature published by Sperling and colleagues (2024) demonstrated that at the outset of the COVID-19 pandemic, there was a sharp increase in publications on tele-neuropsychology. Though this increased use of remote assessment largely arose out of an immediate need, remote assessments also have the potential to increase the accessibility of neuropsychological testing for individuals for whom in-person clinic visits are burdensome or unfeasible (e.g., older adults with mobility issues, individuals in rural areas, and individuals from underserved communities). However, there is a need for further evaluation of the validity of remotely administered neuropsychological tests (Sperling et al., 2024).

An important question is whether remote assessments can be used alongside in-person assessments to measure change in cognitive functioning. Standardized regression–based (SRB) change indices quantify the difference between an observed follow-up score and an expected follow-up score, where the expected score accounts for the baseline score, practice effects, test–retest reliability, regression to the mean, and other desired variables (e.g., demographic characteristics, days between assessments; McSweeny et al., 1993). SRB estimates capture the change expected from normal fluctuation and measurement error, so change falling outside these estimates represents “true” change, though the index is agnostic to its cause. These estimates can be reported as z-scores to allow comparisons of “true” change across individuals and groups. To our knowledge, no existing studies have used SRB change indices to compare patterns of longitudinal change between in-person and remotely administered cognitive assessments. SRB change indices have already been developed for existing UDSv3 tests for individuals with normal cognition (as defined by a Clinical Dementia Rating [CDR] of 0 and amyloid negativity) at baseline and first follow-up (Kiselica et al., 2025) and therefore can be used to identify increases or decreases in test scores that fall outside of the predicted ranges.
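In the general SRB framework (McSweeny et al., 1993), the expected follow-up score and the resulting change index can be written as follows; the symbols here are generic placeholders, not the specific published equations:

```latex
\hat{Y}_{\text{FU}} = \beta_0 + \beta_1\, Y_{\text{BL}} + \sum_{k} \beta_k X_k,
\qquad
z_{\text{SRB}} = \frac{Y_{\text{FU}} - \hat{Y}_{\text{FU}}}{SE_{\text{est}}}
```

where $Y_{\text{BL}}$ is the baseline score, the $X_k$ are covariates such as age, education, and the test–retest interval, and $SE_{\text{est}}$ is the standard error of the estimate from the normative regression. A value of $|z_{\text{SRB}}| > 1.645$ flags a score in the top or bottom 5% of the expected distribution.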

We use the SRB indices to identify and compare the frequency of score changes that exceed SRB cut-points (above and below) in participants with stable normal cognition at baseline and 1-year follow-up who were tested either remotely (via telephone or video) or in person with the UDSv3. We also compare mean SRB scores between in-person, telephone, and video follow-up assessments. The aim of this analysis is to determine whether in-person, telephone, and video follow-up assessments yield similar SRB scores, which would be informative about whether remote follow-up testing, and which type of remote follow-up testing (telephone or video), may be an acceptable alternative to in-person follow-up.

METHODS

Participants

All participants (N = 230) were recruited from the Wake Forest Alzheimer’s Disease Research Center (ADRC) Clinical Core. All participants provided informed consent for study procedures as part of their participation, and all study procedures for the Clinical Core study were approved by the university’s institutional review board. To be included in the current study, participants had to have complete UDSv3 neuropsychological testing data from their initial study visit and their first follow-up visit. Follow-up visits had to occur at least 6 months and no more than 2 years after the baseline visit. Additionally, participants needed to have stable, normal cognition, as defined by a CDR score of 0.0 at both baseline and first follow-up.

Cognitive Testing

The cognitive tests within the UDSv3 neuropsychological battery are described in detail by Weintraub and coworkers (2018). Baseline cognitive testing was completed in person for all participants. Follow-up cognitive testing was completed either in person (n = 121) or remotely (n = 109) via telephone or video. For this study, data from the following UDSv3 tests were used: semantic fluency (animals and vegetables), letter fluency (combined F and L), Craft Story (immediate and delayed verbatim recall), and number span forward and backward (number of correct trials). Other UDSv3 measures (e.g., Multilingual Naming Test; Montreal Cognitive Assessment) were not included because, at the time these data were collected, they were not part of our site’s remotely administered cognitive battery.

SRB Change Index

An SRB change index score was computed for each participant for each of the UDSv3 tests listed earlier using SRB equations published and described in detail by Kiselica and coworkers (2025). Briefly, an expected follow-up score was calculated from each individual’s baseline score, years of education, age, race, and days between evaluations. The predicted score was subtracted from the actual follow-up score, and the resulting difference was divided by the published standard error of the estimate to produce the SRB change index. SRB change index scores that fall outside of designated z-score cut-points may be indicative of “true” change not attributable to baseline scores, practice effects, test–retest reliability, or demographic characteristics.
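The computation described above can be sketched in a few lines. This is a minimal illustration of the general recipe; the regression weights and standard error below are hypothetical placeholders, not the published values from Kiselica and coworkers (2025), which would be used in practice.

```python
def srb_change_index(observed_followup, baseline, education, age,
                     race_white, days_between, weights, see):
    """Compute a standardized regression-based (SRB) change index.

    `weights` holds the regression intercept plus one coefficient per
    predictor; `see` is the standard error of the estimate. Both are
    illustrative here, standing in for published normative values.
    """
    predicted = (weights["intercept"]
                 + weights["baseline"] * baseline
                 + weights["education"] * education
                 + weights["age"] * age
                 + weights["race_white"] * race_white
                 + weights["days"] * days_between)
    # Observed minus expected follow-up score, in SEE units (a z-score)
    return (observed_followup - predicted) / see

# Hypothetical weights for a single test (illustration only)
demo_weights = {"intercept": 5.0, "baseline": 0.75, "education": 0.10,
                "age": -0.02, "race_white": 0.30, "days": 0.0}
z = srb_change_index(observed_followup=24, baseline=22, education=16, age=69,
                     race_white=1, days_between=420,
                     weights=demo_weights, see=3.0)
abnormal = abs(z) > 1.645  # falls outside the +/-1.645 cut-point?
```

With these placeholder weights, the expected follow-up score is 22.02, giving an SRB change index of 0.66, well within the +/−1.645 range.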

Statistical Analysis

Mean test scores at baseline and follow-up were compared across participants who completed telephone, video, and in-person follow-up to determine whether participants who opted for different follow-up modalities differed from one another. Demographic variables (age, education, race, gender) and the number of days between assessments were also compared between groups. Chi-square analyses were used to compare the number of participants in each group whose scores fell outside a z-score range of +/−1.645 at the follow-up visit. This cutoff was selected because it is the cutoff most often used by neuropsychologists (Duff, 2012), resulting in only the top and bottom 5% of scores being identified as abnormal (McSweeny et al., 1993). Though +/−1.645 is the most common cutoff used by neuropsychologists in dementia research, it is somewhat arbitrary, and scores falling just inside the cutoff may also be indicative of change. Because mean differences in SRB scores across follow-up modalities were also evaluated, potentially meaningful group differences could still be detected even if individual test scores did not fall outside the +/−1.645 cutoff.
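The frequency comparison described above amounts to classifying each SRB z-score against the cut-point and testing the resulting counts in a contingency table. A minimal sketch, using made-up z-scores and a hand-computed Pearson chi-square for a 2 × 2 table (in-person vs. remote, outside vs. within the cutoff); the scores are illustrative, not study data:

```python
CUTOFF = 1.645  # flags the top and bottom 5% of expected change

def outside_cutoff(z_scores, cutoff=CUTOFF):
    """Count SRB z-scores falling outside +/-cutoff."""
    return sum(1 for z in z_scores if abs(z) > cutoff)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]] (1 df, no correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical SRB z-scores for two follow-up groups (illustration only)
in_person = [0.2, -1.9, 0.5, 1.7, -0.3, 0.0, 0.9, -0.8]
remote    = [1.1, -0.4, 2.0, 0.3, -0.2, 0.6]

a = outside_cutoff(in_person)   # in-person, outside cutoff
b = len(in_person) - a          # in-person, within cutoff
c = outside_cutoff(remote)      # remote, outside cutoff
d = len(remote) - c             # remote, within cutoff
stat = chi_square_2x2(a, b, c, d)
```

In practice a statistics library would supply the chi-square p-value; the explicit formula is shown here only to make the classification-then-count logic of the analysis concrete.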

RESULTS

Sample Characteristics

Data from 230 individuals (mean age: 68.64 years; education: 16.25 years; 70.9% female; 86.1% White) at the Wake Forest ADRC were used for analysis (Table 1). Of the 230 individuals included, 121 completed their follow-up visits in person, and 109 completed their follow-up visits remotely via telephone (n = 61) or video (n = 48). Demographic differences between the three follow-up groups (in-person, telephone, and video) were analyzed via one-way ANOVAs (age and education) or chi-square analyses (gender, race). There were no significant differences for age, race, or gender, but there was a significant difference for education, F(2, 227) = 4.14, p = .02. Post hoc analyses revealed that this effect was due to differences between telephone and video participants, with video follow-up participants being more highly educated (mean = 17.00 years) than telephone follow-up participants (mean = 15.72 years). One-way ANOVAs were also run to evaluate potential differences in days between baseline and follow-up assessment [F(2, 227) = 5.46, p = .05] and baseline Montreal Cognitive Assessment (MoCA) scores [F(2, 227) = 3.90, p = .02]. Post hoc analyses revealed significantly more days between assessments for the in-person follow-up group (429.82 days) than the telephone follow-up group (407.74 days), and higher baseline MoCA scores for video follow-up participants (26.94) than telephone follow-up participants (25.57). There were slight group differences in raw scores but not in SRB change scores, which did not differ significantly between groups on any of the cognitive tests used in the primary analysis (all ps > .10); these differences are summarized in Tables 2 and 3. Because the SRB change index scores used for the primary analyses account for demographic variables, baseline scores, and days between assessments, these group differences in age, education, and fluency scores should not affect the primary group comparisons.

Table 1

Baseline demographic characteristics

                          In-person (N = 121)   Telephone (N = 61)   Video (N = 48)   Statistic   p
Age                       69.12 (7.79)          69.89 (8.34)         66.81 (7.07)     F = 2.25    .11
Education                 16.21 (2.30)          15.72 (2.37)         17.00 (2.26)     F = 4.14    .02
Race                                                                                  χ2 = 0.71   .70
  White                   102 (84%)             54 (89%)             42 (88%)
  Non-White               19 (16%)              7 (11%)              6 (12%)
Gender                                                                                χ2 = 0.18   .91
  Male                    34 (28%)              19 (31%)             14 (29%)
  Female                  87 (72%)              42 (69%)             34 (71%)
Days between assessments  429.82 (51.44)        407.74 (42.90)       410.96 (42.91)   F = 5.46    .05
MoCA                      26.33 (2.53)          25.57 (2.99)         26.94 (2.00)     F = 3.90    .02

Note. Mean (SD) or N (%).


Table 2

Comparison of mean scores obtained at baseline for in-person versus remote follow-up (FU) groups

                    In-person (N = 121)   Telephone (N = 61)   Video (N = 48)   F      p
                    Mean (SD)             Mean (SD)            Mean (SD)
Craft IR Verbatim   23.59 (5.99)          22.52 (5.39)         24.35 (6.14)     1.36   .26
Craft DR Verbatim   21.16 (6.34)          18.87 (5.71)         22.33 (5.76)     4.85   .01
Animal Fluency      21.40 (5.21)          21.18 (5.54)         22.67 (5.08)     1.26   .29
Vegetable Fluency   15.12 (3.68)          15.21 (5.01)         16.67 (3.82)     2.59   .08
Verbal Fluency      27.61 (7.77)          26.69 (6.93)         28.79 (8.07)     1.02   .36
Number Forward      8.21 (2.28)           7.54 (1.88)          8.19 (2.06)      2.14   .12
Number Backward     7.23 (2.21)           6.51 (1.45)          7.60 (2.23)      4.23   .02
Table 3

Comparison of mean scores obtained at follow-up for in-person versus remote follow-up (FU) groups

                    In-person (N = 121)   Telephone (N = 61)   Video (N = 48)   F      p
                    Mean (SD)             Mean (SD)            Mean (SD)
Craft IR Verbatim   23.93 (6.48)          23.20 (5.53)         24.10 (6.24)     0.37   .69
Craft DR Verbatim   21.50 (6.26)          20.90 (6.73)         21.81 (5.98)     0.30   .74
Animal Fluency      22.04 (6.02)          21.08 (5.65)         23.33 (5.66)     1.99   .14
Vegetable Fluency   15.36 (4.03)          14.46 (4.15)         16.46 (4.21)     3.20   .04
Verbal Fluency      28.88 (8.71)          26.72 (8.55)         29.06 (7.22)     1.56   .21
Number Forward      8.26 (2.29)           8.13 (2.13)          8.42 (2.32)      0.22   .81
Number Backward     7.10 (2.02)           7.23 (2.19)          7.75 (2.08)      1.70   .18

The chi-square analyses revealed no significant differences in the frequency of scores exceeding a z-score cutoff of +/−1.645 between in-person and remote follow-up visits for any of the UDSv3 tests (Table 4). The specific frequencies of score elevations versus drops can also be found in Table 4; however, due to the small cell sizes and nonsignificant findings for the total frequencies, a separate analysis of elevations versus drops was not conducted.

Table 4

Frequencies of in-person versus remote FU scores falling outside +/−1.645 standardized regression–based change index cutoff

Test                               In-person (N = 121)   Telephone (N = 61)   Video (N = 48)
Craft Immediate: total +/−1.645    9 (7.44%)             4 (6.56%)            3 (6.25%)
  Score increases (>1.645)         3                     2                    1
  Score decreases (<−1.645)        6                     2                    2
Craft Delay: total +/−1.645        4 (3.31%)             5 (8.20%)            2 (4.17%)
  Score increases (>1.645)         2                     3                    1
  Score decreases (<−1.645)        2                     2                    1
Animal Fluency: total +/−1.645     13 (10.74%)           5 (8.20%)            4 (8.33%)
  Score increases (>1.645)         9                     3                    3
  Score decreases (<−1.645)        4                     2                    1
Vegetable Fluency: total +/−1.645  8 (6.61%)             8 (13.11%)           2 (4.17%)
  Score increases (>1.645)         6                     3                    1
  Score decreases (<−1.645)        2                     5                    1
Verbal Fluency: total +/−1.645     11 (9.09%)            5 (8.20%)            1 (2.08%)
  Score increases (>1.645)         7                     2                    0
  Score decreases (<−1.645)        4                     3                    1
Number Forward: total +/−1.645     5 (4.13%)             6 (9.84%)            5 (10.42%)
  Score increases (>1.645)         4                     4                    3
  Score decreases (<−1.645)        1                     2                    2
Number Backward: total +/−1.645    7 (5.79%)             7 (11.48%)           2 (4.17%)
  Score increases (>1.645)         0                     5                    2
  Score decreases (<−1.645)        7                     2                    0

Note. χ2 comparing remote versus in-person scores falling outside of cutoffs was not significant for any of the above tests.


One-way ANOVAs comparing mean SRB change scores for in-person, telephone, and video follow-up visits indicated no significant differences for any of the cognitive tests (Table 5). For number span backward, there was a trend toward significance [F(2, 227) = 2.56, p = .08] that appears to be driven by slightly higher SRB change scores for telephone follow-up assessments, but the effect size is very small (partial η2 = 0.02).

Table 5

Standardized regression–based change index scores

                    In-person      Telephone      Video          F      p     Partial η2
                    Mean (SD)      Mean (SD)      Mean (SD)
Craft IR Verbatim   0.11 (0.95)    0.08 (0.90)    0.04 (0.95)    0.09   .91   0.0008
Craft DR Verbatim   0.07 (0.83)    0.20 (1.08)    −0.01 (0.76)   0.76   .47   0.0067
Animal Fluency      0.17 (1.14)    0.00 (0.97)    0.23 (0.99)    0.75   .47   0.0066
Vegetable Fluency   0.10 (1.01)    −0.15 (0.99)   0.12 (0.96)    1.51   .22   0.0132
Verbal Fluency      0.02 (1.04)    −0.23 (1.12)   −0.17 (0.74)   1.53   .22   0.0133
Number Forward      0.10 (0.86)    0.26 (1.07)    0.15 (1.01)    0.56   .57   0.0049
Number Backward     −0.14 (0.82)   0.17 (1.17)    0.06 (0.76)    2.56   .08   0.0220

DISCUSSION

This study examined differences between remote versus in-person follow-up testing performance in cognitively normal older adults over a 1-year follow-up period using SRB change indices. We found no significant differences in the frequency of score elevations or drops across follow-up modalities for tests that could be administered by audio only, among cognitively normal older adults. Additionally, there were not significant differences in mean change scores between any of the modalities for any of the tests. Strengths of this work include the novel application of SRB change indices to the growing area of teleneuropsychology, use of a relatively large cohort that completed all baseline assessments in-person and follow-up assessments approximately 1 year later via different modalities, and the ability to compare not only remote versus in-person testing but also specific remote modalities (e.g., telephone and video). In an increasingly remote world, a thorough understanding of remotely administered neuropsychological assessment is more needed than ever. The ability to assess neuropsychological functioning without the individual needing to leave their home has clear implications for increasing research and clinical access for some of the hardest to reach populations. This is particularly important in this current era of an ever-growing aging population and high demand for dementia-related assessments due to novel therapeutics.

Our findings of generally comparable performance for remote versus in-person testing are consistent with prior work demonstrating that remotely administered cognitive tests produce scores and adjudication outcomes comparable to cognitive tests administered in person (Alegret et al., 2021; Barton et al., 2011; Loh et al., 2007; Sachs et al., 2024; Wadsworth et al., 2016). Comparing change scores with an SRB index is another and, to our knowledge, more precise way to test the validity of remote cognitive assessment. The use of regression-based change data addresses some of the statistical limitations of prior work on remote assessment validity, such as by more tightly controlling for practice effects, base rates of change, and demographic characteristics. Additional work utilizing a randomized, counterbalanced study design and larger samples is still needed, but, in the interim, these findings provide preliminary support for the use of remote assessment.

Limitations, Challenges, and Future Directions

One limitation of this work is that the analyses were constrained to cognitive tests for which regression-based change formulas had already been published. Though not part of the UDSv3, list learning tests (e.g., the Rey Auditory Verbal Learning Test; the Consortium to Establish a Registry for Alzheimer’s Disease list learning test) are often administered by ADRCs to assess memory. The development of SRB change formulas for list learning tests would be helpful in future work exploring the validity of remotely administered list learning tests in older adults with concern for memory decline. In a similar vein, the restriction to UDSv3 tests in an ADRC population limits the generalizability of these findings to other testing batteries and non-ADRC populations. Another restriction on generalizability is the use of a clinical cohort that is predominantly White and relatively highly educated; remote assessments may have different implications in more racially and educationally diverse populations.

CONCLUSION

This work provides evidence that on most UDSv3 measures, remote and in-person testing over a 1-year follow-up period produce comparable scores. Ultimately, this suggests that remote research assessments may be a viable alternative to in-person testing. The ability to use remotely administered testing in place of in-person testing has the potential to ease participant/patient burden and increase access to clinical care and participation in research.

FUNDING

This work was supported by the National Institutes of Health (P30 AG049638 and R01 AG075959-01).

CONFLICT OF INTEREST

None declared.

AUTHOR CONTRIBUTIONS

Lauren Latham (Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Writing—original draft), Suzanne Craft (Conceptualization, Data curation, Funding acquisition, Investigation, Resources, Supervision, Writing—review & editing), Stephen Rapp (Conceptualization, Methodology, Supervision, Writing—review & editing), James Bateman (Funding acquisition, Investigation, Writing—review & editing), Maryjo Cleveland (Funding acquisition, Investigation, Writing—review & editing), Samantha Rogers (Funding acquisition, Investigation, Writing—review & editing), Benjamin Williams (Funding acquisition, Investigation, Writing—review & editing), Mia Yang (Funding acquisition, Investigation, Writing—review & editing), and Bonnie Sachs (Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing—review & editing)

References

Alegret, M., Espinosa, A., Ortega, G., Pérez-Cordón, A., Sanabria, Á., Hernández, I., et al. (2021). From face-to-face to home-to-home: Validity of a teleneuropsychological battery. Journal of Alzheimer's Disease, 81(4), 1541–1553.

Barton, C., Morris, R., Rothlind, J., & Yaffe, K. (2011). Video-telemedicine in a memory disorders clinic: Evaluation and management of rural elders with cognitive impairment. Telemedicine and e-Health, 17(10), 789–792.

Duff, K. (2012). Evidence-based indicators of neuropsychological change in the individual patient: Relevant concepts and methods. Archives of Clinical Neuropsychology, 27(3), 248–261.

Kiselica, A. M., Kaser, A. N., Webber, T. A., Small, B. J., & Benge, J. F. (2025). Development and preliminary validation of standardized regression-based change scores as measures of transitional cognitive decline. Archives of Clinical Neuropsychology, 35(7), 1168–1181.

Loh, P.-K., Donaldson, M., Flicker, L., Maher, S., & Goldswain, P. (2007). Development of a telemedicine protocol for the diagnosis of Alzheimer's disease. Journal of Telemedicine and Telecare, 13(2), 90–94.

McSweeny, A. J., Naugle, R. I., Chelune, G. J., & Lüders, H. (1993). "T scores for change": An illustration of a regression approach to depicting change in clinical neuropsychology. The Clinical Neuropsychologist, 7(3), 300–312.

Rapp, S. R., Legault, C., Espeland, M. A., Resnick, S. M., Hogan, P. E., Coker, L. H., et al. (2012). Validation of a cognitive assessment battery administered over the telephone. Journal of the American Geriatrics Society, 60(9), 1616–1623.

Sachs, B. C., Latham, L. A., Bateman, J. R., Cleveland, M. J., Espeland, M. A., Fischer, E., et al. (2024). Feasibility of remote administration of the Uniform Data Set-Version 3 for assessment of older adults with mild cognitive impairment and Alzheimer's disease. Archives of Clinical Neuropsychology, 39, 635–643.

Sperling, S. A., Acheson, S. K., Fox-Fuller, J., Colvin, M. K., Harder, L., & Cullum, C. M., et al. (2024). Tele-neuropsychology: From science to policy to practice. Archives of Clinical Neuropsychology, 39(2), 227–248.

Wadsworth, H. E., Galusha-Glasscock, J. M., Womack, K. B., Quiceno, M., Weiner, M. F., Hynan, L. S., et al. (2016). Remote neuropsychological assessment in rural American Indians with and without cognitive impairment. Archives of Clinical Neuropsychology, 31(5), 420–425.

Weintraub, S., Besser, L., Dodge, H. H., Teylan, M., Ferris, S., Goldstein, F. C., et al. (2018). Version 3 of the Alzheimer Disease Centers' neuropsychological test battery in the Uniform Data Set (UDS). Alzheimer Disease & Associated Disorders, 32(1), 10–17.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]