Xuexin Yu, Laura B Zahodne, Alden L Gross, Belinda L Needham, Kenneth M Langa, Tsai-Chin Cho, Lindsay C Kobayashi, Could differential underreporting of loneliness between men and women bias the gender-specific association between loneliness duration and rate of memory decline? A probabilistic bias analysis of effect modification, American Journal of Epidemiology, Volume 194, Issue 3, March 2025, Pages 811–819, https://doi.org/10.1093/aje/kwae186
Abstract
Gender is an observed effect modifier of the association between loneliness and memory aging. However, this effect modification may be a result of information bias due to differential loneliness underreporting by gender. We applied probabilistic bias analyses to examine whether effect modification of the loneliness–memory decline relationship by gender is retained under three simulation scenarios with various magnitudes of differential loneliness underreporting between men and women. Data were from biennial interviews with adults aged ≥ 50 years in the US Health and Retirement Study from 1996-2016 (5646 women and 3386 men). Loneliness status (yes vs no) was measured from 1996-2004 using the Center for Epidemiologic Studies Depression (CES-D) Scale loneliness item, and memory was measured from 2004-2016. Simulated sensitivity and specificity of the loneliness measure were informed by a validation study using the UCLA Loneliness Scale as a gold standard. The likelihood of observing effect modification by gender was higher than 90% in all simulations, although the likelihood reduced with an increasing difference in magnitude of the loneliness underreporting between men and women. The gender difference in loneliness underreporting did not meaningfully affect the observed effect modification by gender in our simulations. Our simulation approach may be promising to quantify potential information bias in effect modification analyses.
Introduction
As an emerging public health concern, loneliness in midlife has been associated with accelerated memory aging in later life.1 However, there are conflicting findings regarding the effect modification of this relationship by gender. Yu et al1 observed a stronger relationship between loneliness and episodic memory aging among women than among men in the population-based US Health and Retirement Study (HRS),1 while another study did not observe sex/gender-specific effects of loneliness on dementia risk, which was assessed by the modified Telephone Interview for Cognitive Status, incorporating items to measure episodic memory, working memory, and overall mental status.2 These inconsistent findings may be attributable to domain-specific associations between loneliness and cognitive function.
One noncausal explanation for the conflicting findings could be that the observed effect modification by gender is spurious due to information bias arising from differential loneliness underreporting in men compared with women,1 especially given that Sutin et al2 measured loneliness status using the UCLA Loneliness Scale from the HRS Psychosocial and Lifestyle Questionnaire,2 while Yu et al1 used the loneliness item from the Center for Epidemiologic Studies Depression (CES-D) Scale in the HRS one-on-one interviews. In research study interviews, men may have a lower propensity than women to admit negative emotions such as loneliness,3 fear,4 and depression,5 potentially due to social desirability bias or social stigma related to masculinity imposed on these feelings.3,6 However, as loneliness is an inherently subjective construct, it is challenging to quantify information bias in its measurement due to a lack of gold-standard data to validate self-reports.
We used probabilistic bias analyses to probe the potential direction and magnitude of information bias in the estimation of effect modification measured by statistical interaction on the additive scale. We simulated three scenarios with various magnitudes of differential loneliness underreporting by gender to examine whether the previously observed effect modification of the loneliness–memory aging relationship by gender would be affected.1 We hypothesized that compared with women, men would be less likely to admit loneliness, particularly during a one-on-one interview; and greater underreporting among men may induce greater underestimation of the association between loneliness duration and rate of memory decline among men relative to women, leading to the distorted finding of effect modification by gender. While our probabilistic bias analysis focuses on misclassification of one specific psychosocial exposure, our simulation approach is applicable to a broad range of psychosocial exposures that do not readily have gold-standard data available for validation, and to the study of information bias in effect modification analyses more broadly.
Methods
Study design, population, and sample size
This simulation study replicated an existing prospective cohort study by Yu et al,1 which investigated the relationship between duration of loneliness from 1996-2004 and subsequent rate of memory decline during 2004-2016 among 9032 adults (5646 women and 3386 men) aged ≥ 50 years in the US HRS.1,7
Measures
All measures were consistent with the previous study.1 The exposure, self-reported loneliness status (yes vs no), was measured biennially from 1996-2004 using the loneliness item from the CES-D Scale, and its duration was categorized as never, 1 time point, 2 time points, and ≥ 3 time points.1 The outcome, episodic memory, was measured biennially from 2004-2016 using validated composite memory z scores incorporating both the direct respondent memory assessments (immediate and delayed word recall tests) and proxy memory assessments for respondents who were unable to directly participate in the study interview, usually due to impairment or illness.1,8 Potential confounders measured in 1996 included age, gender, race, marital status, education, employment, household wealth, objective social isolation, CES-D scores (excluding the loneliness item), and limitations to activities of daily living.1 A directed acyclic graph is provided in Figure S1.
Quantitative probabilistic bias analysis
Quantitative probabilistic bias analysis aims to estimate an association of interest that theoretically would have been observed had some presumed nonrandom error (eg, information bias) been minimized given a set of assumed bias parameters (eg, sensitivity and specificity).9 Unlike simple bias analysis and multidimensional bias analysis, which assume a single value or several specific values of the bias parameters, probabilistic bias analysis assumes that bias parameters follow known probability distributions (eg, uniform, triangular, and trapezoidal) within specified ranges.9,10 This assumption helps to account for uncertainty in the true values of the bias parameters, which is especially helpful in the absence of gold-standard data for validation.9,10
Our quantitative probabilistic bias analysis entailed four steps: (1) bias parameter specification under three simulated scenarios; (2) generation of negative and positive predictive values, based on the selected bias parameters and the observed loneliness distribution; (3) record-level correction of loneliness status to simulate new bias-adjusted datasets; and (4) modeling of the bias-adjusted effect modification by gender of the association between loneliness duration and rate of memory decline in each simulated dataset. Overall, we proposed three simulation scenarios, each of which included seven situations with varying sensitivity by gender. For each situation specified in step 1, we repeated steps 2-4 a total of 10 000 times to generate bias-adjusted simulation estimates and intervals for comparison with the observed effect modification by gender in Yu et al.1
Step 1: Bias parameter specification
We proposed three scenarios and selected bias parameter values (sensitivity and specificity) informed by a validation study and with the purpose of maximizing our ability to test plausible magnitudes of potential information bias due to differential loneliness underreporting by gender.
Validation study
Although there are no objective gold-standard data with which to validate the subjective loneliness measure, we performed a validation study and used data from the self-reported 3-item UCLA Loneliness Scale in the HRS Psychosocial and Lifestyle Questionnaire in 2006 as the best-available gold standard to inform potential sensitivity and specificity values for the single-item self-report of loneliness (Table 1).
Table 1. Sensitivity and specificity of the single-item measure of loneliness using the 3-item UCLA Loneliness Scale as a gold standard^a (n = 7445), Health and Retirement Study, United States, 2006.
Bias parameter (CES-D loneliness item) | Men (n = 3037) | Women (n = 4408)
---|---|---
Sensitivity | 0.30 | 0.39
Specificity | 0.93 | 0.90
Abbreviation: CES-D, Center for Epidemiologic Studies Depression.
a Gold-standard data were from the 3-item UCLA Loneliness Scale (range: 3 to 9), where higher scores indicate a higher level of loneliness. A cutoff value of ≥ 6 was used to identify individuals who were “high” in loneliness.
The 3-item UCLA Loneliness Scale in the HRS 2006 was used as a gold standard for two reasons. First, it was administered as a pen-and-paper questionnaire that was left behind after the study interview, completed in private by the respondent, and sent back by mail.11 In contrast, the single-item measure of loneliness in the CES-D Scale was administered in face-to-face and telephone interviews, which may be more subject to social desirability bias in reporting than pen-and-paper questionnaires. Second, the 3-item UCLA Loneliness Scale contains 3 items with Likert-style response options, and thus more comprehensively captures the construct of loneliness than the single CES-D item. The 3-item UCLA Loneliness Scale ranged from 3 to 9 with higher scores indicating a higher level of loneliness.12 We used a cutoff value of ≥ 6 to identify individuals who were “high” in loneliness, consistent with prior research.13 Details of the validation study are provided in Appendix S1.
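As an illustration of how sensitivity and specificity of this kind can be computed, the following is a minimal Python sketch that dichotomizes a UCLA score at the ≥ 6 cutoff and cross-classifies it with the single CES-D item. The function name and the toy records are assumptions made for illustration; they are not HRS data and are not taken from the validation study itself (Appendix S1).

```python
def sensitivity_specificity(cesd_lonely, ucla_scores, cutoff=6):
    """Return (sensitivity, specificity) of the single CES-D item, treating
    a UCLA score >= cutoff as the reference for "high" loneliness."""
    tp = fp = fn = tn = 0
    for reported, score in zip(cesd_lonely, ucla_scores):
        truly_lonely = score >= cutoff
        if reported and truly_lonely:
            tp += 1          # reported lonely, reference lonely
        elif reported and not truly_lonely:
            fp += 1          # reported lonely, reference not lonely
        elif truly_lonely:
            fn += 1          # did not report, reference lonely (underreporting)
        else:
            tn += 1          # did not report, reference not lonely
    return tp / (tp + fn), tn / (tn + fp)


# Toy records, invented for illustration only.
cesd = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]   # 1 = reported loneliness on the CES-D item
ucla = [7, 6, 8, 3, 4, 5, 3, 9, 6, 4]   # 3-item UCLA score, range 3 to 9
se, sp = sensitivity_specificity(cesd, ucla)
print(f"sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

Running the function separately within each gender group would give gender-specific values of the kind reported in Table 1.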
Simulation scenarios under investigation
Based on the difference in sensitivity between men and women on the CES-D loneliness item judged against the 3-item UCLA Loneliness Scale (~0.10, as shown in Table 1), we specified the first simulation scenario, comprising seven situations in which the absolute magnitude of loneliness underreporting increased steadily for both men and women, while the difference in the magnitude of underreporting between men and women was held constant at 0.10 (Table 2).
Table 2. Sensitivity range sets by gender in the three simulation scenarios.
Sensitivity range set^a (situation) | Men | Women
---|---|---
First scenario: increasing underreporting with a constant gender difference | |
1 | 0.25-0.30 | 0.35-0.40
2 | 0.35-0.40 | 0.45-0.50
3 | 0.45-0.50 | 0.55-0.60
4 | 0.55-0.60 | 0.65-0.70
5 | 0.65-0.70 | 0.75-0.80
6 | 0.75-0.80 | 0.85-0.90
7 | 0.85-0.90 | 0.95-1.00
Second scenario: increasing underreporting in women, constant underreporting in men | |
1 | 0.30-0.35 | 0.30-0.35
2 | 0.30-0.35 | 0.40-0.45
3 | 0.30-0.35 | 0.50-0.55
4 | 0.30-0.35 | 0.60-0.65
5 | 0.30-0.35 | 0.70-0.75
6 | 0.30-0.35 | 0.80-0.85
7 | 0.30-0.35 | 0.90-0.95
Third scenario: increasing underreporting in men, constant underreporting in women | |
1 | 0.25-0.30 | 0.95-1.00
2 | 0.35-0.40 | 0.95-1.00
3 | 0.45-0.50 | 0.95-1.00
4 | 0.55-0.60 | 0.95-1.00
5 | 0.65-0.70 | 0.95-1.00
6 | 0.75-0.80 | 0.95-1.00
7 | 0.85-0.90 | 0.95-1.00
a For each sensitivity range set, specificity values were constantly set as 0.92-1.00.
To test plausible magnitudes of information bias due to any differential loneliness underreporting by gender, we additionally simulated two scenarios where the difference in sensitivity between men and women was not constant, although the selected sensitivity values in these two scenarios were not supported by the validation study. In the second scenario, the sensitivity among men was held at 0.30-0.35 (informed by the validation study) across seven situations, and sensitivity values among women ranged from 0.30 to 0.95 (Table 2). This scenario simulated situations where men had a low propensity to admit loneliness and examined the extent to which the differences in the magnitude of underreporting between men and women could meaningfully bias the estimated effect modification by gender. In the third scenario, the sensitivity values among women were held at 0.95-1.00 across seven situations, and the sensitivity values among men ranged from 0.25 to 0.90 (Table 2). This third scenario simulated situations where women did not underreport loneliness status and examined the extent to which the magnitude of loneliness underreporting among men could meaningfully bias the estimated effect modification by gender.
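The three scenarios can be laid out programmatically. The sketch below rebuilds the sensitivity ranges of Table 2 as (men, women) pairs; the helper name and the data layout are assumptions chosen for illustration, and ranges are built from integer hundredths so that the decimal bounds come out exactly as written in the table.

```python
def sensitivity_ranges(start_hundredths, n=7, step=10, width=5):
    """Return n (low, high) sensitivity ranges, each `width` hundredths wide
    and `step` hundredths apart, starting at `start_hundredths`."""
    return [
        ((start_hundredths + k * step) / 100,
         (start_hundredths + k * step + width) / 100)
        for k in range(n)
    ]


# Each scenario is a list of seven situations: (range for men, range for women).
scenarios = {
    # Both genders shift together; women stay 0.10 above men.
    "first": list(zip(sensitivity_ranges(25), sensitivity_ranges(35))),
    # Men fixed at 0.30-0.35; women move from 0.30-0.35 up to 0.90-0.95.
    "second": list(zip([(0.30, 0.35)] * 7, sensitivity_ranges(30))),
    # Women fixed at 0.95-1.00; men move from 0.25-0.30 up to 0.85-0.90.
    "third": list(zip(sensitivity_ranges(25), [(0.95, 1.00)] * 7)),
}

for name, situations in scenarios.items():
    for number, (men, women) in enumerate(situations, start=1):
        print(name, number, men, women)
```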
For all simulations, specificity values were held at 0.92-1.00, as informed by the validation study (Table 1). The specificity values were nondifferential by gender and by memory, as loneliness overreporting is beyond the scope of this study. The reason we chose 0.92 rather than 0.90 for the lower limit of the range was to avoid generating impossible negative and positive predictive values (less than zero).14 All selected bias parameters were assumed to follow a trapezoidal distribution across their specified ranges, as the trapezoidal distribution may be more realistic than the uniform and triangular distributions.15
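To make the sampling step concrete, here is a minimal Python sketch that draws a bias parameter from a trapezoidal distribution by inverting its cumulative distribution function. The text specifies only the outer limits of each range, so the inner modes used in the example (the ends of the flat top) are hypothetical placeholders, as is the function name.

```python
import math
import random


def sample_trapezoidal(lo, mode_lo, mode_hi, hi, rng):
    """Draw one value from a trapezoidal distribution on [lo, hi] whose
    density is flat between mode_lo and mode_hi (inverse-CDF method)."""
    u = rng.random()
    height = 2.0 / ((hi + mode_hi) - (lo + mode_lo))      # density on the flat top
    cdf_mode_lo = height * (mode_lo - lo) / 2.0            # area of the rising side
    cdf_mode_hi = cdf_mode_lo + height * (mode_hi - mode_lo)
    if u < cdf_mode_lo:                                    # rising side
        return lo + math.sqrt(2.0 * u * (mode_lo - lo) / height)
    if u < cdf_mode_hi:                                    # flat top
        return mode_lo + (u - cdf_mode_lo) / height
    return hi - math.sqrt(2.0 * (1.0 - u) * (hi - mode_hi) / height)  # falling side


rng = random.Random(2024)
# Outer limits follow Table 2 (men, situation 1) and the specificity range;
# the inner modes are assumed for illustration only.
se_men = sample_trapezoidal(0.25, 0.2625, 0.2875, 0.30, rng)
sp_all = sample_trapezoidal(0.92, 0.94, 0.98, 1.00, rng)
print(round(se_men, 3), round(sp_all, 3))
```

One draw of sensitivity per gender and one draw of specificity would be taken at the start of each of the 10 000 replications.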
Assumptions
In specifying sensitivity and specificity values, we made three simplifying assumptions. First, we assumed that the misclassification of loneliness status was nondifferential by the memory outcome for both genders. Differential exposure misclassification by the outcome, in addition to the effect modifier, may substantially increase computational complexity and is beyond the scope of this study. Second, we assumed both men and women were unlikely to overreport loneliness status, as supported by the high specificity values (~0.90) in Table 1. Third, to reduce model complexity, we assumed the bias parameters were constant over the exposure period from 1996-2004 because there is no evidence to suggest age differences in the likelihood of reporting loneliness.
Step 2: Negative and positive predictive value generation
For each situation specified above, we used Monte Carlo sampling to randomly draw a sensitivity and a specificity value from their specified distributions, and then calculated the positive predictive value (PPV) and negative predictive value (NPV). The PPV is the probability that self-reported lonely individuals are truly lonely, and the NPV is the probability that self-reported nonlonely individuals are correctly classified as truly nonlonely.15 As the NPV and PPV are functions of sensitivity, specificity, and the observed distribution of loneliness status,15 we calculated the NPV and PPV by the memory outcome within each gender group, due to their differential distributions of loneliness according to memory. Specifically, at each exposure time point from 1996-2004, we first calculated the expected number of “truly” lonely and nonlonely individuals according to memory outcomes, using the observed exposure distribution data and selected sensitivity and specificity values (equations shown in Table 3 and details in Appendix S2 and Table S1). As the memory outcome was continuous and calculation of the PPV and NPV requires dichotomous outcomes, we used the median values of the memory scores at each time point during the exposure period from 1996-2004 to identify cases (memory scores below the median) and controls (memory scores above the median). Next, we calculated the expected numbers that were true positive (TP), true negative (TN), false positive (FP), and false negative (FN), separately for the case and control groups, and then we generated the corresponding PPV and NPV for each time point from 1996-2004, using equations 1-12. This step was completed separately for men and women.
Table 3. Equations for calculating expected true loneliness distribution according to the memory outcome using the observed data, sensitivity, and specificity.^a
Group | Observed lonely | Observed nonlonely | Observed total | Expected lonely | Expected nonlonely
---|---|---|---|---|---
Case | a | b | N1 | A = [a − (1 − sp) × N1]/[se − (1 − sp)] | B = N1 − A
Control | c | d | N0 | C = [c − (1 − sp) × N0]/[se − (1 − sp)] | D = N0 − C
Abbreviations: se, sensitivity; sp, specificity.
a Cases of the outcome were defined as memory scores below the median; controls were defined as memory scores above the median. We calculated the expected a, b, c, and d cells at each time point during the exposure period from 1996-2004. Details of the equations are provided in Appendix S2 and Table S1.
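The back-calculation in Table 3 can be sketched in a few lines of Python. The function below applies the expected-count formula to one outcome group and rejects parameter draws that would yield negative expected counts, which is the situation the lower specificity limit of 0.92 is meant to avoid. The function name and the counts in the example are hypothetical.

```python
def expected_true_counts(observed_lonely, total, se, sp):
    """Back-calculate the expected numbers truly lonely and truly nonlonely in
    one outcome group (A and B for cases, or C and D for controls, in Table 3)."""
    false_positive_rate = 1.0 - sp
    if se <= false_positive_rate:
        raise ValueError("sensitivity must exceed 1 - specificity")
    truly_lonely = (observed_lonely - false_positive_rate * total) / (se - false_positive_rate)
    truly_nonlonely = total - truly_lonely
    if truly_lonely < 0 or truly_nonlonely < 0:
        # Happens, for example, when the observed prevalence is below 1 - sp.
        raise ValueError("bias parameters are incompatible with the observed data")
    return truly_lonely, truly_nonlonely


# Hypothetical group: 150 of 1000 report loneliness.
A, B = expected_true_counts(150, 1000, se=0.35, sp=0.95)
print(round(A, 1), round(B, 1))

# With a lower specificity the implied number of false positives (200)
# exceeds the 150 observed reports, so the draw is rejected.
try:
    expected_true_counts(150, 1000, se=0.35, sp=0.80)
except ValueError as err:
    print("rejected:", err)
```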
Among the case group:
Among the control group:
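Under the standard definitions, and using the expected counts A, B, C, and D from Table 3, the quantities for the two groups can be written as follows. This is a sketch in the notation of Table 3; the subscripts 1 (cases) and 0 (controls) and the arrangement into twelve expressions are assumptions, and the original numbering may differ.

```latex
\begin{align*}
TP_1 &= se \times A, & FN_1 &= (1 - se) \times A, \\
FP_1 &= (1 - sp) \times B, & TN_1 &= sp \times B, \\
PPV_1 &= \frac{TP_1}{TP_1 + FP_1}, & NPV_1 &= \frac{TN_1}{TN_1 + FN_1}, \\[6pt]
TP_0 &= se \times C, & FN_0 &= (1 - se) \times C, \\
FP_0 &= (1 - sp) \times D, & TN_0 &= sp \times D, \\
PPV_0 &= \frac{TP_0}{TP_0 + FP_0}, & NPV_0 &= \frac{TN_0}{TN_0 + FN_0}.
\end{align*}
```

As a consistency check, TP_1 + FP_1 equals the observed number of lonely cases a, and TN_1 + FN_1 equals the observed number of nonlonely cases b.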
Step 3: Record-level correction for loneliness underreporting
Based on each PPV and NPV set generated in step 2, we conducted record-level correction (ie, observation-level correction) by gender and memory outcome status to reclassify loneliness status at each time point from 1996-2004 and simulated the new bias-adjusted dataset. Specifically, for each observation from 1996-2004, we conducted a Bernoulli trial, which assumes individuals have the corresponding probability (1 − NPV among those reporting no loneliness and PPV among those reporting loneliness) of being truly lonely.15 Loneliness duration from 1996-2004 was recalculated based on the corrected loneliness status at each time point.
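A minimal Python sketch of the record-level correction for one respondent is given below. It assumes the PPV and NPV for that respondent's gender, memory outcome group, and time point are supplied by the caller; the function names and the example values are hypothetical, and the top-coding at 3 mirrors the "≥ 3 time points" category.

```python
import random


def reclassify(reported, ppv, npv, rng):
    """One Bernoulli trial: return 1 if the simulated true status is lonely.
    The success probability is PPV for those reporting loneliness and
    1 - NPV for those reporting no loneliness."""
    p_truly_lonely = ppv if reported else 1.0 - npv
    return 1 if rng.random() < p_truly_lonely else 0


def corrected_duration(reports, ppv_by_wave, npv_by_wave, rng, cap=3):
    """Reclassify each wave, then count lonely waves, top-coded at `cap`."""
    corrected = [
        reclassify(r, ppv, npv, rng)
        for r, ppv, npv in zip(reports, ppv_by_wave, npv_by_wave)
    ]
    return min(sum(corrected), cap)


rng = random.Random(7)
reports = [0, 1, 0, 0, 1]                 # five biennial waves, 1996-2004
ppv = [0.80, 0.82, 0.79, 0.81, 0.80]      # hypothetical values
npv = [0.70, 0.72, 0.71, 0.69, 0.70]      # hypothetical values
print(corrected_duration(reports, ppv, npv, rng))
```

Because most records report no loneliness, the 1 − NPV branch does most of the work when sensitivity is low: it moves some reported nonlonely observations into the lonely category.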
Step 4: Modeling effect modification by gender
Using the bias-adjusted dataset simulated in step 3, we replicated the analyses in Yu et al1 to estimate the bias-adjusted effect modification by gender of the association between loneliness duration and rate of memory decline. Consistent with Yu et al,1 we first estimated gender-stratified mixed-effects linear regression models and used $\alpha_3$ to determine the association between loneliness duration and rate of memory decline among men and women, as shown below:
where $Cov_i$ represents covariates measured in 1996 for individual i; $\varepsilon_{ij}$ represents random error for individual i at time j; $b_{0i}$ represents the random intercept for individual i; and $b_{1i}{year}_{ij}$ represents the random slope for individual i.
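One plausible form of the gender-stratified model, consistent with the symbols defined above, is sketched here. Only the role of $\alpha_3$ (the loneliness duration by time interaction) is stated in the text; the outcome and exposure names, the ordering of the other coefficients, and the single covariate term are assumptions.

```latex
\begin{equation*}
Memory_{ij} = \alpha_0 + \alpha_1\, Loneliness_i + \alpha_2\, year_{ij}
  + \alpha_3\, (Loneliness_i \times year_{ij}) + \alpha_4\, Cov_i
  + b_{0i} + b_{1i}\, year_{ij} + \varepsilon_{ij}
\end{equation*}
```

Here $\alpha_3$ is the difference in the annual rate of memory change per additional unit of loneliness duration, so a negative value indicates faster decline with longer loneliness.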
The subgroup analyses through the simulation process were used to illustrate the magnitude and direction of the information bias across gender groups only. We also conducted additional pooled analyses to derive gender-specific simulation estimates as a sensitivity analysis, which generated similar results (Figure S2).
We then conducted a pooled mixed-effects linear analysis with a 3-way interaction on the additive scale between loneliness duration, years of follow-up, and gender to test the effect modification by gender ($\beta_7$), as shown below:
where $Cov_i$ represents covariates measured in 1996 for individual i; $\varepsilon_{ij}$ represents random error for individual i at time j; $b_{0i}$ represents the random intercept for individual i; and $b_{1i}{year}_{ij}$ represents the random slope for individual i.
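A plausible form of the pooled model, again offered as a sketch, places the 3-way interaction at $\beta_7$ after the three main effects and three 2-way interactions. Only the role of $\beta_7$ is stated in the text; the ordering of $\beta_1$ through $\beta_6$, the variable names, and the single covariate term are assumptions.

```latex
\begin{align*}
Memory_{ij} ={}& \beta_0 + \beta_1\, Loneliness_i + \beta_2\, year_{ij} + \beta_3\, Gender_i \\
 &+ \beta_4\, (Loneliness_i \times year_{ij}) + \beta_5\, (Loneliness_i \times Gender_i)
  + \beta_6\, (year_{ij} \times Gender_i) \\
 &+ \beta_7\, (Loneliness_i \times year_{ij} \times Gender_i) + \beta_8\, Cov_i
  + b_{0i} + b_{1i}\, year_{ij} + \varepsilon_{ij}
\end{align*}
```

Under this form, $\beta_7$ is the difference between the two gender groups in the loneliness duration by time slope, which corresponds to the difference between the stratum-specific $\alpha_3$ values.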
Consistent with the prior study,1 we first coded loneliness duration as a continuous variable (ranging from 0 to 3) and conducted the analyses to test the overall linear trend for effect modification. We then treated loneliness duration as a categorical variable (never; 1 time point; 2 time points; ≥ 3 time points) and repeated the analyses (details in Appendix S3 and Figure S3).
We repeated steps 2-4 a total of 10 000 times to generate simulation estimates and 95% simulation intervals (SIs) for each simulated situation specified in step 1. We accounted for random error by subtracting the product of the standard error of the traditional unadjusted estimate and a random value of a standard normal deviate from the simulation estimates.15 We reported the 50th percentile of the bias-adjusted estimate distribution (median) as the simulation estimate, and the 2.5th and 97.5th percentiles of the estimate distribution as the 95% SI. We compared the bias-adjusted simulation estimates with the unadjusted estimate from the analyses in Yu et al (2023), and used the proportion of simulation estimates ($\beta_7$) below the null as the likelihood of observing effect modification by gender. Sample Stata code for our simulation approach is provided in Appendix S4.
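The summary step can be sketched in Python as follows. The two helpers add random error to each bias-adjusted estimate and then report the median, the 2.5th and 97.5th percentiles, and the proportion of estimates below the null. The function names are assumptions, and the input values in the example are placeholders rather than study results; in a real run each replication would supply its own bias-adjusted estimate from step 4.

```python
import random
import statistics


def add_random_error(bias_adjusted, conventional_se, rng):
    """Subtract (standard error x standard normal deviate) from one
    bias-adjusted estimate to incorporate random error."""
    return bias_adjusted - conventional_se * rng.gauss(0.0, 1.0)


def summarize(estimates):
    """Median, 95% simulation interval, and proportion of estimates below zero."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(estimates, n=40, method="inclusive")
    return {
        "estimate": statistics.median(estimates),
        "si_low": cuts[0],
        "si_high": cuts[-1],
        "p_below_null": sum(e < 0 for e in estimates) / len(estimates),
    }


rng = random.Random(1)
# Placeholder inputs: a constant bias-adjusted value and standard error.
simulated = [add_random_error(-0.010, 0.0025, rng) for _ in range(10_000)]
print(summarize(simulated))
```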
Results
Figure 1 and Table S2 provide unadjusted estimates and bias-adjusted simulation estimates and 95% SIs for the association between loneliness duration (continuous) and rate of memory decline among men and women. The estimates moved away from the null among both men and women as sensitivity declined, suggesting that loneliness underreporting that is nondifferential by the outcome could bias the estimate towards the null for both genders, as expected. However, the differences between the bias-adjusted estimates and the unadjusted estimate were larger in magnitude among women compared with men.

Figure 1. Gender-specific simulation estimates and 95% simulation intervals (SIs) for the interaction between loneliness exposure duration (continuous) and years of follow-up, the US Health and Retirement Study (n = 9032), 1996-2016. These estimates were from gender-stratified subgroup analyses: A) men; B) women. We conducted additional pooled analyses to derive gender-specific simulation estimates (Figure S2). Results were consistent with those shown in Figure 1.
Figure 2 and Table S3 provide the unadjusted estimate and bias-adjusted simulation estimates and 95% SIs for the 3-way interaction between loneliness exposure duration (continuous), years of follow-up, and gender. In all three simulation scenarios, the likelihood of observing effect modification by gender remained above 90%. In the first simulation scenario, where the differences in the sensitivity values between men and women were held at 0.1, the bias-adjusted estimates for the 3-way interaction moved away from the null as sensitivity decreased, indicating an increasing magnitude of effect modification by gender as sensitivity of the loneliness exposure measure decreased (Figure 2A). The unadjusted 3-way interaction was −0.008 (95% CI, −0.013 to −0.003), suggesting that the association between loneliness duration and rate of memory decline was stronger among women than among men. The bias-adjusted simulation estimate increased in magnitude to −0.010 (95% SI, −0.016 to −0.003) when sensitivity values were set as 0.25-0.30 among men and 0.35-0.40 among women. The unexpected direction of the bias in the 3-way interaction could be largely driven by the large differences between the bias-adjusted estimates and the unadjusted estimate among women, as shown in Figure 1.

Figure 2. Simulation estimates and 95% simulation intervals (SIs) for the 3-way interaction between loneliness exposure duration (continuous), years of follow-up, and gender in each of the three simulation scenarios, the US Health and Retirement Study (n = 9032), 1996-2016. A) First scenario; B) second scenario; C) third scenario. The % below the null represents the proportion of the simulation estimates below the null in the 10 000 replications for each situation.
Results from the second and third scenarios indicate that the likelihood of observing effect modification by gender is slightly lower when the difference in sensitivity between men and women is greater (Figure 2B-2C and Table S3). In the second scenario, where the sensitivity values for men were held constant at 0.30-0.35, the likelihood of observing effect modification by gender declined as the sensitivity values among women increased (Figure 2B). In the third scenario, where the sensitivity values for women were held constant at 0.95-1.00, the simulation estimates for effect modification by gender were adjusted towards the null as the sensitivity values among men decreased. Results in the third simulation scenario indicate that when women did not underreport loneliness status, loneliness underreporting among men may bias the estimates for the 3-way interaction away from the null, although by a small magnitude (Figure 2C).
Figure 3 and Table S4 provide the unadjusted estimates and bias-adjusted simulation estimates for the association between loneliness duration and rate of memory decline among men and women, with loneliness duration coded as a 4-level categorical variable. The direction of information bias arising from nondifferential loneliness underreporting by memory outcome measures was not consistent across loneliness duration exposure level: Nondifferential loneliness underreporting consistently biased the estimate for the highest exposure level (loneliness at ≥ 3 time points) towards the null among both men and women, while the direction for the middle exposure levels (loneliness at 1 time point and 2 time points) varied by sensitivity values and by gender.

Figure 3. Gender-specific simulation estimates for the interaction between loneliness exposure duration (categorical) and years of follow-up, the US Health and Retirement Study (n = 9032), 1996-2016. These estimates were from gender-stratified subgroup analyses: A) men; B) women. We conducted additional pooled analyses to derive gender-specific simulation estimates (Figure S3). Results were consistent with those shown in Figure 3.
Figure 4 and Table S5 provide the unadjusted estimate and bias-adjusted simulation estimates and 95% SIs for the 3-way interaction between loneliness exposure duration (categorical), years of follow-up, and gender. The overall effect modification by gender was largely driven by the differences in the estimates of loneliness at ≥ 3 time points among men and women. The direction of the effect modification by gender for the middle-level loneliness exposure categories was not consistent with that for the highest exposure level, indicating the complexity of the nondifferential misclassification of categorical exposures.

Figure 4. Simulation estimates and 95% simulation intervals (SIs) for the 3-way interaction between loneliness exposure duration (categorical; 1, 2, and ≥ 3 time points in the respective columns), years of follow-up, and gender in each of the three simulation scenarios (first, second, and third scenarios in the respective rows), the US Health and Retirement Study, 1996-2016 (n = 9032). The % below the null represents the proportion of the simulation estimates below the null in the 10 000 replications for each situation.
Discussion
We conducted probabilistic bias analyses to quantify the potential impact of exposure misclassification in an effect modification analysis, where the exposure misclassification was differential according to the effect modifier. We simulated three scenarios for various magnitudes of loneliness underreporting between men and women to examine the robustness of a previously observed effect modification by gender.1 Although the likelihood of observing effect modification by gender slightly decreased as the gap in sensitivity between men and women increased, the likelihood remained above 90% in all simulation scenarios when coding loneliness exposure duration as a continuous variable. When loneliness duration was coded as a categorical variable, the direction of bias was unpredictable and complex for the middle levels of the loneliness-exposure duration categories.
This study provides three main contributions to the literature. First, while there is a precedent for quantitative bias analyses to investigate the potential information bias when exposure misclassification is differential or nondifferential according to the outcome,16-20 this study is one of the first to investigate exposure misclassification that is differential according to an effect modifier. Information bias in the context of effect modification is understudied in epidemiology, and this study represents an early approach to quantifying its potential impact. Second, this study provides an approach to probabilistic bias analysis for information bias when the exposure is an inherently subjective measure, and no obvious gold-standard data are available. We were able to use an alternative form of a self-report as a “best-available” gold standard to inform our analysis, but we ultimately simulated a wide range of bias parameters to account for uncertainty. Third, we provide an approach for record-level corrections in probabilistic bias analyses using longitudinal time-varying exposure measures, including sample Stata code (Appendix S4) to aid other analysts.
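The general idea of a record-level correction can be sketched in Python for a single wave of a binary exposure. This is only an illustrative sketch and not the Stata code in Appendix S4: the function names, the uniform sensitivity ranges, the perfect specificity, and the toy data are all hypothetical assumptions, not the bias parameters used in this study. Each observed record is reclassified using the positive and negative predictive values implied by the observed prevalence and a drawn sensitivity and specificity.

```python
import random

def predictive_values(p_obs, se, sp):
    """Return (PPV, NPV) implied by an observed prevalence and assumed Se/Sp.

    Assumes se + sp > 1 (the measure performs better than chance).
    """
    # Back-calculate true prevalence from p_obs = se*p + (1 - sp)*(1 - p).
    p_true = (p_obs - (1.0 - sp)) / (se + sp - 1.0)
    p_true = min(max(p_true, 0.0), 1.0)  # keep within [0, 1]
    ppv = min(se * p_true / p_obs, 1.0) if p_obs > 0 else 0.0
    npv = min(sp * (1.0 - p_true) / (1.0 - p_obs), 1.0) if p_obs < 1 else 0.0
    return ppv, npv

def reclassify(observed, se, sp, rng):
    """Draw a bias-adjusted exposure for each record (1 = lonely, 0 = not)."""
    p_obs = sum(observed) / len(observed)
    ppv, npv = predictive_values(p_obs, se, sp)
    adjusted = []
    for x in observed:
        if x == 1:
            adjusted.append(1 if rng.random() < ppv else 0)
        else:
            adjusted.append(0 if rng.random() < npv else 1)
    return adjusted

def simulate(observed_by_gender, se_ranges, sp, n_reps, seed=0):
    """Repeat the correction, drawing a gender-specific sensitivity each time."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_reps):
        rep = {}
        for gender, observed in observed_by_gender.items():
            se = rng.uniform(*se_ranges[gender])
            adjusted = reclassify(observed, se, sp, rng)
            # A full analysis would refit the outcome model on the adjusted
            # data here; this sketch only stores the adjusted prevalence.
            rep[gender] = sum(adjusted) / len(adjusted)
        results.append(rep)
    return results

# Toy data and HYPOTHETICAL sensitivity ranges (men assumed to underreport more).
observed = {"women": [1] * 20 + [0] * 80, "men": [1] * 15 + [0] * 85}
se_ranges = {"women": (0.7, 0.9), "men": (0.4, 0.6)}
reps = simulate(observed, se_ranges, sp=1.0, n_reps=1000)
```

Summarizing a quantity of interest across the replications (for example, its median and 2.5th and 97.5th percentiles) would then give a simulation estimate and simulation interval.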
Our main finding that differential exposure misclassification according to the effect modifier did not meaningfully affect the effect modification analysis was unexpected. However, the bias-adjusted estimates from the gender-stratified models were adjusted in the expected direction, away from the null, in line with current literature on exposure misclassification.21-23 When the sensitivity difference between men and women was held constant in the first simulation scenario, the absolute differences in the bias-adjusted estimates among men were systematically lower than those among women, leading to stronger effect modification by gender as measured by the 3-way statistical interaction on the additive scale. Further analyses are warranted to test whether the effect modification measured by statistical interaction on the multiplicative scale (eg, comparison of dementia risk due to loneliness on the risk ratio scale) would be more sensitive to loneliness underreporting by gender than it would be on the additive scale.
Our findings indicate that the direction of information bias due to nondifferential exposure misclassification should be examined with caution, and that analysts should not simply rely on the “bias towards the null” heuristic.22 Although nondifferential exposure misclassification (by the outcome) generally attenuates estimates towards the null when the variable is binary or continuous, the exposure is not rare, and the sample size is large,21,24 prior studies have documented exceptions in which nondifferential exposure misclassification by the outcome may bias the estimate away from the null.22,25 Differential loneliness underreporting between men and women did not meaningfully affect the observed effect modification by gender in our simulation analyses. Nevertheless, this study makes a novel contribution to the literature by indicating that even when exposure misclassification is nondifferential by the outcome within each effect modifier subgroup, differential exposure misclassification across subgroups of an effect modifier may bias the estimates for effect modification away from the null in certain situations, such as our third simulation scenario, in which only one of the effect-modifier groups misclassified exposure status. Indeed, only a few studies have conducted a stratified bias analysis across key demographic groups (eg, race and age).26 As the nature and degree of information bias may vary across subgroups, further development of methods for investigating information bias in the context of effect modification is warranted.
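A simplified numerical illustration may help show how subgroup-specific misclassification can distort an interaction estimate. The sketch below assumes a much simpler setting than our longitudinal models: a binary exposure, a difference in outcome means, and misclassification that is independent of the outcome given true exposure. Under those assumptions the expected observed difference equals the true difference multiplied by (PPV + NPV − 1). All function names and input values are hypothetical.

```python
def attenuation_factor(prev, se, sp):
    """PPV + NPV - 1 for a binary exposure with true prevalence `prev`."""
    ppv = se * prev / (se * prev + (1.0 - sp) * (1.0 - prev))
    npv = sp * (1.0 - prev) / (sp * (1.0 - prev) + (1.0 - se) * prev)
    return ppv + npv - 1.0

def observed_interaction(true_w, true_m, factor_w, factor_m):
    """Women-minus-men difference in the expected observed exposure effects."""
    return true_w * factor_w - true_m * factor_m

# HYPOTHETICAL inputs: identical true effects, so the true interaction is 0.
true_effect = -0.10
f_women = attenuation_factor(prev=0.30, se=1.0, sp=1.0)  # no misclassification
f_men = attenuation_factor(prev=0.30, se=0.5, sp=1.0)    # only men underreport
spurious = observed_interaction(true_effect, true_effect, f_women, f_men)
print(f_women, round(f_men, 3), round(spurious, 4))
```

With these illustrative inputs, the effect among men is attenuated while the effect among women is not, so a nonzero women-minus-men difference appears even though the true effects are identical. Whether the observed interaction moves towards or away from the null in practice depends on the true effects and the bias parameters in each subgroup.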
This study provides an example of the complexity of nondifferential exposure misclassification by the outcome, especially when a continuous exposure variable is converted into a categorical variable.27,28 Consistent with existing simulation research,28 our sensitivity analyses demonstrated that the direction of the bias in estimation for the middle exposure categories could be either away from or towards the null, even when the misclassification was nondifferential by the outcome. This finding may arise because the direction of bias for categorical exposure misclassification depends on many factors, such as the magnitude of true effects, the misclassification rate, and the exposure distribution.27 As it is not uncommon to categorize a continuous exposure using cutoff values (eg, body mass index and dietary intake), failure to recognize nondifferential exposure misclassification in categorical variables may result in incorrect conclusions. Further research is warranted to incorporate quantitative bias analysis to quantify the direction of information bias due to misclassification of categorical variables.27,28
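The behavior of the middle categories can be illustrated with a small expected-value calculation. The sketch below is a hypothetical three-category example, not a reanalysis of our data: the prevalences, outcome means, and misclassification matrix are invented for illustration, and misclassification is assumed to be independent of the outcome given the true category.

```python
def observed_category_means(prev, means, misclass):
    """Expected outcome mean in each OBSERVED exposure category.

    prev[i]        : true proportion in category i
    means[i]       : true outcome mean in category i
    misclass[i][j] : P(observed category j | true category i),
                     assumed independent of the outcome
    """
    k = len(prev)
    observed = []
    for j in range(k):
        weight = sum(prev[i] * misclass[i][j] for i in range(k))
        total = sum(prev[i] * misclass[i][j] * means[i] for i in range(k))
        observed.append(total / weight)
    return observed

# HYPOTHETICAL threshold effect: only the highest category differs from the reference.
prev = [0.6, 0.2, 0.2]
means = [0.0, 0.0, -1.0]
# Some of the middle category is recorded as highest, and vice versa.
misclass = [
    [1.0, 0.0, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.3, 0.7],
]
obs = observed_category_means(prev, means, misclass)
middle_vs_ref = obs[1] - obs[0]   # true contrast is 0
highest_vs_ref = obs[2] - obs[0]  # true contrast is -1
print(round(middle_vs_ref, 3), round(highest_vs_ref, 3))
```

In this toy setting the contrast for the highest category is pulled towards the null, while the contrast for the middle category moves away from its true value of zero, even though the misclassification does not depend on the outcome.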
This study has limitations. First, we did not adjust for other sources of bias (eg, residual confounding bias), as they are beyond the scope of this study. Second, our results cannot reflect the true estimates that would have been observed if the exposure had been correctly specified, due to the unknown bias parameters for the single-item loneliness measure. We made three simplifying assumptions in specifying the bias parameters, which cannot be confirmed or falsified. Thus, our results should be cautiously interpreted with these assumptions in mind. Finally, the second and third simulation scenarios represent extreme and potentially unrealistic situations. However, these simulation scenarios help to demonstrate the theoretically plausible magnitudes and directions of information bias in this effect modification analysis.
Conclusion
This study applied probabilistic bias analyses to examine the robustness of the observed effect modification of the loneliness–memory aging relationship by gender. To the best of our knowledge, this is one of the first studies to investigate the potential for information bias in an effect modification analysis due to exposure misclassification that is differential according to an effect modifier variable. Due to the complexity of exposure misclassification, especially in the context of effect modification, further research is warranted to develop applications of quantitative bias analysis to quantify the direction, magnitude, and uncertainty of information bias in epidemiology.
Supplementary material
Supplementary material is available at American Journal of Epidemiology online.
Funding
The Health and Retirement Study is funded by the National Institute on Aging (U01AG009740) and performed at the Institute for Social Research, University of Michigan. Dr. Kobayashi is supported by the National Institute on Aging at the National Institutes of Health (grants R01AG069128 and R01AG070953).
Conflict of interest
The authors declare no conflicts of interest.
Data availability
The HRS data are publicly available at https://hrs.isr.umich.edu/about.