Charlotte Probst, Charlotte Buckley, Aurélie M. Lasserre, William C. Kerr, Nina Mulia, Klajdi Puka, Robin C. Purshouse, Yu Ye, Jürgen Rehm, Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project: Study Design and First Results, American Journal of Epidemiology, Volume 192, Issue 5, May 2023, Pages 690–702, https://doi.org/10.1093/aje/kwad018
Abstract
Since about 2010, life expectancy at birth in the United States has stagnated and begun to decline, with concurrent increases in the socioeconomic divide in life expectancy. The Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project uses a novel microsimulation approach to investigate the extent to which alcohol use, socioeconomic status (SES), and race/ethnicity contribute to unequal developments in US life expectancy and how alcohol control interventions could reduce such inequalities. Representative, secondary data from several sources will be integrated into one coherent, dynamic microsimulation to model life-course changes in SES and alcohol use and cause-specific mortality attributable to alcohol use by SES, race/ethnicity, age, and sex. Markov models will be used to inform transition intensities between levels of SES and drinking patterns. The model will be used to compare a baseline scenario with multiple counterfactual intervention scenarios. The preliminary results indicate that the crucial microsimulation component provides a good fit to observed demographic changes in the population, providing a robust baseline model for further simulation work. By demonstrating the feasibility of this novel approach, the SIMAH Project promises to offer superior integration of relevant empirical evidence to inform public health policy for a more equitable future.
Abbreviations
- ACS: American Community Survey
- BRFSS: Behavioral Risk Factor Surveillance System
- CPS: Current Population Survey
- NESARC: National Epidemiologic Survey on Alcohol and Related Conditions
- PSID: Panel Study of Income Dynamics
- RMSE: root mean squared error
- SES: socioeconomic status
- SIMAH: Simulation of Alcohol Control Policies for Health Equity
- UI: uncertainty interval
- YLL: years of potential life lost
The Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project uses a novel microsimulation approach to investigate the extent to which alcohol use, socioeconomic status (SES), and race/ethnicity contribute to unequal developments in US life expectancy and how alcohol control interventions could reduce such inequalities. Microsimulation approaches have been successfully used to analyze population impacts of public health interventions (1–4). However, the application of microsimulation techniques to matters of public health has only recently picked up speed (5, 6). This project will be the first, to the best of our knowledge, to use microsimulation to 1) model alcohol-attributable mortality on the population level and 2) estimate policy impacts.
Since World War II, mortality rates in the United States have been generally decreasing, and overall, individuals in later generations could expect to live longer and healthier lives than their predecessors (7). However, in recent years, life expectancy at birth has stagnated and begun to decline in the United States, even before the coronavirus disease 2019 (COVID-19) pandemic (8, 9).
One potential explanation for these recent trends is increases in rates of premature mortality among specific demographic subgroups described by SES, race/ethnicity, age, and sex (8). In particular, within some age groups of non-Hispanic White individuals, American Indian/Alaska Native individuals, and low-SES individuals, all-cause mortality rates have increased by more than 1% annually over the past 15 years (10, 11).
Among the causes of death that are contributing most towards reductions in life expectancy since 2010 are poisoning, suicide, motor-vehicle–related and other unintentional injuries, liver disease and cirrhosis, and diabetes mellitus (12)—all of which are causally linked to alcohol use (13). Apart from being one of the most important risk factors for premature mortality (14), alcohol use also contributes to socioeconomic inequalities in mortality (15–18). The alcohol-attributable mortality risk follows a continuous dose-response relationship with SES (17), and the mortality gap between low SES and high SES is up to twice as large for alcohol-attributable mortality as for all-cause mortality (18). Identifying and addressing alcohol-related health disparities is an important health equity issue. The alcohol-attributable disease burden and alcohol-attributable mortality are largely preventable, and there are several effective population- and individual-level alcohol control interventions that can reduce the harmful use of alcohol (19–21). The most cost-effective interventions at the population level are the regulation of prices through taxation or minimum unit prices (a floor price level for the retail sale of a beverage per unit of ethanol) (22) and restriction of the commercial and public availability of alcohol (23–25). On the individual level, screening and brief interventions are effective in decreasing the prevalence of harmful alcohol use (26, 27). While the overall effectiveness of such alcohol control interventions is firmly established, little attention has been paid to the equitable distribution of their health benefits and their ability to reduce health inequalities (28–30).
OBJECTIVES
Objective 1 of the SIMAH Project is to investigate the extent to which alcohol use affected mortality rates underlying the recent stagnation and declines in US life expectancy and the increasing inequalities in life expectancy using a detailed microsimulation model of alcohol-attributable mortality in the United States on the national level, as well as for 15 selected states. Objective 2 is to inform policy design by modeling future mortality reductions in different SES and racial/ethnic groups for alcohol control intervention strategies on a 10-year intervention planning horizon. This article outlines the SIMAH study design and presents initial results on generating a synthetic population that is representative of the adult US population in the year 2000 and simulating dynamics in the population over time under objective 1. No preliminary results will be presented for objective 2, which will be addressed in the later years of the project.
METHODS
Study design and operationalizations
The design of the SIMAH Project (Figure 1) is based on an individual-level microsimulation approach (2, 31) to model cause-specific mortality attributable to alcohol use for the years 2000 through 2018, with a forward modeling time horizon until 2028. The microsimulation approach can systematically integrate disparate data sources into one coherent, dynamic simulation model of the individuals in a population that can be used to compare a baseline with multiple counterfactual scenarios (6). The individuals in the microsimulation constitute a synthetic baseline population that is representative of the real US population on the national and state levels. The advantage of microsimulation models is that they are “able to deal with detail complexity by simulating the life histories of individuals and then estimating the population effect from the sum of the individual effects” (32, p. 326). The 4 interventions to be modeled are alcohol taxation, minimum unit pricing, regulation of the availability of alcohol, and alcohol screening and brief interventions.

Figure 1. Study design, analysis steps, and data sources used in the Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project. ACS, American Community Survey; APIS, Alcohol Policy Information System; BRFSS, Behavioral Risk Factor Surveillance System; NESARC, National Epidemiologic Survey on Alcohol and Related Conditions; PSID, Panel Study of Income Dynamics; SES, socioeconomic status; YLL, years of potential life lost.
The target population of the study is the adult (ages ≥18 years) general population of the United States. For state-level modeling, 15 states will be used (California, Colorado, Florida, Indiana, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Missouri, New York, Oregon, Pennsylvania, Tennessee, and Texas), covering more than half of the adult US population and all 9 US Census divisions.
Educational attainment will be used as the main measure of SES. Although educational attainment is just one of several possible operationalizations of SES, it is the only one for which national mortality statistics are available. Furthermore, a lower educational level is consistently associated with heavier alcohol drinking across race/sex groups (33) and steep increases in the risk of alcohol-attributable mortality (17). Educational attainment will be categorized into high school diploma or less, some college, and college degree or more. Income, occupation, and employment status will be used in sensitivity analyses as additional indicators of SES. Race/ethnicity will be categorized as non-Hispanic White, non-Hispanic Black, Hispanic, and other. A more detailed grouping will not be possible for the main analyses because of the small sample size. However, sensitivity analyses will be performed with separate categories for American Indian/Alaska Native and Asian/Pacific Islander.
Alcohol consumption will be categorized into 5 discrete drinking patterns based on average grams of absolute alcohol per day (hereafter called g/day), according to the standards of the World Health Organization (34): abstainers (for the past 12 months); category 1, comprising ≤20/≤40 g/day for women/men; category 2, comprising 21–40/41–60 g/day for women/men; category 3, comprising 41–60/61–100 g/day for women/men; and category 4, comprising >60/>100 g/day for women/men. Consumption by beverage will be represented as the proportions of beer, coolers, wine, and spirits of an individual’s total consumption (28).
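As an illustration of the categorization above, the following minimal Python sketch maps average daily intake to one of the 5 drinking patterns. The function name and the string labels are hypothetical, and two simplifications are assumed: 0 g/day stands in for past-12-month abstention, and fractional values between the stated integer bounds (e.g., 20.5 g/day for women) are assigned to the higher category.

```python
def drinking_category(grams_per_day: float, sex: str) -> str:
    """Classify average daily alcohol intake into the 5 patterns in the text.

    `sex` is "female" or "male". Treating 0 g/day as abstention is an
    assumption made for this sketch only.
    """
    # Upper bounds (g/day) of categories 1-3; anything above is category 4.
    upper_bounds = {"female": (20, 40, 60), "male": (40, 60, 100)}
    if sex not in upper_bounds:
        raise ValueError(f"unknown sex: {sex!r}")
    if grams_per_day < 0:
        raise ValueError("grams_per_day must be non-negative")
    if grams_per_day == 0:
        return "abstainer"
    for number, bound in enumerate(upper_bounds[sex], start=1):
        if grams_per_day <= bound:
            return f"category {number}"
    return "category 4"
```

The same intake can therefore fall into different categories by sex: 41 g/day is category 3 for women but category 2 for men.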
Years of potential life lost (YLL) before the age of 75 will be the outcome measure. The following 9 cause-of-death categories will be investigated: 1) alcohol use disorders (including alcohol poisoning and other 100% alcohol-attributable causes of death); 2) motor vehicle accidents; 3) other unintentional injuries; 4) suicide; 5) liver disease and cirrhosis; 6) diabetes mellitus; 7) ischemic heart disease; 8) ischemic stroke; and 9) hypertensive heart disease. International Classification of Diseases, Tenth Revision, codes are shown in Web Table 1 (available at https://doi.org/10.1093/aje/kwad018). Causes of death were selected to include major causes of death for which alcohol use is a risk factor (13).
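The outcome measure can be made concrete with a short sketch. It assumes the usual definition of YLL before a reference age (each death contributes the years remaining until age 75, and deaths at or after 75 contribute nothing); the function names and the per-100,000 scaling are illustrative choices, not taken from the project code.

```python
def yll_before_75(ages_at_death, reference_age=75):
    """Sum the years of potential life lost before `reference_age`.

    Deaths at or above the reference age contribute zero years.
    """
    return sum(max(0, reference_age - age) for age in ages_at_death)


def yll_rate_per_100k(ages_at_death, population, reference_age=75):
    """YLL per 100,000 population (crude, i.e., not age-standardized)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return yll_before_75(ages_at_death, reference_age) / population * 100_000
```

For example, deaths at ages 50, 74.5, and 80 contribute 25, 0.5, and 0 years, respectively.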
Data sources
Several data sources will be used and integrated to inform all microsimulation parameters (Figure 1). The data sources are summarized in Table 1; details are given in Web Appendix 1.
Table 1. Data Sources Used in the Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project
Data Source^a and Host | Domain | Frequency and Years of Data Collection | Study Design | Sampling |
---|---|---|---|---|
US Census; Bureau of the Census, US Department of Commerce (74) | Population | Two time points; 2000 and 2010 | Cross-sectional | Full assessment of the US population |
American Community Survey; US Census Bureau (37) | Population, migration | Annual; 2000–2018 | Cross-sectional | US civilian population; representative on the national and state levels. Institutionalized populations and people living in group quarters have been included since 2006. |
Current Population Survey; US Census Bureau and Bureau of Labor Statistics, US Department of Labor (35, 36) | Population | Annual; 2000–2018 | Cross-sectional | US civilian, noninstitutionalized population; representative on the national and state levels |
Panel Study of Income Dynamics; University of Michigan (Ann Arbor, Michigan) (38, 39) | Education transitions | Biennial; 1999–2019 | Cohort | US civilian, noninstitutionalized population; nationally representative |
National Vital Statistics System; CDC (40) | Mortality | Annual; 2000–2018 | Registry | Full assessment of individual death records |
Behavioral Risk Factor Surveillance System; CDC (41) | Alcohol exposure, alcohol control interventions | Annual; 2000–2018 | Cross-sectional | US civilian, noninstitutionalized population; representative on the national and state levels |
NESARC; NIAAA (42, 43) | Alcohol exposure | Three time points: 2001–2002, 2004–2005, and 2012–2013 (NESARC I, II, and III) | Longitudinal (NESARC I and II), cross-sectional (NESARC III) | US civilian, noninstitutionalized population; nationally representative |
Per capita alcohol consumption; N/A (44) | Alcohol exposure | Annual; 2003–2016 | Modeled estimates | National and state-level estimates |
Alcohol Policy Information System; NIAAA (48) | Alcohol control interventions | Exact dates; 2000–2018 | Collated information | State-level information |
Abbreviations: CDC, Centers for Disease Control and Prevention; N/A, not applicable; NESARC, National Epidemiologic Survey on Alcohol and Related Conditions; NIAAA, National Institute on Alcohol Abuse and Alcoholism; SIMAH, Simulation of Alcohol Control Policies for Health Equity.
^a All data sources will include adults aged 18 years or older.
Population estimates for each subgroup (defined by SES, race/ethnicity, age, and sex) will be based on decennial US Census data, the annual American Community Survey (ACS), and the annual Current Population Survey (CPS) (March CPS Income Supplement, hereafter called March CPS) (35–37). Transitions between levels of educational attainment by subgroup will be informed by data from the Panel Study of Income Dynamics (PSID) (38, 39).
Cause-specific mortality estimates in each subgroup will be based on individual death records obtained from the National Vital Statistics System (40).
Individual-level data on frequency and quantity of alcohol consumption from the Behavioral Risk Factor Surveillance System (BRFSS) will be used to inform alcohol exposure in each subgroup (41). Data from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC; waves I and II) will be used to inform transition probabilities between drinking patterns by subgroup (42, 43). All 3 waves of the NESARC (I–III) will be used to impute beverage preferences by subgroup. The aggregate total adult per capita consumption, in liters of absolute alcohol (44), will be used to adjust for underreporting of quantity and frequency of alcohol use in population surveys (45).
Note that information on race and ethnicity is based on self-reporting in survey and census data (ACS, March CPS, BRFSS, NESARC, and US Census). In the PSID, race and ethnicity variables are only assessed for household heads and their wives/partners (self-report). The relationship of each participant to the household head will be used to reassign race and ethnicity for all participants who are not the household head or the head’s wife/partner (see Web Appendix 1 for details). Mortality data are based on death records, for which race/ethnicity information is typically filled out by the funeral director, who is asked to consult the decedent’s next of kin but may instead in some cases rely only on observation (46, 47).
Data on alcohol control interventions required for objective 2 will come from the BRFSS to inform the current state regarding the coverage of screening and brief intervention for harmful alcohol use. State-level data on the current implementation of alcohol taxation, minimum unit pricing, and regulation of alcohol availability will be obtained from recent work in the literature and from the Alcohol Policy Information System (48, 49).
Systematic literature reviews for parameter elicitation
Two systematic literature reviews will be performed to elicit model parameters of the microsimulation. Preference for inclusion will be given to high-quality studies based on US data reporting on parameters that are specific to SES, race/ethnicity, and sex. The first review will focus on relative risks of alcohol use in cause-specific mortality (objective 1); the second review will be performed to elicit evidence-based alcohol control intervention effects on individual drinking behavior (objective 2).
Statistical analysis
Objective 1.
A dynamic microsimulation model will be used to model life-course changes in SES and alcohol use and cause-specific mortality attributable to alcohol use by SES, race/ethnicity, age, and sex (50, 51). The baseline date of the microsimulation will be January 1, 2000.
Synthetic population.
The synthetic baseline population will be representative of the US population on the national and state levels in 2000 regarding the joint frequency distribution of SES, race/ethnicity, age, sex, and drinking patterns. The synthetic population will be created using iterative proportional fitting (52, 53), combining multiple data sources (Figure 1).
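To illustrate the idea behind iterative proportional fitting, here is a minimal two-dimensional sketch in plain Python. It is not the project's implementation: the actual synthetic population involves more dimensions and data sources, and the function name, tolerance, and example numbers are assumptions made for illustration. A seed table (e.g., a survey cross-tabulation) is rescaled alternately by rows and by columns until its margins match the target totals.

```python
def ipf_2d(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Rescale `seed` so its row and column sums match the target margins.

    `seed` is a list of rows; the two target lists must have the same total.
    """
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iter):
        # Row step: scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [cell * target / row_sum for cell in table[i]]
        # Column step: scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match exactly; stop once rows match as well.
        gap = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if gap < tol:
            break
    return table


# Hypothetical example: a 2x2 seed fitted to margins that both sum to 100.
fitted = ipf_2d([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [30.0, 70.0])
```

The fitted table preserves the association structure of the seed (its odds ratios) while reproducing the target margins, which is why the method is suited to combining a detailed survey with census totals.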
Microsimulation.
The microsimulation model will progress the synthetic population forward in time by adding new individuals to account for births and inward migration and removing individuals to account for deaths and outward migration (6). It will use transition probabilities to inform individual trajectories through levels of SES and drinking patterns. Annual transition probabilities will be estimated using continuous-time, multistate Markov models with transition intensity covariates adjusted for SES, race/ethnicity, age, and sex (54).
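In a continuous-time Markov model, the transition probabilities over an interval of length t follow from the transition intensity matrix Q as P(t) = exp(Qt). The sketch below shows that conversion in plain Python, using a truncated Taylor series with scaling and squaring. The intensity values are invented for illustration and are not estimates from the PSID or NESARC, and covariate effects on the intensities are omitted.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t=1.0, squarings=10, terms=20):
    """Return P(t) = exp(Q t) for intensity matrix `q` (rows sum to zero)."""
    n = len(q)
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    # Taylor series of exp(step): I + step + step^2/2! + ...
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    # Undo the scaling: exp(Q t) = exp(Q t / 2^s) ^ (2^s).
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical annual intensities for 3 education levels (no downward moves):
# high school or less -> some college -> college degree or more.
Q = [[-0.05, 0.05, 0.0],
     [0.0, -0.10, 0.10],
     [0.0, 0.0, 0.0]]
P = transition_probabilities(Q, t=1.0)
# With no return to the first state, staying there has a closed form.
expected_stay = math.exp(Q[0][0])
```

Each row of P is a probability distribution over next year's state, so a simulated individual can be moved by drawing from the row for their current state.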
Simulation runs and uncertainty estimation.
The microsimulation will be calibrated within a Bayesian probabilistic framework that accounts for uncertainty in the evidence base. In this approach, the estimated transition probabilities for demographic change, drinking pattern change, and mortality are interpreted as prior beliefs about the true value of these model parameters. An approximate Bayesian computation (55) approach will be used to estimate posterior beliefs about these model parameters based on observed population-level change from an independent source (i.e., target data). Specifically, we will condition beliefs about model parameters using an “implausibility” goodness-of-fit metric that accounts for the observed differences between simulated data and target data, as well as the sampling uncertainty relating to both the simulation and the target (56). This calibrated model will then be used to quantify YLL in each subgroup of the population. The proportion of YLL attributable to alcohol by cause of death, SES, race/ethnicity, age, and sex over time will be calculated using lifetime abstention as the counterfactual scenario (57, 58). Similarly, the proportion of inequalities in YLL rates concerning SES and race/ethnicity that can be explained by alcohol exposure over time will be estimated using the same microsimulation.
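The text does not give the exact form of the implausibility metric, so the following is only a sketch of one common form from the history-matching literature: the gap between a simulated output and its target, standardized by the combined sampling uncertainty of both, with the maximum taken over outputs. The function names, the use of standard errors as the uncertainty inputs, and the cutoff of 3 are all assumptions for illustration.

```python
import math


def implausibility(simulated, target, sim_se, target_se):
    """Largest standardized gap between simulated and target outputs.

    `sim_se` and `target_se` are standard errors reflecting the sampling
    uncertainty of the simulation and of the target data, respectively.
    """
    return max(
        abs(s - z) / math.sqrt(vs ** 2 + vz ** 2)
        for s, z, vs, vz in zip(simulated, target, sim_se, target_se)
    )


def keep_plausible(param_draws, simulate, target, sim_se, target_se,
                   cutoff=3.0):
    """Rejection step: keep prior draws whose outputs are not implausible.

    `simulate` is a hypothetical callable mapping one parameter draw to a
    list of model outputs comparable to `target`.
    """
    return [
        draw for draw in param_draws
        if implausibility(simulate(draw), target, sim_se, target_se) < cutoff
    ]
```

Draws retained by such a step approximate a sample from the posterior, in the spirit of approximate Bayesian computation; the cutoff controls how closely the simulation must reproduce the target data.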
Objective 2.
The microsimulation model will be expanded to the state level to generate projections regarding the impact of alcohol control interventions on cause-specific, age-standardized YLL rates, by SES, race/ethnicity, age, and sex. The intervention impacts will be modeled for a 10-year intervention planning horizon (2019–2028) for 15 selected states. The status quo (i.e., alcohol consumption under current alcohol policies) will be used as the baseline scenario to contrast the impact of reduced alcohol use under different counterfactual intervention scenarios. The intervention model is based on an existing conceptual framework for examining the impact of alcohol use on population health and disparities (29). As such, we will investigate 3 upstream interventions: alcohol taxation (with and without inflation correction) (49), minimum unit pricing, and regulation of the availability of alcohol concerning Sunday closures, as well as screening and brief interventions as a downstream intervention.
Forecasting component.
To model YLL for a 10-year intervention planning horizon, the synthetic population will be projected through 2028. Individual-level receipt of screening and brief interventions will be added to the synthetic population using BRFSS data (41, 53). Projections before intervention will be calibrated to existing population and mortality projections (59–61).
Intervention component.
The estimated impacts of all relevant alcohol control interventions will be combined with the posterior estimates for transition probabilities between drinking patterns. Because the effects of alcohol policies will be beverage-specific, their impact on the transition probabilities is mediated by beverage preferences. We will follow the approach used in the Sheffield Alcohol Policy Model to account for beverage preferences (28). This will enable a forward estimation of the impact of alcohol control interventions on alcohol use and alcohol-attributable mortality under different counterfactual intervention scenarios as compared with the baseline scenario (status quo). Machine learning methods (62) will be used to identify the most parsimonious intervention scenarios leading to 1) the highest decreases in alcohol-attributable mortality overall; 2) the highest reduction of inequality in mortality; and 3) a reversal of the declining trends in life expectancy.

Figure 2. Prevalence of alcohol consumption in 4 categories among men (top row) and women (bottom row) in the baseline synthetic population and in Behavioral Risk Factor Surveillance System (BRFSS) data, by educational attainment in 2000, Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project. Category 1: ≤20/≤40 (women/men) g/day; category 2: 21–40/41–60 (women/men) g/day; category 3: 41–60/61–100 (women/men) g/day; category 4: >60/>100 (women/men) g/day.
PRELIMINARY FINDINGS
Preliminary analyses on the demographic foundation of the microsimulation model, including aging, transitioning between levels of SES (educational attainment), migration, and cause-specific mortality, were performed on the national level. The individual-level characteristics implemented in the synthetic population include drinking patterns, educational attainment, race/ethnicity, age, and sex. The synthetic population of 1,000,000 individuals was generated using iterative proportional fitting to integrate data from the BRFSS, US Census, ACS, and PSID and represented approximately 0.6% of adults in the United States in 2000.
The synthetic population was projected forward in time by adding new synthetic individuals each year and removing individuals based on Monte Carlo sampling using estimated net outward migration and mortality rates. Transition probabilities were used to simulate educational attainment by sex and race/ethnicity. These transition probabilities were estimated using a continuous-time, multistate Markov model based on data from the PSID (Dr. Charlotte Buckley, University of Sheffield (Sheffield, United Kingdom), unpublished manuscript, 2023). Overall, the microsimulation model accounted for population developments between 2000 and 2018, including births, migration, and deaths in each subgroup of the population (63). The preliminary findings shown here do not include changes in drinking patterns, and they report mortality rates rather than YLL; both will be added in later iterations of the model. For the first microsimulation model shown here, uncertainty in the transition probabilities between levels of educational attainment was considered to demonstrate uncertainty estimates in the resulting population trends over time (see Web Appendix 2 for details). Population estimates based on ACS, US Census, and PSID data were used as target data for the population comparison. Observed mortality rates calculated from the National Vital Statistics System and March CPS data were used as target data for mortality. While we have not yet performed a full model calibration within a Bayesian probabilistic framework, all simulation runs shown correspond to prior beliefs about the model parameters, all of which were informed by empirical data. The root mean squared error (RMSE) was calculated to summarize differences between modeled and observed target data. Given the space limitations, our preliminary findings are focused on SES; results for race/ethnicity are shown in Web Tables 2–4 and Web Figures 1 and 2.
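For reference, the RMSE used to summarize model fit is the square root of the mean squared difference between paired modeled and observed values. A minimal sketch follows; the function name is illustrative, and how the project pairs values across years and subgroups is not specified here.

```python
import math


def rmse(modeled, observed):
    """Root mean squared error between paired modeled and observed values."""
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - o) ** 2 for m, o in zip(modeled, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

The result is in the same units as the inputs, which is why the fit statistics are reported in percentage points for population shares and in deaths per 100,000 population for mortality rates.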
Figure 2 shows the prevalence of the 4 alcohol consumption categories in the synthetic baseline population and based on BRFSS data by sex and educational attainment. Overall, the prevalence of any current alcohol use matched the BRFSS target data well, with a higher prevalence in men and women with higher levels of educational attainment. Among men, the prevalence of category 3 or category 4 drinking was lower among those with higher educational attainment; the highest prevalence of category 3 or category 4 drinking (combined) in the synthetic population was about 6% among men with a high school diploma or less. Among women, the prevalence of category 3 or category 4 drinking was approximately 2% for women with a high school diploma or less, as well as women with a college degree or more, and approximately 3% for women with some college education.
Figure 3 depicts the microsimulation of the synthetic adult US population from 2000 to 2018. Specifically, the proportion of the population in each subgroup defined by educational attainment and sex is shown over time in comparison with observed data (US Census, ACS, and PSID). Some survey observations lie outside of our uncertainty intervals (UIs) because of differences in category definitions (see Web Appendix 1 for details). Overall, the proportion of individuals in each SES category in the microsimulation showed a good fit to the proportion of individuals in each education category by sex in the different data sources—namely, the US Census (RMSE = 1.7%), the ACS (RMSE = 7.1%), and the PSID (RMSE = 5.6%).

Figure 3. Distributions of men (top row) and women (bottom row) by educational level as an indicator of socioeconomic status over time, modeled via microsimulation (2000–2018), as compared with the US Census (2000, 2010), the American Community Survey (ACS; annual data from 2000–2018), and the Panel Study of Income Dynamics (PSID; biannual data from 1999–2017), Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project. Gray shaded areas with dashed lines indicate 95% uncertainty intervals.
According to the microsimulation model, the proportion of men with a high school diploma or less decreased from 47.3% in 2000 (US Census data) to 37.7% in 2018 (modeled data; 95% UI: 34.2, 40.0). The decreases were even stronger among women, declining from 47.0% (US Census data) in 2000 to 32.8% in 2018 (modeled data; 95% UI: 30.2, 35.2). The proportion of men with a college degree or more increased from 25.1% in 2000 (US Census data) to 30.2% in 2018 (modeled data; 95% UI: 28.0, 34.3). Over the same period, the proportion of women with a college degree or more caught up with the proportion among men, increasing from 21.9% in 2000 (US Census data) to 31.6% in 2018 (modeled data; 95% UI: 27.9, 36.4). The microsimulation also showed a good fit with educational attainment data split by race/ethnicity and sex (Web Table 2, Web Figure 1).
Figure 4 shows results for cause-specific mortality rates modeled by the microsimulation and compared with observed data, by sex and educational attainment. Mortality rates for all 9 cause-of-death categories in the microsimulation model were a good fit to the observed data when split by education category (Web Table 5). For individuals with a high school diploma or less, the RMSE between modeled and observed mortality rates ranged by cause from 0.6 deaths per 100,000 population (alcohol use disorders and ischemic stroke) to 3.8 deaths per 100,000 population (ischemic heart disease). Model fit for individuals with some college education was comparable for some categories, with an RMSE of 0.7 deaths per 100,000 population (alcohol use disorders), and worse for others, with an RMSE of 15.5 deaths per 100,000 population (ischemic heart disease). For individuals with a college degree or more, all causes of death were well represented by the microsimulation, and the RMSE ranged from 0.4 deaths per 100,000 population (alcohol use disorders) to 2.9 deaths per 100,000 population (ischemic heart disease).


Figure 4. Age-standardized mortality rates per 100,000 population for 9 cause-of-death categories between 2000 and 2018 as observed (target data; dotted line) and as modeled by microsimulation (solid line), by sex (black, men; gray, women) and educational level (an indicator of socioeconomic status), Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project. Results were age-standardized to the US population in 2018. Panels A–C, alcohol use disorders; panels D–F, hypertensive heart disease; panels G–I, stroke; panels J–L, liver cirrhosis; panels M–O, suicide; panels P–R, other unintentional injury; panels S–U, motor vehicle accidents; panels V–X, diabetes; panels Y–ZB, ischemic heart disease. “Stroke” represents ischemic stroke; “liver cirrhosis” includes liver disease and cirrhosis.
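The mortality rates in this figure were age-standardized to the US population in 2018. The sketch below shows direct standardization, which is assumed here because the exact procedure is not described in this section. The age groups, rates, and population counts are hypothetical.

```python
def age_standardized_rate(age_specific_rates, standard_population):
    """Weight each age-specific rate by that age group's share of the standard population."""
    total = sum(standard_population.values())
    return sum(
        rate * standard_population[age] / total
        for age, rate in age_specific_rates.items()
    )


# Hypothetical age-specific mortality rates per 100,000 population.
rates = {"18-34": 10.0, "35-64": 50.0, "65+": 200.0}
# Hypothetical standard population counts (stand-in for the 2018 US population).
standard = {"18-34": 300, "35-64": 500, "65+": 200}

asr = age_standardized_rate(rates, standard)  # 10*0.3 + 50*0.5 + 200*0.2 = 68.0
```

Applying the same standard population to every year and subgroup removes differences in age structure from the comparison, so that the remaining differences reflect mortality itself.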
The microsimulation showed (in accordance with observed data) that mortality rates for causes of death closely related to alcohol use increased between 2000 and 2018 for both sexes, with overall higher increases among individuals with only a high school diploma or less (Figure 4). This included alcohol use disorders, liver disease and cirrhosis, suicide, and other unintentional injuries but not motor vehicle accidents. The most notable declines in mortality rates were observed for ischemic heart disease. The latter was also the cause of death with the largest absolute inequalities between education groups, with a difference of 148 (men) and 65 (women) deaths per 100,000 population for individuals with a high school diploma or less compared with those with a college degree or more in 2000. This absolute rate difference declined to 116 (men) and 46 (women) deaths per 100,000 population in 2018. In relative terms, the inequalities between individuals with a high school diploma or less compared with individuals with a college degree or more increased universally for all 9 causes of death, with some variation across the years. The largest relative inequalities among men were observed for motor vehicle accidents, which increased from 2.9-fold higher rates among men with a high school diploma or less in 2000 to 5.2-fold higher rates in 2018. Among women, the relative inequalities were largest for diabetes mellitus, starting with a 2.7-fold higher rate among women with a high school diploma or less (versus women with a college degree or more) and reaching 3.5-fold higher mortality rates in 2018.
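The contrast between absolute and relative inequalities can be made concrete with a short sketch. The rates are hypothetical and were chosen only to show that a rate difference can shrink while the rate ratio grows when mortality falls in both groups; they are not SIMAH results.

```python
def rate_difference(rate_low_edu, rate_high_edu):
    """Absolute inequality: excess deaths per 100,000 in the lower-education group."""
    return rate_low_edu - rate_high_edu


def rate_ratio(rate_low_edu, rate_high_edu):
    """Relative inequality: how many times higher the lower-education rate is."""
    if rate_high_edu <= 0:
        raise ValueError("reference rate must be positive")
    return rate_low_edu / rate_high_edu


# Hypothetical age-standardized rates per 100,000 population.
early = {"low_edu": 250.0, "high_edu": 100.0}
late = {"low_edu": 180.0, "high_edu": 60.0}

diff_early = rate_difference(early["low_edu"], early["high_edu"])  # 150.0
diff_late = rate_difference(late["low_edu"], late["high_edu"])     # 120.0
ratio_early = rate_ratio(early["low_edu"], early["high_edu"])      # 2.5
ratio_late = rate_ratio(late["low_edu"], late["high_edu"])         # 3.0
```

In this example the absolute gap narrows from 150 to 120 deaths per 100,000 while the relative gap widens from 2.5-fold to 3.0-fold, the same qualitative pattern described for ischemic heart disease.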
Overall, the microsimulation was also a good fit to mortality rates split by race/ethnicity (Web Table 3, Web Figure 2) and to mortality rates split by race/ethnicity and education category (Web Table 4). The model fit was best for non-Hispanic White individuals and worst for persons of non-Hispanic other race/ethnicity (Web Table 3).
DISCUSSION
Premature mortality in the United States has recently been increasing among specific sociodemographic subgroups, especially for causes of death closely related to alcohol use. The SIMAH Project will use a rigorous approach applying innovative microsimulation methodology to investigate trends in alcohol-attributable mortality for major causes of death by SES, race/ethnicity, age, and sex concurrently (Figure 5). This approach represents an advance over approaches commonly used, such as traditional burden-of-disease analyses, based on comparative risk assessments (64). The latter do not account for differences in exposure by factors such as SES, which can affect exposure, risk, or baseline mortality rates (65). Furthermore, the SIMAH microsimulation approach will allow for dynamic modeling of diverse intervention scenarios to inform public health policy decisions.
The preliminary results indicate that the crucial microsimulation component provides a good fit to observed demographic changes in the population, including changes in cause-specific mortality by sex, educational attainment, and race/ethnicity, providing a robust baseline model for further simulation work. The microsimulation provided a generally good fit to mortality rates for non-Hispanic White and Black individuals but a comparatively poorer fit for the mixed race/ethnicity category of “non-Hispanic other.” This is because this group represents a smaller proportion of the population: the simulation relies on random number sampling, so estimates based on smaller numbers of simulated individuals are less accurate. Additionally, the model showed a comparatively poorer fit for the “some college” educational category for some causes of death (i.e., ischemic heart disease). Such inaccuracies will be reduced in future modeling by running the simulations with a larger sample of individuals to improve model estimations.
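The link between subgroup size and simulation accuracy can be illustrated with a simple approximation. The sketch assumes that each simulated individual dies independently with the same probability, so that the simulated death count is binomial; this is an assumption made for illustration, and the probability and sample sizes are hypothetical.

```python
import math


def relative_sampling_error(death_prob, n_individuals):
    """Standard error of a simulated mortality rate, relative to the rate itself."""
    # Binomial standard error of the estimated proportion.
    standard_error = math.sqrt(death_prob * (1 - death_prob) / n_individuals)
    return standard_error / death_prob


death_prob = 0.001  # hypothetical: 100 deaths per 100,000 population per year

small_group = relative_sampling_error(death_prob, 10_000)
large_group = relative_sampling_error(death_prob, 1_000_000)
error_ratio = small_group / large_group  # sqrt(1_000_000 / 10_000) = 10
```

Under this approximation the relative error scales with the inverse square root of the number of simulated individuals, which is why a larger synthetic sample is expected to improve the fit for small subgroups.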
Because a model can only be as good as its input data, some limitations have to be acknowledged. First, the representativeness and validity of survey data are limited by the sampling frame, which may exclude portions of the population, such as homeless individuals or persons living in institutions; by low and declining response rates (66); and by misreporting or underreporting (e.g., of alcohol use) (67, 68). Second, a key limitation of the mortality data is that education and race/ethnicity are recorded by a funeral director who consults next of kin, leading to inaccuracies in assessment and systematic differences as compared with self-reported data (46, 47, 69). This can lead to “dual data” bias due to a mismatch between population and mortality data, affecting, for example, the accuracy of the modeled inequalities (70, 71). Despite these limitations, the preliminary findings demonstrate the feasibility of this novel approach. With this, the current public health policy modeling paradigm of static, one-factor-at-a-time analyses can be supplemented by the new simulation approach proposed by SIMAH that offers superior integration of relevant empirical evidence.
In the next step, we will model transition probabilities between drinking patterns and the relationship between alcohol use and the specific causes of death (objective 1). In the later phases of the project, scenarios of several alcohol-control interventions will be investigated to evaluate their ability to reverse current decreases in life expectancy with a 10-year forward modeling time horizon (objective 2). Thus, the microsimulation can offer new perspectives on much-debated US public health policies (such as alcohol taxation), in addition to appraising 2 novel interventions for reducing inequalities in alcohol-related mortality: minimum unit pricing and a primary-care program of screening and brief interventions for harmful alcohol use (22). However, one key challenge will be to model the impacts of the coronavirus disease 2019 pandemic, which has affected nearly all aspects relevant to the model, including alcohol consumption and socioeconomic health inequalities (72, 73).
The final microsimulation model will cast light on those subgroups of the population that experienced the highest increases in (alcohol-attributable) mortality but have been neglected by the broad-brush modeling approaches currently available (25). The use of Bayesian methods to propagate uncertainty in the evidence base through to simulation outputs will provide knowledge users with a robust perspective on the likely direction and magnitude of intervention impacts over time. Instead of providing results in the form of a single point estimate, the simulation model has the potential to flexibly analyze intervention scenarios upon the request of stakeholders. To that end, the microsimulation is an open-source platform that can be expanded on and used by other researchers to explore the impacts of other exposure variables on alcohol-related mortality and major causes of death. As a result, the project will facilitate fine-tuned knowledge translation that can be tailored to the needs of public health authorities at the state level.

Figure 5. Lessons learned from the Simulation of Alcohol Control Policies for Health Equity (SIMAH) Project.
ACKNOWLEDGMENTS
Author affiliations: Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada (Charlotte Probst, Aurélie M. Lasserre, Klajdi Puka, Jürgen Rehm); Heidelberg Institute for Global Health, Heidelberg University, Heidelberg, Germany (Charlotte Probst); Department of Psychiatry, Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada (Charlotte Probst, Jürgen Rehm); Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, Ontario, Canada (Charlotte Probst, Jürgen Rehm); Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, United Kingdom (Charlotte Buckley, Robin C. Purshouse); Alcohol Research Group, Public Health Institute, Emeryville, California, United States (William C. Kerr, Nina Mulia, Yu Ye); Institute of Clinical Psychology and Psychotherapy, Technische Universität Dresden, Dresden, Germany (Jürgen Rehm); Epidemiology Division, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada (Jürgen Rehm); Center for Interdisciplinary Addiction Research, Department of Psychiatry and Psychotherapy, University Medical Center Hamburg-Eppendorf, Hamburg, Germany (Jürgen Rehm); and Department of International Health Projects, Institute for Leadership and Health Management, I.M. Sechenov First Moscow State Medical University, Moscow, Russian Federation (Jürgen Rehm).
This research was supported by the National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, under award R01AA028009.
This study used data from several sources, most of which are publicly available. Data from the US Census, the American Community Survey, and the Current Population Survey are available from the website of the US Census Bureau (https://www.census.gov/data.html). Data from the Panel Study of Income Dynamics can be accessed from the University of Michigan (https://psidonline.isr.umich.edu/). Multiple Cause of Death Files can be obtained from the National Center for Health Statistics, Centers for Disease Control and Prevention (https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm#Mortality_Multiple). Data from the Behavioral Risk Factor Surveillance System are also accessible through the Centers for Disease Control and Prevention (https://www.cdc.gov/brfss/data_documentation/index.htm). Data from the National Epidemiologic Survey on Alcohol and Related Conditions are published by the National Institutes of Health and can be accessed through https://catalog.data.gov/.
We thank the SIMAH team for their input into wider discussions involved in generating this research.
This study protocol was presented at the fourth European Conference on Addictive Behaviours and Dependencies (Lisbon Addictions 2022), Lisbon, Portugal, November 23–25, 2022.
The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Conflict of interest: none declared.