Abstract

This paper describes how the risk of experiencing heart attacks varies across gender and ethnicity in New Zealand. We estimate dynamic hazard models using administrative data. We deal with left-censored data using recently developed maximum simulated likelihood methods. The models allow risk to vary with age, previous heart attack history and unobserved individual heterogeneity. We find that the risk of subsequent events is far higher than the risk of the first event, particularly high within 1 year after an event, and that unobserved heterogeneity is important. Generally, male Maoris have the highest risk, followed by female Maoris, then ethnically European males, while ethnically European females have the lowest risk.

1 INTRODUCTION

Acute myocardial infarction (AMI), or heart attacks, are an important public health issue. It has been estimated that the cost of a person’s first AMI-related hospitalization is about 4500 USD for 1999–2001 in New Zealand (Kauf et al., 2006), which is more than twice as large as the total health expenditure per capita of about 1,700 USD in 2000/2001 (Ministry of Health, 2010). AMI is the most important subclass of the ischaemic heart diseases, followed by angina (Morrow, 2017). In New Zealand, about 110 million USD (228 million NZD) was spent on treating ischaemic heart diseases in hospitals in 2002/2003 (National Health Committee, 2013). Ischaemic heart diseases are a leading cause of death in all of the developed world (Naghavi et al., 2015). Using hospital admission data, death registrations and census data, we estimate that about 10% of all deaths in New Zealand during 2002–2012 were directly caused by AMI.

This paper describes and quantifies how the risk of experiencing AMI events varies across gender and ethnicity in New Zealand. We consider four groups, namely male and female people of Maori and European descent (referred to as ‘Maoris’ and ‘Europeans’ in the remainder of the paper). Our analysis is challenged by a high degree of missing data, which we handle by using maximum simulated likelihood (MSL) estimation. The data set is constructed by combining nationwide administrative data on hospital admissions and death registrations during 2002–2012 with census data from 2001 such that our analysis data have the same population by age as the 2001 census. We consider several aspects of risk. Using event history (hazard) models, we investigate how risk is related to time (i.e. age and cohort), previous AMI history and unobserved individual heterogeneity (‘random effects’ in the econometrics literature, ‘frailty’ in the statistics literature). Furthermore, we consider three regimes of risk: the risk of experiencing the first AMI, the risk of a subsequent AMI within 1 year following a previous attack, and the risk of a subsequent AMI >1 year after a previous attack. We discuss time-specific risk and, taking a synthetic cohort approach, we compute lifetime cumulative outcomes until age 85.

In general, the estimation of dynamic event history models is hampered by problems of missing data. Often the data available concern the events that happened for a given population during a given period, and there is no information about events that happened before or after the observation period. These problems are called left censoring and right censoring respectively. In the present study, left censoring means that we do not know which regime an individual is in during the first part of their observation period, because we do not know whether the individual experienced any events prior to their observation period, and if they did, whether that event is within 1 year of becoming under observation. In general, dealing with right censoring is relatively straightforward, but left censoring is a very difficult problem. Most authors simply drop all left-censored histories from the analysis (e.g. Bhuller et al., 2017; Cockx & Picchio, 2013; Doiron & Gørgens, 2008), which can lead to large data and efficiency losses. Here we overcome the left-censoring problem by estimating the models using recently developed MSL methods (Lee & Gørgens, 2021). The idea is that the same model that is used to explain the observed patterns in the data can be used to simulate events that are unobserved. Intuitively, the missing data are integrated out of the so-called complete-data likelihood function by simulating pseudo-histories for each person whose history is left censored. In principle, the simulation technique allows us to compute an arbitrarily good approximation to the exact likelihood function for the data that are observed. In practice, available computing resources and time restrict the feasible accuracy.

Regarding right censoring, for our basic estimates we assume that the individual observation periods and the events are independent. In most cases, the observation period ends when the study period ends on 30 June 2012; however, some people are right censored when they die from an AMI event. We provide estimates from extended models to show that the influence of AMI-related mortality is negligible. Note that since we observe the cause of death, AMI events are not systematically underreported in our data.

The empirical results can be briefly summarized as follows. Our main finding is that there are large gender and ethnic disparities in AMI risk. The general ranking is that male Maoris tend to have the highest risk, followed by female Maoris, then male Europeans, and finally female Europeans have the lowest risk. This is true both for time-specific risk and from the perspective of synthetic lifetime cumulative risk. Not surprisingly, the AMI risk increases strongly with time. Regarding history dependence, in terms of the three regimes the risk is lowest for the first AMI event, highest for events within 1 year after an event, while still high for events >1 year after an event. Finally, it is notoriously difficult to obtain reliable estimates of the influence of unobserved heterogeneity, but our results suggest that within-group variation in risk is far greater than between-group differences. Unfortunately, our data do not allow us to explore which biological, socioeconomic, behavioural or other factors might explain these disparities across gender and ethnicity.

The rest of the paper is organized as follows. Section 2 provides a review of related literature. In Section 3, we describe our data. In Section 4, we discuss the model specification and the estimation methods. In Section 5, we present and discuss the estimation results. Section 6 concludes.

2 RELATED LITERATURE

The literature on gender and ethnic differences in the incidence and reoccurrence of AMIs is relatively small. Wang et al. (2012) compare AMI incidence rates in the United States and discover that male whites have the highest risk, followed by male blacks, then female blacks, and female whites have the lowest risk. Smolina et al. (2012) find a gender gap for first AMIs but no gap for subsequent AMIs in the United Kingdom. For New Zealand, Chan, Wright, Tobias et al. (2008) also distinguish between first and subsequent AMIs and find that the rate of AMI readmissions per 100,000 population increases during the 1990s; however, they do not consider gender and ethnic differences. The policy implications of high incidence rates for first and subsequent AMI are different: the former supports primary prevention and the latter supports secondary prevention (Avendano & Soerjomataram, 2008). The distinction between the first and subsequent AMIs is becoming more important as more people survive AMIs and are at risk of having subsequent AMIs. Chan, Wright, Riddell, et al. (2008) study differences in the prevalence of cardiovascular diseases (including AMI) across gender and ethnic groups in New Zealand, but do not distinguish between first and subsequent events.

The literature on gender and ethnic differences in mortality during the first 30 days or 1 year after an AMI event is larger than the literature on AMI incidence itself. Gender disparities in mortality have been studied for many countries using different kinds of data sets; for example, nationwide data on Finland (Kytö et al., 2015), Israel (Gottlieb et al., 2000), Scotland (MacIntyre et al., 2001), and England (Smolina et al., 2012) city-level data in Germany (Herman et al., 1997) and hospital-level data in Vietnam (Nguyen et al., 2014). Ethnic disparities in mortality have been studied for pertinent countries; for example, disparities between blacks and whites for the United States (Vaccarino et al., 2005), between people of European, Chinese and South Asian descent for Canada (Anand et al., 2000), and between Chinese, Malay and Indians for Singapore (Mak et al., 2003).

While we only consider mortality incidentally in this paper, it is interesting to note that the patterns in AMI incidence and AMI case fatality are not necessarily the same across gender, ethnicity and age. For example, Alderman et al. (2000) find that young black males in the United States have lower risk of AMI events than do young white males, but higher 30-day case fatality rates. Comparing US studies that focus on AMI incidence (e.g. Wang et al., 2012) with those on case fatality (e.g. Manhapra et al., 2004) reveals further instances where relatively low/high AMI incidence rates are associated with opposite high/low case fatality rates.

The statistical methods used in this literature include mean comparison, logistic regression, Kaplan–Meier estimation and Cox regression. These methods are appropriate for summarizing outcomes when there are no issues of missing data other than right censoring. For this reason, the literature has generally focused on outcomes that are fully observed during a relatively short period, say 30 days or 1 year, following an AMI event (e.g. Chang et al., 2006; Pokorney et al., 2012). A few small-scale follow-up studies have been able to track patients for several decades (e.g. Klein et al., 1992). Sometimes models of AMI risk allow for unobserved heterogeneity among families and hospitals. We have found only one study that has considered individual-level unobserved heterogeneity (Hougaard, 1986).

Our study contributes to the literature in several dimensions. We examine gender and ethnic disparities in AMI risk utilizing an event history approach. We analyse high-quality nationally representative data from New Zealand. We show that hazard models, in general, can be estimated using the MSL method, despite overwhelming left censoring. We use the estimated models to discuss and compare different aspects of risk including time dependence, dynamic effects of the AMI history and the role of unobserved individual heterogeneity. We also use the estimated models to examine synthetic lifetime cumulative outcomes.

3 DATA

Our primary data source is the National Minimum Data Set (NMDS) maintained by the Ministry of Health New Zealand. The NMDS includes administrative data on all hospital admissions in New Zealand (including both in-patients and day-patients). There are up to 20 diagnosis entries (for recording complications) for each admission. The original NMDS was created in 1993. The current format with 20 diagnosis entries was introduced in June 2002 (National Health Board Business Unit, 2011). Our study period is between 1 July 2002 and 30 June 2012. We merge the NMDS with death registrations. The latter include a (single) entry for the cause of death. These administrative data are well suited for studying the incidence of AMI events, because everyone who experiences an event usually present at an emergency department and are admitted to the hospital, or they die on the way to the hospital and so appear in the death data.

About 60% of New Zealanders appear in the administrative data during the study period. For example, there are 2.7 million people alive in the data in 2012 compared to 4.4 million estimated total population in 2012. We use the age distribution (by gender and ethnic group) in the census conducted on 6 March 2001 to add records for people with no hospital admission during the study period. As a result, the combined data set has the same age distribution in 2002 as the 2001 census. Migration in and out of New Zealand is substantial, but we expect that this is less of an issue for our analysis. Temporary emigration is common especially among young people, but they are less likely to suffer AMIs.

The combined data set includes date of birth (rounded to the 15th of the month for confidentiality), gender, ethnicity, date of hospital admission and diagnosis codes indicating AMI if admitted, and date of death and cause of death code if died. Gender is the biological sex reported. Ethnicity is self-identified. Date of admission is when patients are first seen by clinicians at a hospital. We use time of hospital admission or time of death as the timing of AMI events.

To define the study population, we restrict the combined data set as follows. First, we consider only ethnically European and Maori people. In the 2001 census, people of European and Maori descent constitute 83% and 14%. For the estimation sample, we further restrict the data to people over the age of 40 and under the age of 85. Estimating the risk of having the first AMI event requires us to specify an age when, according to the model, a person first becomes at risk. The age first at risk (AFAR) must be chosen such that relatively few events occur before this age (since they are ignored in the modelling) and such that relatively many events occur during the study period for the people who reach this age (since their experiences identify model parameters related to the first event). We set AFAR to 40 years in our main estimates, and consider AFAR at 35 years in the supplementary materials. There is no detailed information about the age distribution for those over 85 in the 2001 census and the sample sizes are small. It is therefore difficult to estimate risk for people over age 85.

Figure 1 illustrates general aspects of the data using a Lexis diagram with age on the horizontal and calendar time on the vertical axis. Each 45 degree line represents the life of a person. Symbols *, and indicate birth, death and right censoring. Dots represent AMI events. The data provide information about all life lived and all events between 1 July 2002 and 30 June 2012. The estimation sample considers the subset of life and events between ages 40 and 85. In the figure, the dashed part of the life lines represents time outside the estimation sample, while the solid part represents the time considered in the estimation. All persons born after June 1972 are excluded. Persons born between July 1962 and June 1972 contribute the months after they turn 40. For example, a person born in June 1972 contributes only the month of June 2012. Persons born between June 1962 and July 1927 contribute the full 10 years between July 2002 and June 2012, unless they die during this period. Persons who turn 85 years of age contribute the time between July 2002 and the month of their 85th birthday. Events that occur before 1 July 2002 are missing because of left censoring (e.g. d), events that occur after 30 June 2012 or after age 85 are missing because of right censoring (e.g. c, f) and events that occur before age 40 are ignored (e.g. a).

Lexis diagram with four individuals [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 1

Lexis diagram with four individuals [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Table 1 shows summary statistics for our study population and the estimation sample. By construction, the study population is essentially representative of the New Zealand population of Maoris and Europeans as of the census date, 6 March 2001. The first panel in Table 1 shows the study population by age on 1 July 2002. (The first row shows people born between 1 July 2002 and 30 June 2012 who were admitted to a hospital.) The cohorts aged 30–39 on 1 July 2002 are important, because after they turn 40 they provide the main identification of the risk of having a first event. This group is relatively large in all four subpopulations. Older cohorts are relatively smaller. Comparing the number of people aged 30–39 and 30–85 years reveals that 62–76% of the AMI histories in our estimation samples are left censored, so the problem is substantial.

TABLE 1

Summary statistics for the study population

Male MaorisMale EuropeansFemale MaorisFemale Europeans
N%N%N%N%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a71,26021.4170,45810.866,20519.6161,9379.9
Age 0–29164,36549.3565,37036.0162,93648.2554,70833.8
Age 30–3936,14110.8198,76412.641,63112.3219,72313.4
Age 40–4929,0548.7206,99613.232,2629.5217,12413.2
Age 50–5917,4365.2175,50611.218,2655.4178,76610.9
Age 60–6910,0603.0121,4477.710,7233.2125,9547.7
Age 70–793,9771.291,7235.84,8301.4105,3266.4
Age 80+8900.342,2522.71,4760.478,2744.8
Total333,183100.01,572,516100.0338,328100.01,641,812100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None271,49699.9933,96499.9270,66199.9936,21699.9
12360.05830.01070.01400.0
2240.0350.030.0100.0
3+100.0100.010.020.0
Total271,766100.0934,592100.0270,772100.0936,368100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None91,34794.4772,32594.4104,42296.5859,07297.1
14,5184.738,3224.731362.921,7342.5
26520.755270.74970.53,2390.4
3+2070.217930.22010.211380.1
Total96,724100.0817,967100.0108,256100.0885,183100.0
Number of deaths between ages 40–85 by cause of death
AMIb1,42612.39,06211.27527.75,3008.4
Other10,17987.771,81888.89,01992.357,55191.6
Total11,605100.080,880100.09,771100.062,851100.0
Male MaorisMale EuropeansFemale MaorisFemale Europeans
N%N%N%N%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a71,26021.4170,45810.866,20519.6161,9379.9
Age 0–29164,36549.3565,37036.0162,93648.2554,70833.8
Age 30–3936,14110.8198,76412.641,63112.3219,72313.4
Age 40–4929,0548.7206,99613.232,2629.5217,12413.2
Age 50–5917,4365.2175,50611.218,2655.4178,76610.9
Age 60–6910,0603.0121,4477.710,7233.2125,9547.7
Age 70–793,9771.291,7235.84,8301.4105,3266.4
Age 80+8900.342,2522.71,4760.478,2744.8
Total333,183100.01,572,516100.0338,328100.01,641,812100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None271,49699.9933,96499.9270,66199.9936,21699.9
12360.05830.01070.01400.0
2240.0350.030.0100.0
3+100.0100.010.020.0
Total271,766100.0934,592100.0270,772100.0936,368100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None91,34794.4772,32594.4104,42296.5859,07297.1
14,5184.738,3224.731362.921,7342.5
26520.755270.74970.53,2390.4
3+2070.217930.22010.211380.1
Total96,724100.0817,967100.0108,256100.0885,183100.0
Number of deaths between ages 40–85 by cause of death
AMIb1,42612.39,06211.27527.75,3008.4
Other10,17987.771,81888.89,01992.357,55191.6
Total11,605100.080,880100.09,771100.062,851100.0
a

Excludes infants who do not survive 1 month.

b

Include persons whose official cause of death is acute myocardial infarction (AMI) and a few persons who died with AMI comorbidity on day of hospital admission.

TABLE 1

Summary statistics for the study population

Male MaorisMale EuropeansFemale MaorisFemale Europeans
N%N%N%N%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a71,26021.4170,45810.866,20519.6161,9379.9
Age 0–29164,36549.3565,37036.0162,93648.2554,70833.8
Age 30–3936,14110.8198,76412.641,63112.3219,72313.4
Age 40–4929,0548.7206,99613.232,2629.5217,12413.2
Age 50–5917,4365.2175,50611.218,2655.4178,76610.9
Age 60–6910,0603.0121,4477.710,7233.2125,9547.7
Age 70–793,9771.291,7235.84,8301.4105,3266.4
Age 80+8900.342,2522.71,4760.478,2744.8
Total333,183100.01,572,516100.0338,328100.01,641,812100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None271,49699.9933,96499.9270,66199.9936,21699.9
12360.05830.01070.01400.0
2240.0350.030.0100.0
3+100.0100.010.020.0
Total271,766100.0934,592100.0270,772100.0936,368100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None91,34794.4772,32594.4104,42296.5859,07297.1
14,5184.738,3224.731362.921,7342.5
26520.755270.74970.53,2390.4
3+2070.217930.22010.211380.1
Total96,724100.0817,967100.0108,256100.0885,183100.0
Number of deaths between ages 40–85 by cause of death
AMIb1,42612.39,06211.27527.75,3008.4
Other10,17987.771,81888.89,01992.357,55191.6
Total11,605100.080,880100.09,771100.062,851100.0
Male MaorisMale EuropeansFemale MaorisFemale Europeans
N%N%N%N%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a71,26021.4170,45810.866,20519.6161,9379.9
Age 0–29164,36549.3565,37036.0162,93648.2554,70833.8
Age 30–3936,14110.8198,76412.641,63112.3219,72313.4
Age 40–4929,0548.7206,99613.232,2629.5217,12413.2
Age 50–5917,4365.2175,50611.218,2655.4178,76610.9
Age 60–6910,0603.0121,4477.710,7233.2125,9547.7
Age 70–793,9771.291,7235.84,8301.4105,3266.4
Age 80+8900.342,2522.71,4760.478,2744.8
Total333,183100.01,572,516100.0338,328100.01,641,812100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None271,49699.9933,96499.9270,66199.9936,21699.9
12360.05830.01070.01400.0
2240.0350.030.0100.0
3+100.0100.010.020.0
Total271,766100.0934,592100.0270,772100.0936,368100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None91,34794.4772,32594.4104,42296.5859,07297.1
14,5184.738,3224.731362.921,7342.5
26520.755270.74970.53,2390.4
3+2070.217930.22010.211380.1
Total96,724100.0817,967100.0108,256100.0885,183100.0
Number of deaths between ages 40–85 by cause of death
AMIb1,42612.39,06211.27527.75,3008.4
Other10,17987.771,81888.89,01992.357,55191.6
Total11,605100.080,880100.09,771100.062,851100.0
a

Excludes infants who do not survive 1 month.

b

Include persons whose official cause of death is acute myocardial infarction (AMI) and a few persons who died with AMI comorbidity on day of hospital admission.

The second panel in Table 1 shows the number of people aged 0–39 in June 2002 by their number of observed AMIs while aged 0–39 during the study period. As mentioned, since these life experiences are ignored in the estimation, it is important that there are relatively few AMI events before age 40. The third panel concerns our estimation sample. It shows the number of people aged 30–85 in June 2002 by their number of observed AMIs while aged 40–85 during the study period. (People born between July 1962 and June 1972 are included in both panels, provided they are alive at age 40.) Regrettably, AMI events are not rare for people over age 40.

Panel A in Figure 2 compares the age-specific AMI incidence rate across gender and ethnic groups. The incidence rate is computed as the ratio of the number of observed AMIs to the total time ‘at risk’ in 2-year age intervals, rescaled to rates per year. For ages under 80, male Maoris have the highest AMI incidence rates, female Maoris and male Europeans have similar intermediate rates, and female Europeans have the lowest rates. After age 80, the gender and ethnic disparities become less clear. Partly, this is because the number of people over 80 in the sample is small, so the estimates are noisy.

Incidence rates and cumulative hazard rates [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 2

Incidence rates and cumulative hazard rates [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Another look at age-specific risk is provided in panel B in Figure 2, which shows Nelson–Aalen estimates of the cumulative hazard rates. The cumulative hazard rate at a given age can be interpreted as the expected total number of events a representative person from the synthetic cohort has experienced by that age. The ranking of risk revealed in panel B is the same as in panel A, except the complications after age 80 are too small to upset the ranking of cumulative outcomes before age 80. We return to the synthetic cohorts in Section 5.2.

The last panel in Table 1 shows the number of people who died during the study period, about 6.8–11.9% of people in the estimation sample. About 7.7–12.3% of those deaths are caused by AMI. As mentioned, right censoring is less of an issue in this study, since we have data for most people until the end of the study period on 30 June 2012 and we observe the cause of death for those who die before that date.

4 MODEL AND ESTIMATION

4.1 Model specification and complete-data likelihood function

In this section, we discuss our model specification and estimation methods. A key risk factor is time. Time captures age effects on risk as bodies become older and the biology wears down, and time captures cohort effects on risk as different birth cohorts have been exposed to different environments, made different life style choices, and had access to different medical technologies. There may also be calendar time effects to the extent that, for example, the weather affects vulnerable people (e.g. Madrigano et al., 2013), but we expect them to be relatively less important. With data for only 10 calendar years, we do not attempt to estimate separate age and cohort effects. The data identify cohort effects by comparing people from different cohorts at the same age, which is only possible for cohorts up to 10 years apart. The oldest and the youngest birth cohorts in the estimation sample are 55 years apart. An attempt to separate age and cohort effects would therefore involve considerable extrapolation. Instead, we define analysis time as t = (age−40)/10 for age > 40 and interpret analysis time as representing age, cohort and calendar time effects combined. (Normalizing the time unit to a decade makes the scale of certain parameters more readable.)

It is well known that there are dynamic patterns in risk. Those having experienced an AMI event are more likely to experience subsequent events, and the risk is particularly high for some time immediately after that event. Therefore, we specify separate models for the first and subsequent AMI events, and the equation for subsequent AMIs is allowed to depend on the timing of the most recent event.

We estimate separate models for each gender and ethnic group, but for notational simplicity we suppress subscripts indicating the groups in the following. Each model consists of two equations. The first equation represents the hazard function, h1, of the first (actual as opposed to observed) AMI:

(1)

where θ denotes the entire unknown parameter vector to be estimated. Parameter α1 captures time dependence in risk, and parameter μ1 captures the overall level of risk for first AMIs. The second equation represents the hazard function, h2, of subsequent AMIs:

(2)

where t- is the timing of the most recent AMI and the variable Recent is defined by Recent=1(tt-+τ) where the value of τ corresponds to 1 year (i.e. τ = 0.1). Parameter α2 captures time dependence in risk, parameter γ indicates the dynamic effect of the most recent AMI, and parameter μ2 embodies the overall level of risk of subsequent AMIs.

The Gompertz specifications embodied in Equation (1) and (2) assume that the hazard function progresses exponentially with time. The law of exponential progression is suitable for many common patterns in actuarial, biological and demographic applications (e.g. Wienke, 2010). It is also appropriate in the context of AMI risk at least until about age 85, as shown in Figure 2. We expect positive signs of α1 and α2 given that AMI risk is higher for older people. (Note that analysis time is not reset after an AMI event.)

We capture history dependence partly by distinguishing between h1 and h2 and partly by including the time-varying covariate Recent. The latter allows for elevated risk proportional to exp (γ) within 1 year following the most recent event. The cut-off between the two regimes for the risk of subsequent events is somewhat arbitrary but follows the literature (e.g. Vlaar et al., 2008). There is no sound basis for assuming an abrupt change in risk after 1 year, but this specification allows us to distinguish short-term and long-term risks in a simple way.

Since we estimate separate models for each group, effectively all parameters are interacted with gender and ethnicity. In particular, gender and ethnicity are not assumed to have a simple proportional effect on risk.

As mentioned, the main problem for estimating the models is left censoring. For people whose histories are left censored, we cannot tell whether the first observed AMI is the first experienced AMI or a subsequent AMI, nor do we know the value of Recent for the first 1 year of the observation period.

Left censoring and history dependence mean that the likelihood function for the observed data is analytically intractable. Therefore, we estimate the models using recent maximum simulated likelihood (MSL) methods (Lee & Gørgens, 2021). To discuss this method some additional notation is needed. Let ci=0 indicate that the history for individual i is not left censored, let (0,bi1o) denote the analysis time of their observation period, and let bi1=(bi11,,bi1ki1) denote their event history, where bi1k is the analysis time when person i had event k and ki1 is their total number of events (possibly 0). The subscript 1 on these variables may seem redundant, but it simplifies the discussion later. Persons with ci=0 are under age 40 on 1 July 2002, and bi1 represents their event history from the date they turn 40 until 30 June 2012 or until the date of their death, whichever is earlier. In terms of Figure 1, the events in bi1 happen within the triangle bounded by the lines for age 40, for 30 June 2012, and for the person born in 1962. Let ci=1 indicate that the history for individual i is left censored, let (bi1o,bi2o) denote the analysis time of their observation period, and let (bi1u,bi2)=(bi11u,,bi1ki1u,bi21,,bi2ki2) denote their event history, where bi2 is observed and bi1u is unobserved. Persons with ci=1 are those who are over age 40 on 1 July 2002, and bi2 represents their event history from 1 July 2002 until 30 June 2012 or until the date of death, while bi1u represents their AMIs from age 40 to 1 July 2002. In terms of Figure 1, events in bi2 happen within the trapezium bounded by the lines for 1 July 2002, for 30 June 2012, for the person born in 1962 and for age 85. Events in bi1u happen within the triangle bounded by the lines for age 40, for 1 July 2002, and (not indicated) for a person born in 1917.

Let g1 denote density function of bi1, and let g2 denote density function of bi2 given bi1u. (The density for bi1u is the same as that for bi1.) They can be derived from the hazard functions given in Equations (1) and (2). Let H1(t|t-,θ)=t-th1(y | θ)dy for t>t- denote the value of the cumulative hazard function from time t- until time t. Similarly, define H2(t|t-,θ)=t-th2(y|t-,θ)dy for t>t-. Then the density g1 of bi1 evaluated at b1 when k1>1 is

(3)

If k1=1, the product over k in the middle is void, and if k1=0, then only the very last exponential term is present with H1 replacing H2. When k2>1, the conditional density g2 of bi2 given bi1u=b1 evaluated at b2 is

(4)

The modifications for individuals with other values of k2 are relatively straightforward; see Lee and Gørgens (2021) for details.

With these definitions, the log-likelihood function for N observed histories can be written

(5)

The first term in the sum on the right-hand side of Equation (5) is the likelihood contribution if individual i is non-left censored. The second term is the likelihood contribution if individual i is left censored. Here the integral is over the unobserved history.

In Equation (5), we assume that right censoring and AMI events are independent, and we do not model the process of right censoring explicitly. As mentioned, most individuals are right censored because the study period ends on 30 June 2012, but a small number are right censored when they die before 30 June 2012. If mortality risk and AMI risk are not independent, then the likelihood function is misspecified (see Section 4.4 below). However, the misspecification bias is likely to be small since the death rate is small.

4.2 MSL estimation

There are no closed-form solutions to the integral in Equation (5), and analytical evaluation of the likelihood function is not possible. Lee and Gørgens (2021) show that L(θ) can be evaluated using simulation methods. They discuss two importance sampling simulation methods, unnormalized (ISU) and normalized (ISN). For the ISU method, the simulated log-likelihood function is

(6)

where the bi1rs are simulated pseudo-event histories. The idea of importance sampling is to draw bi1r from g1(·|θ*) using a fixed θ* instead of drawing from g1(·|θ) using the θ at which the likelihood function is evaluated, and correct the ‘mismatch’ through the adjustment factor g1(bi1r|θ)/g1(bi1r|θ*). One of the advantages of using importance sampling is that the simulated likelihood function is continuous in θ, so gradient-based algorithms can be used to find the maximum. Since event timings depend on prior history, it is not possible to draw an entire history bi1r from g1(·|θ*) in a single step. Instead, it is necessary to draw the individual pseudo-event timings sequentially (i.e. first bi11r, then bi12r conditional on bi11r etc., stopping when the draw falls beyond the unobserved period); see Lee and Gørgens (2021) for details.

For the ISN method, the simulated log-likelihood function is essentially the same; the only difference is that the adjustment factors are normalized so that they sum to R. That is, g1(bi1r|θ)/g1(bi1r|θ*) in Equation (6) is replaced by

(7)

Since ISU and ISN are just different ways of approximating the exact likelihood function in Equation (5), both methods should provide similar results. However, there may be some differences in practice.

In general, the MSL estimation method is computationally burdensome, because a large number of draws is required in order to obtain a satisfactory approximation to the exact likelihood function for the observed data. In the present application, however, many people have the same unobserved periods and can share the same bi1r draws. Since birthdays are rounded at the monthly level, the total number of different unobserved periods is relatively small. Drawing bi1r for each of about 540 distinct unobserved periods rather than for each of the 96,724 to 885,183 individuals in the estimation samples (see Table 1) means that it is computationally feasible to set R very large.

There is little formal advice in the literature on how to choose θ* and R in general. In some simple settings, it is possible to improve computational efficiency greatly by choosing θ* such that rare outcomes occur either more or less often in the simulations than in the data (e.g. Hesterberg, 1995). However, it is difficult to relate these specific results to the present context. The informal advice is to choose R large enough that further increases have no practical effect on the estimates. We report our choices of θ* and R in the empirical section and provide sensitivity analysis in the supplementary materials.

As far as we are aware, there is no clear theoretical basis for choosing between the ISU and ISN estimators. Hesterberg (1995) simulates the properties of several normalized and unnormalized importance sampling estimators in a range of simple settings and found no estimator to be uniformly best. Lee and Gørgens (2021) carry out Monte Carlo experiments for designs closely related to the present application and find that the ISU and ISN estimates have similar distributional properties. For simplicity, we present only ISN estimates in this paper. In the supplementary materials, we benchmark the ISN likelihood functions against simulated likelihood functions that do not depend on importance sampling parameters and verify that the location of the maxima are not unduly influenced by the choice of θ*.

4.3 MSL standard errors

The MSL estimates are computed from a simulated likelihood function, and they are therefore affected by simulation noise in addition to the usual sampling noise. The simulation error in the likelihood function, and hence in the MSL estimates, is a decreasing function of R. The estimates presented in Section 5 are computed with R very high (R = 1000), and the contribution of simulation noise to the standard errors is therefore negligible. Accordingly, the reported standard errors are simply computed from the outer product of the simulated score functions without adjustments.

To show that the simulation noise is negligible, we report on four Monte Carlo experiments in Table 2. The data-generating processes (DGPs) are based on the empirical estimates presented later in the paper. Specifically, the sample sizes and the distribution of birthdays are the same as in each of the four subpopulations, and the DGP parameters are the estimated coefficients presented in Table 3 below (with AFAR = 40 and τ = 0.1). For simplicity, we set θ* equal to the DGP parameters.

TABLE 2

Monte Carlo experiments

EstimatorCoverage probabilitya
DGPBiasRSME90%95%99%
Male Maoris (N = 96724)
α10.5430.0010.0160.9160.9530.985
μ1−3.7000.0000.0300.9070.9630.990
α20.1710.0010.0200.9120.9580.992
γ1.0260.0020.0450.8970.9570.986
μ2−0.995−0.0050.0560.9170.9650.991
Male Europeans (N = 817967)
α10.6540.0010.0060.8960.9370.992
μ1−4.352−0.0010.0140.9050.9520.991
α20.5490.0000.0090.8950.9610.992
γ1.1320.0010.0170.9110.9470.989
μ2−2.462−0.0010.0340.8900.9450.992
Female Maoris (N = 108256)
α10.7200.0010.0170.8920.9440.990
μ1−4.5760.0000.0390.9000.9500.985
α20.2140.0010.0220.9060.9560.990
γ1.0760.0030.0530.8790.9340.983
μ2−1.024−0.0070.0730.9040.9450.990
Female Europeans (N = 885183)
α11.0110.0010.0080.8870.9480.986
μ1−6.114−0.0030.0240.8960.9440.987
α20.5680.0010.0140.9010.9500.992
γ1.0820.0010.0210.9040.9480.992
μ2−2.556−0.0080.0560.9080.9490.990
EstimatorCoverage probabilitya
DGPBiasRSME90%95%99%
Male Maoris (N = 96724)
α10.5430.0010.0160.9160.9530.985
μ1−3.7000.0000.0300.9070.9630.990
α20.1710.0010.0200.9120.9580.992
γ1.0260.0020.0450.8970.9570.986
μ2−0.995−0.0050.0560.9170.9650.991
Male Europeans (N = 817967)
α10.6540.0010.0060.8960.9370.992
μ1−4.352−0.0010.0140.9050.9520.991
α20.5490.0000.0090.8950.9610.992
γ1.1320.0010.0170.9110.9470.989
μ2−2.462−0.0010.0340.8900.9450.992
Female Maoris (N = 108256)
α10.7200.0010.0170.8920.9440.990
μ1−4.5760.0000.0390.9000.9500.985
α20.2140.0010.0220.9060.9560.990
γ1.0760.0030.0530.8790.9340.983
μ2−1.024−0.0070.0730.9040.9450.990
Female Europeans (N = 885183)
α11.0110.0010.0080.8870.9480.986
μ1−6.114−0.0030.0240.8960.9440.987
α20.5680.0010.0140.9010.9500.992
γ1.0820.0010.0210.9040.9480.992
μ2−2.556−0.0080.0560.9080.9490.990
a

Actual coverage rates for confidence intervals with nominal rates 90%, 95%, 99%. The DGP parameters are based on the estimates in Table 3 and AFAR = 40, τ = 0.1, θ*=DGP, R = 1000. Each experiment has 1000 replications, except convergence failed in 2 cases for male Europeans and 2 cases for female Europeans.

TABLE 2

Monte Carlo experiments

EstimatorCoverage probabilitya
DGPBiasRSME90%95%99%
Male Maoris (N = 96724)
α10.5430.0010.0160.9160.9530.985
μ1−3.7000.0000.0300.9070.9630.990
α20.1710.0010.0200.9120.9580.992
γ1.0260.0020.0450.8970.9570.986
μ2−0.995−0.0050.0560.9170.9650.991
Male Europeans (N = 817967)
α10.6540.0010.0060.8960.9370.992
μ1−4.352−0.0010.0140.9050.9520.991
α20.5490.0000.0090.8950.9610.992
γ1.1320.0010.0170.9110.9470.989
μ2−2.462−0.0010.0340.8900.9450.992
Female Maoris (N = 108256)
α10.7200.0010.0170.8920.9440.990
μ1−4.5760.0000.0390.9000.9500.985
α20.2140.0010.0220.9060.9560.990
γ1.0760.0030.0530.8790.9340.983
μ2−1.024−0.0070.0730.9040.9450.990
Female Europeans (N = 885183)
α11.0110.0010.0080.8870.9480.986
μ1−6.114−0.0030.0240.8960.9440.987
α20.5680.0010.0140.9010.9500.992
γ1.0820.0010.0210.9040.9480.992
μ2−2.556−0.0080.0560.9080.9490.990
EstimatorCoverage probabilitya
DGPBiasRSME90%95%99%
Male Maoris (N = 96724)
α10.5430.0010.0160.9160.9530.985
μ1−3.7000.0000.0300.9070.9630.990
α20.1710.0010.0200.9120.9580.992
γ1.0260.0020.0450.8970.9570.986
μ2−0.995−0.0050.0560.9170.9650.991
Male Europeans (N = 817967)
α10.6540.0010.0060.8960.9370.992
μ1−4.352−0.0010.0140.9050.9520.991
α20.5490.0000.0090.8950.9610.992
γ1.1320.0010.0170.9110.9470.989
μ2−2.462−0.0010.0340.8900.9450.992
Female Maoris (N = 108256)
α10.7200.0010.0170.8920.9440.990
μ1−4.5760.0000.0390.9000.9500.985
α20.2140.0010.0220.9060.9560.990
γ1.0760.0030.0530.8790.9340.983
μ2−1.024−0.0070.0730.9040.9450.990
Female Europeans (N = 885183)
α11.0110.0010.0080.8870.9480.986
μ1−6.114−0.0030.0240.8960.9440.987
α20.5680.0010.0140.9010.9500.992
γ1.0820.0010.0210.9040.9480.992
μ2−2.556−0.0080.0560.9080.9490.990
a

Actual coverage rates for confidence intervals with nominal rates 90%, 95%, 99%. The DGP parameters are based on the estimates in Table 3 and AFAR = 40, τ = 0.1, θ*=DGP, R = 1000. Each experiment has 1000 replications, except convergence failed in 2 cases for male Europeans and 2 cases for female Europeans.

Table 2 shows that the MSL estimators are nearly unbiased and the root mean square errors (RMSEs) are small, which is not surprising given the large sample sizes. Crucially, the empirical coverage probabilities of the confidence intervals are very close to the nominal targets. Thus simulation noise is not an important issue here.

In other applications, the influence of simulation noise can be examined in Monte Carlo studies like the ones we have presented here. If simulation noise is found to be an issue, perhaps the simplest solution is to increase R until the influence is deemed acceptable. This can be implemented as an asymptotically efficient two-step estimator, where R need only be huge in the second step (Hajivassiliou, 2000; Kristensen & Salanié, 2017).

4.4 Extension: Non-independent mortality

As mentioned, in our main analysis we assume that right censoring due to death and AMIs events is independent, and we do not model mortality. This is a reasonable simplification, given that the death rates are low. In principle, however, AMI-related mortality could affect the distribution of pseudo-histories for left-censored observations if people with more events are less likely to survive. To investigate whether our main results are robust to the simplifying assumption, we consider extended models that allow for AMI-related mortality. The extended models distinguish between deaths that are directly caused by an AMI event and other deaths. We refer to these deaths as AMI related and AMI unrelated. We maintain the assumption that AMI events and AMI-unrelated deaths (and other causes of right censoring) are independent.

Let di be an indicator variable with di=1 for an AMI-related death, and di=0 for an AMI-unrelated death and for right censoring without death. For simplicity, we allow AMI-related mortality risk to depend on time but assume it does not depend on AMI history prior to the event that caused death. Let f denote the conditional probability of dying given that an AMI event occurs:

(8)

where t is the time of the event and ζ=(ζ0,ζ1,ζ2).

Let bj be a history with kj events, either b1 or b2. The probability m evaluated at (bj,d) of surviving kj-1 AMI events and surviving (d ≠ 1) or not surviving (d = 1) the kjth event is

(9)

The log-likelihood function for the sample of people alive on 1 July 2002 can now be written as

(10)

where the prime on b1 is simply to distinguish it from the other integration dummy b1. For non-left-censored histories, the likelihood contribution is the density of (bi1,di). For left-censored histories, the likelihood contribution is the conditional density of (bi2,di) given survival until the start of the observation period b1o (i.e. the analysis time between AFAR and 1 July 2002). In particular, the integral in the denominator is the probability of surviving all AMI events until b1o.

If AMI events and AMI-related mortality are independent, then m(bj,d|ζ) does not depend on bj and the likelihood function becomes separable in θ and ζ. Therefore, Equation (5) is a special case of Equation (10) where independence is assumed and the terms involving ζ are omitted.

To evaluate the log-likelihood function in Equation (10), the unobserved histories can be integrated out using importance sampling simulation methods as before. For the ISU method, the simulated log-likelihood function is

(11)

with

(12)

where the (bi1r,dir)s are simulated pseudo-event histories. We increase the number of pseudo-histories for each i such that r=1Ri1(dir=0)=R. The ISN estimator is constructed by normalizing the ratio term in Equation (11) to sum to R across r, similar to Equation (7). The ISN estimator is simpler to compute than the ISU estimator, because the denominator in Equation (12) does not depend on r, so this term drops out of the normalized importance sampling adjustment factors and does not have to be computed.

4.5 Extension: Unobserved heterogeneity

Heterogeneity in risk can be considerable. Unfortunately, we do not have other risk markers in our data, be they biological, socioeconomic or behavioural factors. (See Lee and Gørgens, 2021, for a brief discussion of how to incorporate observed heterogeneity.) To investigate the role of unobserved individual heterogeneity, we estimate models with so-called random effects (frailty). Specifically, we include an unobserved random variable vN(0, 1) in the models.

To keep the notation simple, we reuse the names h1, h2, g1, g2 and L for analogous functions that depend on the random effect. We include the random effect linearly in the hazard functions as follows:

(13)
(14)

where the new parameters σ1 and σ2 represent the influence of the random effect and where θ has been augmented to include σ1 and σ2. For identification, we normalize σ10 and σ20.

To state the likelihood function, redefine the density g1 of bi1 in Equation (3) and the conditional density g2 of bi2 given bi1u=b1 in Equation (4) to depend on the random effect via the new h1 and h2. Let Φ denote the standard normal cumulative distribution function. Assuming right censoring is independent of AMI events and of random effects, the log-likelihood function for N observed histories can be written

(15)

This is similar to Equation (5), except here the random effect v is integrated out for each history in the sample. In fact, Equation (5) is the special case of Equation (15) where σ1=0 and σ2=0.

To evaluate the log-likelihood function in Equation (15), the integrals over the random effects are one-dimensional and can be handled by, for example, Gaussian quadrature, while the integral over the unobserved history can be handled using importance sampling simulation methods as before. For the ISU method, the simulated log-likelihood function is

(16)

where the pqs are Gauss–Hermite quadrature points and the wqs are the corresponding weights, and the bi1qrs are simulated pseudo-event histories conditional on v=pq. The bi1qrs are created as before, but separately for each pq value. The ISN estimator is constructed by normalizing the ratio term in Equation (16) to sum to R across r, as in Equation (7).

4.6 Alternative estimators

As mentioned in Section 1, common practice is simply to drop all left-censored histories from the analysis and base estimation on the subset of observations with ci=0. This non-left-censored (NLC) estimator leads to valid inference as long as censoring is independent and there are enough non-left-censored observations.

Another Heckman (HKM) estimator is based on Heckman’s (1981) approach to a related ‘initial conditions’ problem. By plugging in for g1 and g2, the likelihood function in Equation (5) can be broken into terms involving one of h1(t|θ), H1(t|t-,θ), h2(t|t-,θ), or H2(t|t-,θ) for different values of t and t-. Suppose we classify such terms of the likelihood function according to whether they are ‘computable’ or ‘uncomputable’, where uncomputable terms are those that depend on events that might have happened before 1 July 2002. Specifically, for left-censored histories we do not know whether the first observed event is the individual’s first actual event or a subsequent event, so we do not know whether (h1,H1) or (h2,H2) applies. Also, if the first observed event is within the first 1 year of the observation period, we do not know whether the person has had a recent event, so we do not know the value of Recent.

Adapting Heckman’s idea means specifying an auxiliary model for the uncomputable terms that does not require unobserved information. The auxiliary model can use whatever information is available about the person at time t. A simple specification is

(17)

where ξ=(α3,μ3). Let H3(t|t-,ξ)=t-th3(y|ξ)dy denote the value of the cumulative hazard function. The likelihood function is constructed by substituting h3(t|ξ) or H3(t|ξ) for all uncomputable terms and incorporating ξ into θ. The hope is that the resulting likelihood function is a good approximation to the true likelihood function in the dimensions of the parameters of interest.

In their Monte Carlo simulations, Lee and Gørgens (2021) find that the RMSEs can be several times higher for the NLC and HKM estimators compared to the ISN estimator in experiments where the degree of censoring is high and events are relatively rare. In particular, the NLC estimator tends to provide unreliable estimates of parameters related to the risk subsequent events, although the HKM estimator can be in the ballpark even if the RMSEs are high. (Neither the NLC nor the HKM estimator provide good estimates for models with random effects, with the estimation algorithm often failing to converge within a generous time limit.)

We provide NLC and HKM estimates for our models in the supplementary materials for comparison. As expected, they have much larger standard errors than do the ISN estimates and they tend to fit the data poorly. The HKM estimates are used as the importance sampling parameter θ* for our main ISN estimates. We verify in robustness analysis that this works well.

5 RESULTS

5.1 Time-specific risk

We begin by discussing estimates using basic models that ignore AMI-related mortality and unobserved heterogeneity. These models fit the overall patterns in the data very well, as shown below and in the supplementary materials. For the ISN estimation, we use HKM estimates as the importance sampling parameter θ* and take R = 1000. When computing ISN estimates, we generally try at least two different starting values for the optimization algorithm (e.g. θ*±0.2) in order to help discover multiple local maxima.

Table 3 reports ISN parameter estimates. Recall that μ1 captures the levels of AMI risk at age 40 for the 1962–1972 birth cohort. The estimates of μ1 in Table 3 are small (large negative), reflecting the low but not quite zero incidence rates around age 40 (c.f. Figure 2). The estimates of μ1 are not the same, however, and imply that male Maoris have the highest risk of experiencing the first event, followed by male Europeans and female Maoris whose risk is similar, while female Europeans have the lowest risk. Recall that α1 captures changes in the risk of experiencing the first AMI event across time, which here represent the effects of age as well as cohort and calendar time effects. The estimates of α1 in Table 3 are all positive, implying that the risk of having the first AMI event increases with time. The risk increase is lowest for male Maoris, followed by male Europeans and female Maoris, while female Europeans have the steepest profile. Since the estimate of μ1 is smallest and the estimate of α1 is largest for female Europeans, it is possible that the risk gap between female Europeans and the other groups vanishes for older people. However, as shown graphically below, the differences in the estimates of μ1 are too large and the differences in α1 are too small for this to happen within ordinary human lifetimes.

TABLE 3

Parameter estimates

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.540.650.721.01272.24
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.35−4.58−6.111445.42
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.170.550.220.57564.75
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.081.084.11
(0.06)(0.02)(0.06)(0.02)[0.25]
μ2−1.00−2.46−1.03−2.551012.44
(0.06)(0.03)(0.07)(0.05)[0.00]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.540.650.721.01272.24
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.35−4.58−6.111445.42
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.170.550.220.57564.75
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.081.084.11
(0.06)(0.02)(0.06)(0.02)[0.25]
μ2−1.00−2.46−1.03−2.551012.44
(0.06)(0.03)(0.07)(0.05)[0.00]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.
TABLE 3

Parameter estimates

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.540.650.721.01272.24
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.35−4.58−6.111445.42
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.170.550.220.57564.75
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.081.084.11
(0.06)(0.02)(0.06)(0.02)[0.25]
μ2−1.00−2.46−1.03−2.551012.44
(0.06)(0.03)(0.07)(0.05)[0.00]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.540.650.721.01272.24
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.35−4.58−6.111445.42
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.170.550.220.57564.75
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.081.084.11
(0.06)(0.02)(0.06)(0.02)[0.25]
μ2−1.00−2.46−1.03−2.551012.44
(0.06)(0.03)(0.07)(0.05)[0.00]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.

The estimates of γ are similar for all four groups and suggest that the risk during the first year after an event is nearly three times as large as the risk more than a year after the event (exp(1) ≈ 2.7). The parameter μ2 essentially reflects the risk of a subsequent event at age 40 for people who had an event at or before age 38 (i.e. >1 year before turning 40). Although this is out of sample, note that the estimates tend to be higher than the estimates for μ1. The estimates of α2 indicate that the risk for Europeans increases with time much faster than the risk for Maoris.

Table 3 also reports Wald test statistics for the joint null hypotheses that the respective parameters are the same across all four groups. For example, the Wald statistic for testing equality of α1 across the four groups is

where α̂1MM is the estimate for male Maoris etc. By standard maximum likelihood theory, the asymptotic distribution is χ2(3). As the table shows, the null is rejected for α1, α2, μ1 and μ2, but not for γ.

Since the model is non-linear and it is difficult to interpret some of the parameters, we translate the estimates in Table 3 into hazard rates. Figure 3 shows the estimated hazards for the four groups. Panel A shows the hazards of experiencing the first AMI event. Male Maoris have the highest risk, followed by male Europeans and female Maoris who have similar intermediate-level risk, while female Europeans have the lowest risk. Panel B shows the hazard of experiencing an event within 1 year of another, while panels C and C’ show the hazard of subsequent events after 1 year. (The only difference between C and C’ is the scale of the vertical axis.) It is clear that the risk of subsequent events is much larger than the risk of the first event, especially within 1 year of a previous event. This pattern suggests that either having an event makes the body more susceptible to further events (causal effect) or that having an event is a marker for being a high-risk person (unobserved heterogeneity), or both. Note that male Europeans and female Maoris do not have similar risk of subsequent events. Instead, the risk of subsequent events is similar for male and female Maoris on the one hand and male and female Europeans on the other hand.

Hazard functions (per 10 years) [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 3

Hazard functions (per 10 years) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

A different view is provided in Figure 4, which shows the risk of subsequent events relative to the risk of the first event for the four groups. In our flexible two-equation systems, the effect of having the first event is not restricted to be proportional to a given baseline risk. The relative change in risk before and after the first event depends on time and history:

(18)

Figure 4 shows that the relative risk of subsequent events is extremely large, but falls steeply with time. This is especially true for younger females, whose relative risk is literally off the scale. The relative risk is also large for male Maoris, and falls with time at a slower rate. By comparison, relative risk appears much flatter for male Europeans. The large values for younger people arise because the risk of having the first event is very low, so the denominator is small (μ1<μ2). The effect of time is negative, because the estimates imply α2<α1. A formal Wald tests of the proportional hazards restriction α2=α1 is rejected for all groups (p = 0.00). Clearly, the risk of the first and subsequent event should not be modelled in a single equation with proportional hazards.

Ratio of hazard functions for subsequent acute myocardial infarctions (AMIs) over first AMI [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 4

Ratio of hazard functions for subsequent acute myocardial infarctions (AMIs) over first AMI [Colour figure can be viewed at https://dbpia.nl.go.kr/]

5.2 Lifetime cumulative risk

So far we have compared outcomes across gender and ethnic groups in terms of time-specific risk. The estimated models allow us to simulate and compare synthetic lifetime cumulative outcomes.

Such an exercise raises two questions. First, the model is estimated on a cross-section and may not be representative of any particular birth cohort. As mentioned, there may have been significant changes in both life styles that affect risk and in the medical know-how in treating symptoms and preventing disease. However, as in other areas of demographic and epidemiological analysis, synthetic cohort analysis provides a useful summary of outcomes during a given period of time. Besides, we are comparing four groups which have lived in similar environments and experienced similar changes.

The second question is what to do about differences in mortality across the four groups. In reality, people do not live forever and mortality is likely to be related to both a person’s innate frailty and to their previous medical history. As discussed earlier, it is possible to augment the basic model with equations representing mortality risk. However, here we proceed by assuming no one dies until age 85, because this provides a better foundation for isolating and comparing the risk of AMI events. Were we to implement group-specific non-independent mortality, the internal composition of the four groups in terms of high-risk and low-risk people would vary over time at differential rates. As a result, the predicted outcomes would partly reflect differences in AMI risk and partly differences in mortality.

We compute lifetime cumulative outcomes by dynamically simulating AMI events for each gender and ethnic group over the age range from 40 to 85 using the ISN estimates in Table 3 as the data-generating process. Technically, the simulation method is similar to that used internally for computing the likelihood function when we estimate the models. We generate 100,000 individual pseudo-histories for each group.

Table 4 shows the frequency distributions of the lifetime cumulative number of AMI events for the four groups. The majority of people do not experience any events, but there are significant differences across groups. Male Maoris are most likely to ever have an AMI event (38%), followed by female Maoris and male Europeans whose probabilities are similar (30%), while female Europeans are least likely to experience any events (18%). All four distributions are heavily skewed to the right, with a small fraction of people experiencing a large number of events. The skewness implies that the mean number of events are much larger than the medians (which are all 0). Specifically, the average number of AMIs before age 85 is about 0.86 for male Maoris, 0.67 for female Maoris, 0.62 for male Europeans and 0.34 for female Europeans. When looking at the subsets of people with events, female Maoris experience the same number of events as male Maoris (2.3), higher than male Europeans (2.1) and female Europeans (1.9).

TABLE 4

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs)

Male MaorisMale EuropeansFemale MaorisFemale Europeans
Number of AMIs frequency distribution
061.8370.1570.3481.64
114.9412.6711.809.18
210.338.587.965.16
36.364.814.832.42
43.472.232.561.02
5+3.071.562.510.58
Number of AMIs
Mean0.860.620.670.34
Number of AMIs for those with at least one
Mean2.252.072.261.85
Male MaorisMale EuropeansFemale MaorisFemale Europeans
Number of AMIs frequency distribution
061.8370.1570.3481.64
114.9412.6711.809.18
210.338.587.965.16
36.364.814.832.42
43.472.232.561.02
5+3.071.562.510.58
Number of AMIs
Mean0.860.620.670.34
Number of AMIs for those with at least one
Mean2.252.072.261.85
Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 3 as the data-generating process.
TABLE 4

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs)

Male MaorisMale EuropeansFemale MaorisFemale Europeans
Number of AMIs frequency distribution
061.8370.1570.3481.64
114.9412.6711.809.18
210.338.587.965.16
36.364.814.832.42
43.472.232.561.02
5+3.071.562.510.58
Number of AMIs
Mean0.860.620.670.34
Number of AMIs for those with at least one
Mean2.252.072.261.85
Male MaorisMale EuropeansFemale MaorisFemale Europeans
Number of AMIs frequency distribution
061.8370.1570.3481.64
114.9412.6711.809.18
210.338.587.965.16
36.364.814.832.42
43.472.232.561.02
5+3.071.562.510.58
Number of AMIs
Mean0.860.620.670.34
Number of AMIs for those with at least one
Mean2.252.072.261.85
Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 3 as the data-generating process.

Since the estimated models fit the data very well, the incidence rates and cumulative hazard rates for the synthetic populations (not shown) are similar to those for the study populations presented in Figure 2. More detailed views of the average outcomes are provided in Figure 5. The left panel follows people as they age and shows the proportion who have ever had an AMI event (i.e. the extensive margin). The proportions are highest for male Maoris, since they have the highest time-specific AMI risk of the first event. Then follow female Maoris and male Europeans whose outcomes are similar, and finally female Europeans are least likely to have had any AMI events. The right panel shows the average number of AMI events people have had when they reach a certain age, for those who have experienced at least one AMI at that age (i.e. the intensive margin). Interestingly, male Europeans and female Maoris do not have similar experiences when conditioning on having had an event. Instead, the outcomes for female Maoris are very similar to those for male Maoris, while the outcomes for male Europeans are similar to those for female Europeans.

Synthetic lifetime cumulative outcomes [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 5

Synthetic lifetime cumulative outcomes [Colour figure can be viewed at https://dbpia.nl.go.kr/]

5.3 Non-independent mortality

This section briefly consider estimates from models that allow for non-independent mortality as discussed in Section 4.4. Again, we use HKM estimates as the importance sampling parameter θ* and take R = 1000.

Table 5 shows estimates for the four groups. The probability of death conditional on an event occurring is specified as the logistic function of time. Not surprisingly, the probability of dying of a heart attack is higher for older people, as indicated by the positive estimates of ζ1 and ζ2. As expected, the estimates of parameters related to AMI events are similar to those presented in Table 3. Figure 6 shows the hazard rates corresponding to the estimates in Table 3. The estimated hazard rates are nearly indistinguishable from those in Figure 3, although close inspection reveals that the estimated hazard rates are slightly higher for older people in Figure 6. Since their death rates are higher (see Table 1), it is perhaps not surprising that these differences are relatively higher for Maoris than for Europeans. In any case, the differences are very small and negligible for purposes of this paper.

Hazard functions (per 10 years) (models with acute myocardial infarction-related mortality) [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 6

Hazard functions (per 10 years) (models with acute myocardial infarction-related mortality) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

TABLE 5

Parameter estimates for models with acute myocardial infarction (AMI)-related mortality

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.570.680.741.03298.85
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.37−4.58−6.141548.01
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.190.560.230.58523.97
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.071.084.51
(0.06)(0.02)(0.06)(0.02)[0.21]
μ2−1.03−2.49−1.05−2.58994.63
(0.06)(0.03)(0.08)(0.05)[0.00]
ζ0−1.80−2.84−2.43−2.9362.42
(0.11)(0.08)(0.20)(0.15)[0.00]
ζ10.110.320.020.145.99
(0.11)(0.06)(0.17)(0.10)[0.11]
ζ20.040.020.080.064.34
(0.02)(0.01)(0.03)(0.02)[0.23]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.570.680.741.03298.85
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.37−4.58−6.141548.01
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.190.560.230.58523.97
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.071.084.51
(0.06)(0.02)(0.06)(0.02)[0.21]
μ2−1.03−2.49−1.05−2.58994.63
(0.06)(0.03)(0.08)(0.05)[0.00]
ζ0−1.80−2.84−2.43−2.9362.42
(0.11)(0.08)(0.20)(0.15)[0.00]
ζ10.110.320.020.145.99
(0.11)(0.06)(0.17)(0.10)[0.11]
ζ20.040.020.080.064.34
(0.02)(0.01)(0.03)(0.02)[0.23]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.
TABLE 5

Parameter estimates for models with acute myocardial infarction (AMI)-related mortality

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.570.680.741.03298.85
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.37−4.58−6.141548.01
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.190.560.230.58523.97
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.071.084.51
(0.06)(0.02)(0.06)(0.02)[0.21]
μ2−1.03−2.49−1.05−2.58994.63
(0.06)(0.03)(0.08)(0.05)[0.00]
ζ0−1.80−2.84−2.43−2.9362.42
(0.11)(0.08)(0.20)(0.15)[0.00]
ζ10.110.320.020.145.99
(0.11)(0.06)(0.17)(0.10)[0.11]
ζ20.040.020.080.064.34
(0.02)(0.01)(0.03)(0.02)[0.23]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.570.680.741.03298.85
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.70−4.37−4.58−6.141548.01
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.190.560.230.58523.97
(0.02)(0.01)(0.02)(0.01)[0.00]
γ1.021.131.071.084.51
(0.06)(0.02)(0.06)(0.02)[0.21]
μ2−1.03−2.49−1.05−2.58994.63
(0.06)(0.03)(0.08)(0.05)[0.00]
ζ0−1.80−2.84−2.43−2.9362.42
(0.11)(0.08)(0.20)(0.15)[0.00]
ζ10.110.320.020.145.99
(0.11)(0.06)(0.17)(0.10)[0.11]
ζ20.040.020.080.064.34
(0.02)(0.01)(0.03)(0.02)[0.23]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.

5.4 Unobserved heterogeneity

In this section we discuss extended models that include random effects in an attempt to decompose risk into time-related and dynamic factors on the one hand and unobserved heterogeneity on the other. This analysis is more speculative, because the decomposition is identified through ‘functional form’ assumptions. It is well known that random effect distributions are difficult to estimate, and estimates of σ1 and σ2 may not be very reliable in general. Nevertheless, the estimates may provide a ballpark measure of the influence of unobserved heterogeneity.

The ISN estimates of models with random effects use the estimates from the corresponding models without random effects as importance sampling parameter θ*, with the modification that lnσ1*=lnσ2*=0, and take R = 1000 and Q = 10.

In our case, the likelihood function is clearly maximized on the boundary of the parameter space where σ1=0 for both male and female Europeans. For male and female Maoris, the likelihood function has two local maxima, one with σ1=0 and one with σ1>0. The highest likelihood value is associated with the former for male Maoris and with the latter for female Maoris. On the other hand, the model fit is better with σ1=0 for both groups. For that reason, and to maintain comparability, we present the estimates with σ1=0 for all four groups. Estimates for models with σ1>0 are provided in the supplementary materials.

Table 6 reports ISN parameter estimates for the four groups. The estimates of lnσ2 imply σ21.5, which means that the hazard rate at the third quartile of the distribution of random effects is >7 times the hazard rate at the first quartile. This suggests that unobserved heterogeneity is important and some people are much more at risk of having AMI events than others. In other words, the risk of having events is not evenly distributed within the groups.

TABLE 6

Parameter estimates for models with random effects (σ1=0)

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.590.700.761.04269.71
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.72−4.40−4.60−6.151517.16
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.140.540.200.58300.72
(0.03)(0.01)(0.04)(0.02)[0.00]
γ0.730.820.740.723.32
(0.06)(0.02)(0.06)(0.03)[0.34]
μ2−1.44−2.99−1.52−3.13460.83
(0.10)(0.05)(0.12)(0.08)[0.00]
lnσ20.370.380.400.440.51
(0.06)(0.02)(0.07)(0.02)[0.92]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.590.700.761.04269.71
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.72−4.40−4.60−6.151517.16
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.140.540.200.58300.72
(0.03)(0.01)(0.04)(0.02)[0.00]
γ0.730.820.740.723.32
(0.06)(0.02)(0.06)(0.03)[0.34]
μ2−1.44−2.99−1.52−3.13460.83
(0.10)(0.05)(0.12)(0.08)[0.00]
lnσ20.370.380.400.440.51
(0.06)(0.02)(0.07)(0.02)[0.92]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.
TABLE 6

Parameter estimates for models with random effects (σ1=0)

Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.590.700.761.04269.71
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.72−4.40−4.60−6.151517.16
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.140.540.200.58300.72
(0.03)(0.01)(0.04)(0.02)[0.00]
γ0.730.820.740.723.32
(0.06)(0.02)(0.06)(0.03)[0.34]
μ2−1.44−2.99−1.52−3.13460.83
(0.10)(0.05)(0.12)(0.08)[0.00]
lnσ20.370.380.400.440.51
(0.06)(0.02)(0.07)(0.02)[0.92]
Male MaorisMale EuropeansFemale MaorisFemale EuropeansWald
α10.590.700.761.04269.71
(0.02)(0.01)(0.02)(0.01)[0.00]
μ1−3.72−4.40−4.60−6.151517.16
(0.03)(0.01)(0.04)(0.02)[0.00]
α20.140.540.200.58300.72
(0.03)(0.01)(0.04)(0.02)[0.00]
γ0.730.820.740.723.32
(0.06)(0.02)(0.06)(0.03)[0.34]
μ2−1.44−2.99−1.52−3.13460.83
(0.10)(0.05)(0.12)(0.08)[0.00]
lnσ20.370.380.400.440.51
(0.06)(0.02)(0.07)(0.02)[0.92]
Standard errors in parentheses and p-values in brackets. Wald: χ2 test statistic for the null hypothesis that the parameter is the same across groups.

Figure 7 shows the corresponding hazard functions evaluated at the median of the random effects. While the risk patterns and rankings are similar, the median hazard rates in Figure 7 are much lower than the overall rates in Figure 3. This reflects the non-linear properties of the hazard models Equations (13) and (14), where the symmetrically distributed random effects v are transformed through the exponential function, leading to a highly skewed distribution of hazard rates. Specifically, the risk for the median person is much lower than the mean risk.

Median hazard functions (per 10 years) (models with random effects) [Colour figure can be viewed at https://dbpia.nl.go.kr/]
FIGURE 7

Median hazard functions (per 10 years) (models with random effects) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Figures showing hazard rates at the first and third quartile of the distribution of the random effects are provided in the supplementary materials. Since the estimates of lnσ2 are similar for all four groups, the ranking of risk across groups is similar at the 75th percentile of the distribution of unobserved heterogeneity. The main impression from these graphs is that there are enormous differences between high- and low-risk people. Presumably it would be possible to explain or reduce this variation if data were available on individual biology, life style choices, etc.

Another way of summarizing the inequality in risk within groups is to simulate cumulative outcomes at different points in the distribution of random effects. The first panel in Table 7 shows the proportion of (synthetic) people who have ever had an AMI event between ages 40 and 85 (i.e. the extensive margin). Since σ1=0, this proportion is the same throughout the random effects distribution. The estimates are similar to those in Table 4. The second panel shows the average total number of AMI events for those (synthetic) people who have at least one AMI in their lifetime (i.e. the intensive margin). Since σ2>0 for all groups, the numbers are different at the three quartiles of the random effects distribution. At the lower quartile, people total about 1.2 events. At the upper quartile, the numbers range from 2.3 to 2.9 across the four groups. Comparison with Table 4 shows that allowing for random effects reduces the estimated differences in the intensive margin at the 25th and 50th percentiles, while the differences are a bit larger between the four groups at the 75th percentile of the distribution of random effects.

TABLE 7

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs) (models with random effects)

Male MaorisMale EuropeansFemale MaorisFemale Europeans
Proportion of people with at least one AMI (%)
vN(0,1)41.432.132.219.3
Average number of AMIs for people with at least one AMI
v = -0.6741.241.201.231.15
v = 01.641.541.651.44
v = 0.6742.812.542.882.31
Male MaorisMale EuropeansFemale MaorisFemale Europeans
Proportion of people with at least one AMI (%)
vN(0,1)41.432.132.219.3
Average number of AMIs for people with at least one AMI
v = -0.6741.241.201.231.15
v = 01.641.541.651.44
v = 0.6742.812.542.882.31
Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 6 as the data-generating process.
TABLE 7

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs) (models with random effects)

Male MaorisMale EuropeansFemale MaorisFemale Europeans
Proportion of people with at least one AMI (%)
vN(0,1)41.432.132.219.3
Average number of AMIs for people with at least one AMI
v = -0.6741.241.201.231.15
v = 01.641.541.651.44
v = 0.6742.812.542.882.31
Male MaorisMale EuropeansFemale MaorisFemale Europeans
Proportion of people with at least one AMI (%)
vN(0,1)41.432.132.219.3
Average number of AMIs for people with at least one AMI
v = -0.6741.241.201.231.15
v = 01.641.541.651.44
v = 0.6742.812.542.882.31
Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 6 as the data-generating process.

5.5 Goodness of fit and robustness

Simulating histories from the estimated models in Table 3 can also be used as an informal check of their within-sample fit. For this, we draw five random histories for each of the individuals in the estimation samples, taking their date of birth and their observation period as given (including the date of death if they die before 30 June 2012). We then compute and compare summary statistics for the simulated samples and the estimation samples.

The first panel in Table 8 shows the actual and the simulated distributions of AMI events people experience during their observation period. The fit is very good, although all the estimated models tend to slightly underpredict the proportion of people with a single event and slightly overpredict the proportion with two or more events.

TABLE 8

Within-sample goodness-of-fit statistics

Male MaorisMale EuropeansFemale MaorisFemale Europeans
DataModelDataModelDataModelDataModel
Number of AMIs frequency distribution
094.4494.5594.4294.5496.4696.5497.0597.10
14.673.984.694.112.902.412.462.13
20.671.130.681.010.460.770.370.57
3+0.210.340.220.340.190.290.130.20
Incidence rates
Age 40–440.030.030.010.020.010.010.000.00
Age 45–490.050.050.020.020.020.020.010.01
Age 50–540.070.070.040.030.040.040.010.01
Age 55–590.100.100.050.050.060.060.020.02
Age 60–640.140.140.080.080.100.090.030.03
Age 65–690.190.190.110.110.140.130.050.05
Age 70–740.220.270.160.180.190.200.090.09
Age 75–790.300.350.250.280.230.310.160.17
Age 80+0.370.480.390.450.340.450.270.31
Cumulative hazard rates
Age 500.040.040.020.020.020.020.000.00
Age 550.070.080.040.040.030.040.010.01
Age 600.120.130.060.060.070.070.020.02
Age 650.200.200.100.100.110.110.030.03
Age 700.290.300.160.160.180.180.050.06
Age 750.400.430.240.250.280.290.100.10
Age 800.550.610.370.390.400.440.180.19
Age 850.750.860.570.620.570.670.310.34
Male MaorisMale EuropeansFemale MaorisFemale Europeans
DataModelDataModelDataModelDataModel
Number of AMIs frequency distribution
094.4494.5594.4294.5496.4696.5497.0597.10
14.673.984.694.112.902.412.462.13
20.671.130.681.010.460.770.370.57
3+0.210.340.220.340.190.290.130.20
Incidence rates
Age 40–440.030.030.010.020.010.010.000.00
Age 45–490.050.050.020.020.020.020.010.01
Age 50–540.070.070.040.030.040.040.010.01
Age 55–590.100.100.050.050.060.060.020.02
Age 60–640.140.140.080.080.100.090.030.03
Age 65–690.190.190.110.110.140.130.050.05
Age 70–740.220.270.160.180.190.200.090.09
Age 75–790.300.350.250.280.230.310.160.17
Age 80+0.370.480.390.450.340.450.270.31
Cumulative hazard rates
Age 500.040.040.020.020.020.020.000.00
Age 550.070.080.040.040.030.040.010.01
Age 600.120.130.060.060.070.070.020.02
Age 650.200.200.100.100.110.110.030.03
Age 700.290.300.160.160.180.180.050.06
Age 750.400.430.240.250.280.290.100.10
Age 800.550.610.370.390.400.440.180.19
Age 850.750.860.570.620.570.670.310.34
Data: Computed directly from the estimation sample. Model: Computed from simulating outcomes using the estimates in Table 3 as the data-generating process and taking the individual observation periods in the estimation sample as given.
TABLE 8

Within-sample goodness-of-fit statistics

Male MaorisMale EuropeansFemale MaorisFemale Europeans
DataModelDataModelDataModelDataModel
Number of AMIs frequency distribution
094.4494.5594.4294.5496.4696.5497.0597.10
14.673.984.694.112.902.412.462.13
20.671.130.681.010.460.770.370.57
3+0.210.340.220.340.190.290.130.20
Incidence rates
Age 40–440.030.030.010.020.010.010.000.00
Age 45–490.050.050.020.020.020.020.010.01
Age 50–540.070.070.040.030.040.040.010.01
Age 55–590.100.100.050.050.060.060.020.02
Age 60–640.140.140.080.080.100.090.030.03
Age 65–690.190.190.110.110.140.130.050.05
Age 70–740.220.270.160.180.190.200.090.09
Age 75–790.300.350.250.280.230.310.160.17
Age 80+0.370.480.390.450.340.450.270.31
Cumulative hazard rates
Age 500.040.040.020.020.020.020.000.00
Age 550.070.080.040.040.030.040.010.01
Age 600.120.130.060.060.070.070.020.02
Age 650.200.200.100.100.110.110.030.03
Age 700.290.300.160.160.180.180.050.06
Age 750.400.430.240.250.280.290.100.10
Age 800.550.610.370.390.400.440.180.19
Age 850.750.860.570.620.570.670.310.34
Male MaorisMale EuropeansFemale MaorisFemale Europeans
DataModelDataModelDataModelDataModel
Number of AMIs frequency distribution
094.4494.5594.4294.5496.4696.5497.0597.10
14.673.984.694.112.902.412.462.13
20.671.130.681.010.460.770.370.57
3+0.210.340.220.340.190.290.130.20
Incidence rates
Age 40–440.030.030.010.020.010.010.000.00
Age 45–490.050.050.020.020.020.020.010.01
Age 50–540.070.070.040.030.040.040.010.01
Age 55–590.100.100.050.050.060.060.020.02
Age 60–640.140.140.080.080.100.090.030.03
Age 65–690.190.190.110.110.140.130.050.05
Age 70–740.220.270.160.180.190.200.090.09
Age 75–790.300.350.250.280.230.310.160.17
Age 80+0.370.480.390.450.340.450.270.31
Cumulative hazard rates
Age 500.040.040.020.020.020.020.000.00
Age 550.070.080.040.040.030.040.010.01
Age 600.120.130.060.060.070.070.020.02
Age 650.200.200.100.100.110.110.030.03
Age 700.290.300.160.160.180.180.050.06
Age 750.400.430.240.250.280.290.100.10
Age 800.550.610.370.390.400.440.180.19
Age 850.750.860.570.620.570.670.310.34
Data: Computed directly from the estimation sample. Model: Computed from simulating outcomes using the estimates in Table 3 as the data-generating process and taking the individual observation periods in the estimation sample as given.

The second and third panels in Table 8 show the incidence rates and cumulative hazard rates for the estimation sample and for simulated data. (The supplementary materials provide graphical comparisons of these incidence rates and cumulative hazard rates.) Again, the fit is remarkably good, although the models slightly overpredict the number of events for older people. The proportions of older people in the estimation samples are small, especially for Maoris, so a slightly higher estimation uncertainty and a slightly poorer fit for them is expected.

Naturally, allowing for AMI-related mortality (see Table 5) reduces the predicted number of people with many events. However, we show in the supplementary materials that the effect is small. In terms of the age-specific unconditional probability of dying of AMI and the age-specific conditional probability of dying given that an AMI event occurs, these models fit the data very well.

The supplementary materials also include a discussion of goodness of fit for the estimated models with random effects in Table 6. Despite their much higher likelihood values and the statistical significance of the estimate of lnσ2, the models with random effects do not fit the data as well as the models without random effects. In particular, while they get the proportion of people with no events about right, they overestimate risk and overpredict the total number of events in the population to a far greater extent.

The ISN estimates rely on the choice of θ*, R, and Q. The supplementary materials present estimation results for using first-stage importance sampling estimates as importance sampling parameters θ* in a second stage or taking R = 3000 or taking Q = 12 yields similar estimates. We also consider lowering AFAR to 35 years. None of these changes have any practical influence on results presented here.

As mentioned, the 1 year cut-off for Recent is arbitrary but in accordance with the literature. In the supplementary materials, we show that lowering the cut-off to 6 months (τ = 0.05) leads to an increase in the estimates of γ and μ2, whereas raising the cut-off to 2 years (τ = 0.2) leads to a decrease. This pattern is consistent with a world where the risk of further events increases sharply immediately after an event and gradually tapers off thereafter. However, the practical implications and the overall fit of the models are similar for the three different value of τ.

The time needed to compute the MSL estimate varies greatly depending on the sample sizes, the choice of θ*, R and Q, the starting values for the optimization algorithm, and the computing resources available. As an indication, the estimates in Table 3 take 1–7 min each to compute while those in Table 6 take 70–266 min on a 2016 workstation with 8 dual core CPUs and 64 GB of RAM. Some of the estimates shown in the supplementary materials take 6–7 h.

6 CONCLUSION

In this study, we investigate gender and ethnic disparities in several aspects of AMI risk in New Zealand. Our study complements the literature on gender and ethnic disparities in other areas of public health and health economics. AMIs, commonly known as heart attacks, are one of the leading causes of disability and mortality and therefore an important issue in public health.

We estimate hazard models of AMI risk separately for male and female people of Maori and European descent using high-quality administrative data on hospital admissions and death registrations combined with census data. The main challenge with analysing these data is the high degree of left censoring. For left-censored observations, the early part of the event histories is missing. Left censoring therefore poses a problem for studying history dependence and dynamic effects. Ad hoc approaches to handling left censoring do not yield reliable estimates. Recent methodological advances allow us to overcome the left-censoring problem effectively, by using MSL estimation and importance sampling methods.

Using these hazard models, we examine how AMI risk is distributed within each subpopulation and we compare patterns across subpopulations. We find, as expected, that older people have much higher risk than younger people. This partly reflects biological effects and partly time effects as different cohorts have been exposed to different environments, made different life style choices, and had access to different medical technologies. We find that there are important dynamic effects in that the risk of experiencing the first AMI event is much lower than the risk of having subsequent events, and the risk is particularly high in the first year following an event. The risks of the first and subsequent events are not proportional across time. In addition, we find that there is considerable within-group variation in risk, as measured by the influence of random effects. These two aspects, history dependence and unobserved heterogeneity, pull in the same direction of concentrating risk among relatively fewer people within the subpopulations.

Comparing the four gender and ethnic groups, we discover large disparities in AMI time-specific risk between male and female people of Maori and European descent. For the first event, the ranking is that male Maoris tend to have the highest risk, followed by female Maoris and male Europeans whose risk levels are similar, and finally female Europeans have the lowest risk levels. The pattern is somewhat different for subsequent events, where male and female Maoris have similar and high time-specific risk levels on the one hand while male and female Europeans have similar and comparatively lower risk levels on the other hand.

Using the estimated hazard models, we compute lifetime cumulative outcomes for the synthetic cohorts. Here, the gender and ethnic disparities are also clear, and the ranking between groups reflects the ranking of time-specific risk. The proportion of people who experience any event (the extensive margin) is highest for male Maoris, followed by female Maoris and male Europeans who have similar proportions, while the proportion of female Europeans who experience at least one event is the smallest. The average number of events for those who experience at least one (the intensive margin) is highest for male and female Maoris and lowest for male and female Europeans.

Our findings motivate further research on gender and ethnic disparities. For example, if the data are found or become available, it would be interesting and useful to investigate potential biological, socioeconomic or environmental factors which may explain some of the heterogeneity in risk.

Supporting Information

Additional supporting information may be found in the online version of the article at the publisher’s website.

ACKNOWLEDGEMENTS

This research was supported in part by Australian Research Council Grant DP1096862 and BK21 FOUR (Fostering Outstanding Universities for Research) funded by the Ministry of Education (MOE) of Korea and National Research Foundation (NRF) of Korea. We thank joint editor James Carpenter and three anonymous referees for comments and suggestions that greatly improved the paper.

REFERENCES

Alderman
,
M.H.
,
Cohen
,
H.W.
&
Madhavan
,
S.
(
2000
)
Myocardial infarction in treated hypertensive patients: the paradox of lower incidence but higher mortality in young blacks compared with whites
.
Circulation
,
101
(
10
),
1109
1114
.

Anand
,
S.S.
,
Yusuf
,
S.
,
Vuksan
,
V.
,
Devanesen
,
S.
,
Teo
,
K.K.
,
Montague
,
P.A.
et al. (
2000
)
Differences in risk factors, atherosclerosis, and cardiovascular disease between ethnic groups in Canada: the study of health assessment and risk in ethnic groups (SHARE)
.
Lancet
,
356
(
9226
),
279
284
.

Avendano
,
M.
&
Soerjomataram
,
I.
(
2008
)
Monitoring trends in acute coronary syndromes: can we use hospital admission registries
?
Heart
,
94
(
12
),
1524
1525
.

Bhuller
,
M.
,
Brinch
,
C.N.
&
Königs
,
S.
(
2017
)
Time aggregation and state dependence in welfare receipt
.
Economic Journal
,
127
(
604
),
1833
1873
.

Chan
,
W.C.
,
Wright
,
C.
,
Riddell
,
T.
,
Wells
,
S.
,
Kerr
,
A.J.
,
Gala
,
G.
et al. (
2008
)
Ethnic and socioeconomic disparities in the prevalence of cardiovascular disease in New Zealand
.
New Zealand Medical Journal
,
121
(
1285
),
11
20
.

Chan
,
W.C.
,
Wright
,
C.
,
Tobias
,
M.
,
Mann
,
S.
&
Jackson
,
R.
(
2008
)
Explaining trends in coronary heart disease hospitalisations in New Zealand: admissions and incidence can trend in opposite directions
.
Heart
,
94
(
12
),
1589
1593
.

Chang
,
W.C.
,
Kaul
,
P.
,
Fu
,
Y.
,
Westerhout
,
C.M.
,
Granger
,
C.B.
,
Mahaffey
,
K.W.
et al. (
2006
)
Forecasting mortality: dynamic assessment of risk in ST-segment elevation acute myocardial infarction
.
European Heart Journal
,
27
(
4
),
419
426
.

Cockx
,
B.
&
Picchio
,
M.
(
2013
)
Scarring effects of remaining unemployed for long-term unemployed school-leavers
.
Journal of the Royal Statistical Society Series A
,
176
(
4
),
951
980
.

Doiron
,
D.
&
Gørgens
,
T.
(
2008
)
State dependence in youth labor market experiences, and the evaluation of policy interventions
.
Journal of Econometrics
,
145
,
81
97
.

Gottlieb
,
S.
,
Harpaz
,
D.
,
Shotan
,
A.
,
Boyko
,
V.
,
Leor
,
J.
,
Cohen
,
M.
et al. (
2000
)
Sex differences in management and outcome after acute myocardial infarction in the 1990s: a prospective observational community-based study
.
Circulation
,
102
(
20
),
2484
2490
.

Hajivassiliou
,
V.A.
(
2000
) Some practical issues in maximum simulated likelihood. In:
Mariano
,
R.
,
Weeks
,
M.
&
Schuermann
,
T.
(Eds),
Simulation-based inference in econometrics: methods and applications
.
Cambridge
:
Cambridge University Press
.

Heckman
,
J.J.
(
1981
) The incidental parameter problem and the problem of initial conditions in estimating a discrete time–discrete data stochastic process. In:
Manski
,
C.F.
&
McFadden
,
D.
(Eds),
Structural analysis of discrete data with econometric applications
,
Cambridge, Massachusetts
:
MIT Press
, pp.
179
195
.

Herman
,
B.
,
Greiser
,
E.
&
Pohiabeln
,
H.
(
1997
)
A sex difference in short-term survival after initial acute myocardial infarction: the MONICA-Bremen acute myocardial infarction register, 1985–1990
.
European Heart Journal
,
18
(
6
),
963
970
.

Hesterberg
,
T.
(
1995
)
Weighted average importance sampling and defensive mixture distributions
.
Technometrics
,
37
(
2
),
185
194
.

Hougaard
,
P.
(
1986
)
Survival models for heterogeneous populations derived from stable distributions
.
Biometrika
,
73
(
2
),
387
396
.

Kauf
,
T.L.
,
Velazquez
,
E.J.
,
Crosslin
,
D.R.
,
Weaver
,
W.D.
,
Diaz
,
R.
,
Granger
,
C.B.
et al. (
2006
)
The cost of acute myocardial infarction in the new millennium: evidence from a multinational registry
.
American Heart Journal
,
151
(
1
),
206
212
.

Klein
,
J.P.
,
Moeschberger
,
M.
,
Li
,
Y.H.
,
Wang
,
S.T.
&
Flournoy
,
N.
(
1992
) Estimating random effects in the framingham heart study. In:
Survival Analysis: State of the Art
,
Berlin
:
Springer
, pp.
99
120
.

Kristensen
,
D.
&
Salanié
,
B.
(
2017
)
Higher-order properties of approximate estimators
.
Journal of Econometrics
,
198
(
2
),
189
208
.

Kytö
,
V.
,
Sipilä
,
J.
&
Rautava
,
P.
(
2015
)
Gender and in-hospital mortality of ST-segment elevation myocardial infarction (from a multihospital nationwide registry study of 31,689 patients)
.
American Journal of Cardiology
,
115
(
3
),
303
306
.

Lee
,
S.
&
Gørgens
,
T.
(
2021
)
Estimation of dynamic models of recurrent events with censored data
.
Econometrics Journal
,
24
(
2
),
199
224
.

MacIntyre
,
K.
,
Stewart
,
S.
,
Capewell
,
S.
(
2001
)
Gender and survival: a populationbased study of 20,1114 men and women following a first acute myocardial infarction
.
Journal of the American College of Cardiology
,
38
(
3
),
729
735
.

Madrigano
,
J.
,
Mittleman
,
M.A.
,
Baccarelli
,
A.
,
Goldberg
,
R.
,
Melly
,
S.
,
von Klot
,
S.
et al. (
2013
)
Temperature, myocardial infarction, and mortality: effect modification by individual and area-level characteristics
.
Epidemiology
,
24
(
3
),
439
446
.

Mak
,
K.H.
,
Chia
,
K.S.
,
Kark
,
J.D.
,
Chua
,
T.
,
Tan
,
C.
,
Foong
,
B.H.
et al. (
2003
)
Ethnic differences in acute myocardial infarction in Singapore
.
European Heart Journal
,
24
(
2
),
151
160
.

Manhapra
,
A.
,
Canto
,
J.G.
,
Vaccarino
,
V.
,
Parsons
,
L.
,
Kiefe
,
C.I.
,
Barron
,
H.V.
et al. (
2004
)
Relation of age and race with hospital death after acute myocardial infarction
.
American Heart Journal
,
148
(
1
),
92
98
.

Ministry of Health
. (
2010
)
Health expenditure trends in New Zealand 1997–2007
.
Wellington
.

Morrow
,
D.A.
(
2017
)
Myocardial infarction: a companion to braunwald’s heart disease E-book
.
Amsterdam
:
Elsevier Health Sciences
.

Naghavi
,
M.
,
Wang
,
H.
,
Lozano
,
R.
,
Davis
,
A.
,
Liang
,
X.
,
Zhou
,
M.
et al. (
2015
)
Global, regional, and national age-sex specific allcause and cause-specific mortality for 240 causes of death, 1990–2013: A systematic analysis for the Global Burden of Disease Study 2013
.
Lancet
385
(
9963
),
117
171
.

National Health Board Business Unit
. (
2011
)
National minimum dataset (hospital events) data dictionary
.
Wellington
:
Ministry of Health
.

National Health Committee
. (
2013
)
Strategic overview: cardiovascular disease in New Zealand (working draft)
.

Nguyen
,
H.L.
,
Ha
,
D.A.
,
Phan
,
D.T.
,
Nguyen
,
Q.N.
,
Nguyen
,
V.L.
,
Nguyen
,
N.H.
et al. (
2014
)
Sex differences in clinical characteristics, hospital management practices, and in-hospital outcomes in patients hospitalized in a Vietnamese hospital with a first acute myocardial infarction. e95631
.
PLOS ONE
9
(
4
),

Pokorney
,
S.D.
,
Rodriguez
,
J.F.
,
Ortiz
,
J.T.
,
Lee
,
D.C.
,
Bonow
,
R.O.
&
Wu
,
E.
(
2012
)
Infarct healing is a dynamic process following acute myocardial infarction
.
Journal of Cardiovascular Magnetic Resonance
,
14
(
1
),
62
.

Smolina
,
K.
,
Wright
,
F.L.
,
Rayner
,
M.
&
Goldacre
,
M.J.
(
2012
)
Incidence and 30-day case fatality for acute myocardial infarction in England in 2010: National-linked database study
.
European Journal of Public Health
,
22
(
6
),
848
853
.

Vaccarino
,
V.
,
Rathore
,
S.S.
,
Wenger
,
N.K.
,
Frederick
,
P.D.
,
Abramson
,
J.L.
,
Barron
,
H.V.
et al. (
2005
)
Sex and racial differences in the management of acute myocardial infarction, 1994 through 2002
.
New England Journal of Medicine
,
353
(
7
),
671
682
.

Vlaar
,
P.J.
,
Svilaas
,
T.
,
der Horst
,
I.C.C.
,
Diercks
,
G.F.
,
Fokkema
,
M.L.
,
de Smet
,
B.J.
et al. (
2008
)
Cardiac death and reinfarction after 1 year in the Thrombus Aspiration during Percutaneous coronary intervention in Acute myocardial infarction study (TAPAS): a 1-year follow-up study
.
Lancet
,
371
(
9628
),
1915
1920
.

Wang
,
O.J.
,
Wang
,
Y.
,
Chen
,
J.
&
Krumholz
,
H.M.
(
2012
)
Recent trends in hospitalization for acute myocardial infarction
.
American Journal of Cardiology
,
109
(
11
),
1589
1593
.

Wienke
,
A.
(
2010
)
Frailty models in survival analysis
.
Boca Raton
:
Chapman and Hall/CRC
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data