Using Maximum Simulated Likelihood Methods to Overcome Left Censoring: Dynamic Event History Models of Heart Attack Risk in New Zealand

Lee, Sanghyeok; Gørgens, Tue

doi:10.1111/rssa.12758

Abstract

This paper describes how the risk of experiencing heart attacks varies across gender and ethnicity in New Zealand. We estimate dynamic hazard models using administrative data. We deal with left-censored data using recently developed maximum simulated likelihood methods. The models allow risk to vary with age, previous heart attack history and unobserved individual heterogeneity. We find that the risk of subsequent events is far higher than the risk of the first event, particularly high within 1 year after an event, and that unobserved heterogeneity is important. Generally, male Maoris have the highest risk, followed by female Maoris, then ethnically European males, while ethnically European females have the lowest risk.

event history, frailty, hazard models, importance sampling, left censoring, maximum simulated likelihood estimation, missing data, Myocardial infarction, random effects, unobserved heterogeneity

1 INTRODUCTION

Acute myocardial infarction (AMI), or heart attacks, are an important public health issue. It has been estimated that the cost of a person’s first AMI-related hospitalization is about 4500 USD for 1999–2001 in New Zealand (Kauf et al., 2006), which is more than twice as large as the total health expenditure per capita of about 1,700 USD in 2000/2001 (Ministry of Health, 2010). AMI is the most important subclass of the ischaemic heart diseases, followed by angina (Morrow, 2017). In New Zealand, about 110 million USD (228 million NZD) was spent on treating ischaemic heart diseases in hospitals in 2002/2003 (National Health Committee, 2013). Ischaemic heart diseases are a leading cause of death in all of the developed world (Naghavi et al., 2015). Using hospital admission data, death registrations and census data, we estimate that about 10% of all deaths in New Zealand during 2002–2012 were directly caused by AMI.

This paper describes and quantifies how the risk of experiencing AMI events varies across gender and ethnicity in New Zealand. We consider four groups, namely male and female people of Maori and European descent (referred to as ‘Maoris’ and ‘Europeans’ in the remainder of the paper). Our analysis is challenged by a high degree of missing data, which we handle by using maximum simulated likelihood (MSL) estimation. The data set is constructed by combining nationwide administrative data on hospital admissions and death registrations during 2002–2012 with census data from 2001 such that our analysis data have the same population by age as the 2001 census. We consider several aspects of risk. Using event history (hazard) models, we investigate how risk is related to time (i.e. age and cohort), previous AMI history and unobserved individual heterogeneity (‘random effects’ in the econometrics literature, ‘frailty’ in the statistics literature). Furthermore, we consider three regimes of risk: the risk of experiencing the first AMI, the risk of a subsequent AMI within 1 year following a previous attack, and the risk of a subsequent AMI >1 year after a previous attack. We discuss time-specific risk and, taking a synthetic cohort approach, we compute lifetime cumulative outcomes until age 85.

In general, the estimation of dynamic event history models is hampered by problems of missing data. Often the data available concern the events that happened for a given population during a given period, and there is no information about events that happened before or after the observation period. These problems are called left censoring and right censoring respectively. In the present study, left censoring means that we do not know which regime an individual is in during the first part of their observation period, because we do not know whether the individual experienced any events prior to their observation period, and if they did, whether that event is within 1 year of becoming under observation. In general, dealing with right censoring is relatively straightforward, but left censoring is a very difficult problem. Most authors simply drop all left-censored histories from the analysis (e.g. Bhuller et al., 2017; Cockx & Picchio, 2013; Doiron & Gørgens, 2008), which can lead to large data and efficiency losses. Here we overcome the left-censoring problem by estimating the models using recently developed MSL methods (Lee & Gørgens, 2021). The idea is that the same model that is used to explain the observed patterns in the data can be used to simulate events that are unobserved. Intuitively, the missing data are integrated out of the so-called complete-data likelihood function by simulating pseudo-histories for each person whose history is left censored. In principle, the simulation technique allows us to compute an arbitrarily good approximation to the exact likelihood function for the data that are observed. In practice, available computing resources and time restrict the feasible accuracy.

Regarding right censoring, for our basic estimates we assume that the individual observation periods and the events are independent. In most cases, the observation period ends when the study period ends on 30 June 2012; however, some people are right censored when they die from an AMI event. We provide estimates from extended models to show that the influence of AMI-related mortality is negligible. Note that since we observe the cause of death, AMI events are not systematically underreported in our data.

The empirical results can be briefly summarized as follows. Our main finding is that there are large gender and ethnic disparities in AMI risk. The general ranking is that male Maoris tend to have the highest risk, followed by female Maoris, then male Europeans, and finally female Europeans have the lowest risk. This is true both for time-specific risk and from the perspective of synthetic lifetime cumulative risk. Not surprisingly, the AMI risk increases strongly with time. Regarding history dependence, in terms of the three regimes the risk is lowest for the first AMI event, highest for events within 1 year after an event, while still high for events >1 year after an event. Finally, it is notoriously difficult to obtain reliable estimates of the influence of unobserved heterogeneity, but our results suggest that within-group variation in risk is far greater than between-group differences. Unfortunately, our data do not allow us to explore which biological, socioeconomic, behavioural or other factors might explain these disparities across gender and ethnicity.

The rest of the paper is organized as follows. Section 2 provides a review of related literature. In Section 3, we describe our data. In Section 4, we discuss the model specification and the estimation methods. In Section 5, we present and discuss the estimation results. Section 6 concludes.

2 RELATED LITERATURE

The literature on gender and ethnic differences in the incidence and reoccurrence of AMIs is relatively small. Wang et al. (2012) compare AMI incidence rates in the United States and discover that male whites have the highest risk, followed by male blacks, then female blacks, and female whites have the lowest risk. Smolina et al. (2012) find a gender gap for first AMIs but no gap for subsequent AMIs in the United Kingdom. For New Zealand, Chan, Wright, Tobias et al. (2008) also distinguish between first and subsequent AMIs and find that the rate of AMI readmissions per 100,000 population increases during the 1990s; however, they do not consider gender and ethnic differences. The policy implications of high incidence rates for first and subsequent AMI are different: the former supports primary prevention and the latter supports secondary prevention (Avendano & Soerjomataram, 2008). The distinction between the first and subsequent AMIs is becoming more important as more people survive AMIs and are at risk of having subsequent AMIs. Chan, Wright, Riddell, et al. (2008) study differences in the prevalence of cardiovascular diseases (including AMI) across gender and ethnic groups in New Zealand, but do not distinguish between first and subsequent events.

The literature on gender and ethnic differences in mortality during the first 30 days or 1 year after an AMI event is larger than the literature on AMI incidence itself. Gender disparities in mortality have been studied for many countries using different kinds of data sets; for example, nationwide data on Finland (Kytö et al., 2015), Israel (Gottlieb et al., 2000), Scotland (MacIntyre et al., 2001), and England (Smolina et al., 2012) city-level data in Germany (Herman et al., 1997) and hospital-level data in Vietnam (Nguyen et al., 2014). Ethnic disparities in mortality have been studied for pertinent countries; for example, disparities between blacks and whites for the United States (Vaccarino et al., 2005), between people of European, Chinese and South Asian descent for Canada (Anand et al., 2000), and between Chinese, Malay and Indians for Singapore (Mak et al., 2003).

While we only consider mortality incidentally in this paper, it is interesting to note that the patterns in AMI incidence and AMI case fatality are not necessarily the same across gender, ethnicity and age. For example, Alderman et al. (2000) find that young black males in the United States have lower risk of AMI events than do young white males, but higher 30-day case fatality rates. Comparing US studies that focus on AMI incidence (e.g. Wang et al., 2012) with those on case fatality (e.g. Manhapra et al., 2004) reveals further instances where relatively low/high AMI incidence rates are associated with opposite high/low case fatality rates.

The statistical methods used in this literature include mean comparison, logistic regression, Kaplan–Meier estimation and Cox regression. These methods are appropriate for summarizing outcomes when there are no issues of missing data other than right censoring. For this reason, the literature has generally focused on outcomes that are fully observed during a relatively short period, say 30 days or 1 year, following an AMI event (e.g. Chang et al., 2006; Pokorney et al., 2012). A few small-scale follow-up studies have been able to track patients for several decades (e.g. Klein et al., 1992). Sometimes models of AMI risk allow for unobserved heterogeneity among families and hospitals. We have found only one study that has considered individual-level unobserved heterogeneity (Hougaard, 1986).

Our study contributes to the literature in several dimensions. We examine gender and ethnic disparities in AMI risk utilizing an event history approach. We analyse high-quality nationally representative data from New Zealand. We show that hazard models, in general, can be estimated using the MSL method, despite overwhelming left censoring. We use the estimated models to discuss and compare different aspects of risk including time dependence, dynamic effects of the AMI history and the role of unobserved individual heterogeneity. We also use the estimated models to examine synthetic lifetime cumulative outcomes.

3 DATA

Our primary data source is the National Minimum Data Set (NMDS) maintained by the Ministry of Health New Zealand. The NMDS includes administrative data on all hospital admissions in New Zealand (including both in-patients and day-patients). There are up to 20 diagnosis entries (for recording complications) for each admission. The original NMDS was created in 1993. The current format with 20 diagnosis entries was introduced in June 2002 (National Health Board Business Unit, 2011). Our study period is between 1 July 2002 and 30 June 2012. We merge the NMDS with death registrations. The latter include a (single) entry for the cause of death. These administrative data are well suited for studying the incidence of AMI events, because everyone who experiences an event usually present at an emergency department and are admitted to the hospital, or they die on the way to the hospital and so appear in the death data.

About 60% of New Zealanders appear in the administrative data during the study period. For example, there are 2.7 million people alive in the data in 2012 compared to 4.4 million estimated total population in 2012. We use the age distribution (by gender and ethnic group) in the census conducted on 6 March 2001 to add records for people with no hospital admission during the study period. As a result, the combined data set has the same age distribution in 2002 as the 2001 census. Migration in and out of New Zealand is substantial, but we expect that this is less of an issue for our analysis. Temporary emigration is common especially among young people, but they are less likely to suffer AMIs.

The combined data set includes date of birth (rounded to the 15th of the month for confidentiality), gender, ethnicity, date of hospital admission and diagnosis codes indicating AMI if admitted, and date of death and cause of death code if died. Gender is the biological sex reported. Ethnicity is self-identified. Date of admission is when patients are first seen by clinicians at a hospital. We use time of hospital admission or time of death as the timing of AMI events.

To define the study population, we restrict the combined data set as follows. First, we consider only ethnically European and Maori people. In the 2001 census, people of European and Maori descent constitute 83% and 14%. For the estimation sample, we further restrict the data to people over the age of 40 and under the age of 85. Estimating the risk of having the first AMI event requires us to specify an age when, according to the model, a person first becomes at risk. The age first at risk (AFAR) must be chosen such that relatively few events occur before this age (since they are ignored in the modelling) and such that relatively many events occur during the study period for the people who reach this age (since their experiences identify model parameters related to the first event). We set AFAR to 40 years in our main estimates, and consider AFAR at 35 years in the supplementary materials. There is no detailed information about the age distribution for those over 85 in the 2001 census and the sample sizes are small. It is therefore difficult to estimate risk for people over age 85.

Figure 1 illustrates general aspects of the data using a Lexis diagram with age on the horizontal and calendar time on the vertical axis. Each 45 degree line represents the life of a person. Symbols *, $† and^{\circ}$ indicate birth, death and right censoring. Dots represent AMI events. The data provide information about all life lived and all events between 1 July 2002 and 30 June 2012. The estimation sample considers the subset of life and events between ages 40 and 85. In the figure, the dashed part of the life lines represents time outside the estimation sample, while the solid part represents the time considered in the estimation. All persons born after June 1972 are excluded. Persons born between July 1962 and June 1972 contribute the months after they turn 40. For example, a person born in June 1972 contributes only the month of June 2012. Persons born between June 1962 and July 1927 contribute the full 10 years between July 2002 and June 2012, unless they die during this period. Persons who turn 85 years of age contribute the time between July 2002 and the month of their 85th birthday. Events that occur before 1 July 2002 are missing because of left censoring (e.g. d), events that occur after 30 June 2012 or after age 85 are missing because of right censoring (e.g. c, f) and events that occur before age 40 are ignored (e.g. a).

FIGURE 1

Lexis diagram with four individuals [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

Table 1 shows summary statistics for our study population and the estimation sample. By construction, the study population is essentially representative of the New Zealand population of Maoris and Europeans as of the census date, 6 March 2001. The first panel in Table 1 shows the study population by age on 1 July 2002. (The first row shows people born between 1 July 2002 and 30 June 2012 who were admitted to a hospital.) The cohorts aged 30–39 on 1 July 2002 are important, because after they turn 40 they provide the main identification of the risk of having a first event. This group is relatively large in all four subpopulations. Older cohorts are relatively smaller. Comparing the number of people aged 30–39 and 30–85 years reveals that 62–76% of the AMI histories in our estimation samples are left censored, so the problem is substantial.

TABLE 1

Open in new tab

Summary statistics for the study population

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	N	%	N	%	N	%	N	%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a	71,260	21.4	170,458	10.8	66,205	19.6	161,937	9.9
Age 0–29	164,365	49.3	565,370	36.0	162,936	48.2	554,708	33.8
Age 30–39	36,141	10.8	198,764	12.6	41,631	12.3	219,723	13.4
Age 40–49	29,054	8.7	206,996	13.2	32,262	9.5	217,124	13.2
Age 50–59	17,436	5.2	175,506	11.2	18,265	5.4	178,766	10.9
Age 60–69	10,060	3.0	121,447	7.7	10,723	3.2	125,954	7.7
Age 70–79	3,977	1.2	91,723	5.8	4,830	1.4	105,326	6.4
Age 80+	890	0.3	42,252	2.7	1,476	0.4	78,274	4.8
Total	333,183	100.0	1,572,516	100.0	338,328	100.0	1,641,812	100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None	271,496	99.9	933,964	99.9	270,661	99.9	936,216	99.9
1	236	0.0	583	0.0	107	0.0	140	0.0
2	24	0.0	35	0.0	3	0.0	10	0.0
3+	10	0.0	10	0.0	1	0.0	2	0.0
Total	271,766	100.0	934,592	100.0	270,772	100.0	936,368	100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None	91,347	94.4	772,325	94.4	104,422	96.5	859,072	97.1
1	4,518	4.7	38,322	4.7	3136	2.9	21,734	2.5
2	652	0.7	5527	0.7	497	0.5	3,239	0.4
3+	207	0.2	1793	0.2	201	0.2	1138	0.1
Total	96,724	100.0	817,967	100.0	108,256	100.0	885,183	100.0
Number of deaths between ages 40–85 by cause of death
AMIb	1,426	12.3	9,062	11.2	752	7.7	5,300	8.4
Other	10,179	87.7	71,818	88.8	9,019	92.3	57,551	91.6
Total	11,605	100.0	80,880	100.0	9,771	100.0	62,851	100.0

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	N	%	N	%	N	%	N	%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a	71,260	21.4	170,458	10.8	66,205	19.6	161,937	9.9
Age 0–29	164,365	49.3	565,370	36.0	162,936	48.2	554,708	33.8
Age 30–39	36,141	10.8	198,764	12.6	41,631	12.3	219,723	13.4
Age 40–49	29,054	8.7	206,996	13.2	32,262	9.5	217,124	13.2
Age 50–59	17,436	5.2	175,506	11.2	18,265	5.4	178,766	10.9
Age 60–69	10,060	3.0	121,447	7.7	10,723	3.2	125,954	7.7
Age 70–79	3,977	1.2	91,723	5.8	4,830	1.4	105,326	6.4
Age 80+	890	0.3	42,252	2.7	1,476	0.4	78,274	4.8
Total	333,183	100.0	1,572,516	100.0	338,328	100.0	1,641,812	100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None	271,496	99.9	933,964	99.9	270,661	99.9	936,216	99.9
1	236	0.0	583	0.0	107	0.0	140	0.0
2	24	0.0	35	0.0	3	0.0	10	0.0
3+	10	0.0	10	0.0	1	0.0	2	0.0
Total	271,766	100.0	934,592	100.0	270,772	100.0	936,368	100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None	91,347	94.4	772,325	94.4	104,422	96.5	859,072	97.1
1	4,518	4.7	38,322	4.7	3136	2.9	21,734	2.5
2	652	0.7	5527	0.7	497	0.5	3,239	0.4
3+	207	0.2	1793	0.2	201	0.2	1138	0.1
Total	96,724	100.0	817,967	100.0	108,256	100.0	885,183	100.0
Number of deaths between ages 40–85 by cause of death
AMIb	1,426	12.3	9,062	11.2	752	7.7	5,300	8.4
Other	10,179	87.7	71,818	88.8	9,019	92.3	57,551	91.6
Total	11,605	100.0	80,880	100.0	9,771	100.0	62,851	100.0

a

Excludes infants who do not survive 1 month.

b

Include persons whose official cause of death is acute myocardial infarction (AMI) and a few persons who died with AMI comorbidity on day of hospital admission.

TABLE 1

Open in new tab

Summary statistics for the study population

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	N	%	N	%	N	%	N	%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a	71,260	21.4	170,458	10.8	66,205	19.6	161,937	9.9
Age 0–29	164,365	49.3	565,370	36.0	162,936	48.2	554,708	33.8
Age 30–39	36,141	10.8	198,764	12.6	41,631	12.3	219,723	13.4
Age 40–49	29,054	8.7	206,996	13.2	32,262	9.5	217,124	13.2
Age 50–59	17,436	5.2	175,506	11.2	18,265	5.4	178,766	10.9
Age 60–69	10,060	3.0	121,447	7.7	10,723	3.2	125,954	7.7
Age 70–79	3,977	1.2	91,723	5.8	4,830	1.4	105,326	6.4
Age 80+	890	0.3	42,252	2.7	1,476	0.4	78,274	4.8
Total	333,183	100.0	1,572,516	100.0	338,328	100.0	1,641,812	100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None	271,496	99.9	933,964	99.9	270,661	99.9	936,216	99.9
1	236	0.0	583	0.0	107	0.0	140	0.0
2	24	0.0	35	0.0	3	0.0	10	0.0
3+	10	0.0	10	0.0	1	0.0	2	0.0
Total	271,766	100.0	934,592	100.0	270,772	100.0	936,368	100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None	91,347	94.4	772,325	94.4	104,422	96.5	859,072	97.1
1	4,518	4.7	38,322	4.7	3136	2.9	21,734	2.5
2	652	0.7	5527	0.7	497	0.5	3,239	0.4
3+	207	0.2	1793	0.2	201	0.2	1138	0.1
Total	96,724	100.0	817,967	100.0	108,256	100.0	885,183	100.0
Number of deaths between ages 40–85 by cause of death
AMIb	1,426	12.3	9,062	11.2	752	7.7	5,300	8.4
Other	10,179	87.7	71,818	88.8	9,019	92.3	57,551	91.6
Total	11,605	100.0	80,880	100.0	9,771	100.0	62,851	100.0

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	N	%	N	%	N	%	N	%
Number of people born March 1911–June 2012 by age in June 2002
After June 2002a	71,260	21.4	170,458	10.8	66,205	19.6	161,937	9.9
Age 0–29	164,365	49.3	565,370	36.0	162,936	48.2	554,708	33.8
Age 30–39	36,141	10.8	198,764	12.6	41,631	12.3	219,723	13.4
Age 40–49	29,054	8.7	206,996	13.2	32,262	9.5	217,124	13.2
Age 50–59	17,436	5.2	175,506	11.2	18,265	5.4	178,766	10.9
Age 60–69	10,060	3.0	121,447	7.7	10,723	3.2	125,954	7.7
Age 70–79	3,977	1.2	91,723	5.8	4,830	1.4	105,326	6.4
Age 80+	890	0.3	42,252	2.7	1,476	0.4	78,274	4.8
Total	333,183	100.0	1,572,516	100.0	338,328	100.0	1,641,812	100.0
Number of people born July 1962–June 2012 by observed AMIs when aged 0–40
None	271,496	99.9	933,964	99.9	270,661	99.9	936,216	99.9
1	236	0.0	583	0.0	107	0.0	140	0.0
2	24	0.0	35	0.0	3	0.0	10	0.0
3+	10	0.0	10	0.0	1	0.0	2	0.0
Total	271,766	100.0	934,592	100.0	270,772	100.0	936,368	100.0
Number of people born July 1917–June 1972 by observed AMIs when aged 40–85
None	91,347	94.4	772,325	94.4	104,422	96.5	859,072	97.1
1	4,518	4.7	38,322	4.7	3136	2.9	21,734	2.5
2	652	0.7	5527	0.7	497	0.5	3,239	0.4
3+	207	0.2	1793	0.2	201	0.2	1138	0.1
Total	96,724	100.0	817,967	100.0	108,256	100.0	885,183	100.0
Number of deaths between ages 40–85 by cause of death
AMIb	1,426	12.3	9,062	11.2	752	7.7	5,300	8.4
Other	10,179	87.7	71,818	88.8	9,019	92.3	57,551	91.6
Total	11,605	100.0	80,880	100.0	9,771	100.0	62,851	100.0

a

Excludes infants who do not survive 1 month.

b

Include persons whose official cause of death is acute myocardial infarction (AMI) and a few persons who died with AMI comorbidity on day of hospital admission.

The second panel in Table 1 shows the number of people aged 0–39 in June 2002 by their number of observed AMIs while aged 0–39 during the study period. As mentioned, since these life experiences are ignored in the estimation, it is important that there are relatively few AMI events before age 40. The third panel concerns our estimation sample. It shows the number of people aged 30–85 in June 2002 by their number of observed AMIs while aged 40–85 during the study period. (People born between July 1962 and June 1972 are included in both panels, provided they are alive at age 40.) Regrettably, AMI events are not rare for people over age 40.

Panel A in Figure 2 compares the age-specific AMI incidence rate across gender and ethnic groups. The incidence rate is computed as the ratio of the number of observed AMIs to the total time ‘at risk’ in 2-year age intervals, rescaled to rates per year. For ages under 80, male Maoris have the highest AMI incidence rates, female Maoris and male Europeans have similar intermediate rates, and female Europeans have the lowest rates. After age 80, the gender and ethnic disparities become less clear. Partly, this is because the number of people over 80 in the sample is small, so the estimates are noisy.

FIGURE 2

Incidence rates and cumulative hazard rates [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

Another look at age-specific risk is provided in panel B in Figure 2, which shows Nelson–Aalen estimates of the cumulative hazard rates. The cumulative hazard rate at a given age can be interpreted as the expected total number of events a representative person from the synthetic cohort has experienced by that age. The ranking of risk revealed in panel B is the same as in panel A, except the complications after age 80 are too small to upset the ranking of cumulative outcomes before age 80. We return to the synthetic cohorts in Section 5.2.

The last panel in Table 1 shows the number of people who died during the study period, about 6.8–11.9% of people in the estimation sample. About 7.7–12.3% of those deaths are caused by AMI. As mentioned, right censoring is less of an issue in this study, since we have data for most people until the end of the study period on 30 June 2012 and we observe the cause of death for those who die before that date.

4 MODEL AND ESTIMATION

4.1 Model specification and complete-data likelihood function

In this section, we discuss our model specification and estimation methods. A key risk factor is time. Time captures age effects on risk as bodies become older and the biology wears down, and time captures cohort effects on risk as different birth cohorts have been exposed to different environments, made different life style choices, and had access to different medical technologies. There may also be calendar time effects to the extent that, for example, the weather affects vulnerable people (e.g. Madrigano et al., 2013), but we expect them to be relatively less important. With data for only 10 calendar years, we do not attempt to estimate separate age and cohort effects. The data identify cohort effects by comparing people from different cohorts at the same age, which is only possible for cohorts up to 10 years apart. The oldest and the youngest birth cohorts in the estimation sample are 55 years apart. An attempt to separate age and cohort effects would therefore involve considerable extrapolation. Instead, we define analysis time as t = (age−40)/10 for age > 40 and interpret analysis time as representing age, cohort and calendar time effects combined. (Normalizing the time unit to a decade makes the scale of certain parameters more readable.)

It is well known that there are dynamic patterns in risk. Those having experienced an AMI event are more likely to experience subsequent events, and the risk is particularly high for some time immediately after that event. Therefore, we specify separate models for the first and subsequent AMI events, and the equation for subsequent AMIs is allowed to depend on the timing of the most recent event.

We estimate separate models for each gender and ethnic group, but for notational simplicity we suppress subscripts indicating the groups in the following. Each model consists of two equations. The first equation represents the hazard function, $h_{1}$ ⁠, of the first (actual as opposed to observed) AMI:

h_{1} (t | θ) = \exp (t α_{1} + μ_{1}),

(1)

where θ denotes the entire unknown parameter vector to be estimated. Parameter $α_{1}$ captures time dependence in risk, and parameter $μ_{1}$ captures the overall level of risk for first AMIs. The second equation represents the hazard function, $h_{2}$ ⁠, of subsequent AMIs:

h_{2} (t | t^{-}, θ) = \exp (t α_{2} + R e c e n t γ + μ_{2}) for t > t^{-},

(2)

where $t^{-}$ is the timing of the most recent AMI and the variable Recent is defined by $R e c e n t = 1 (t \leq t^{-} + τ)$ where the value of τ corresponds to 1 year (i.e. τ = 0.1). Parameter $α_{2}$ captures time dependence in risk, parameter γ indicates the dynamic effect of the most recent AMI, and parameter $μ_{2}$ embodies the overall level of risk of subsequent AMIs.

The Gompertz specifications embodied in Equation (1) and (2) assume that the hazard function progresses exponentially with time. The law of exponential progression is suitable for many common patterns in actuarial, biological and demographic applications (e.g. Wienke, 2010). It is also appropriate in the context of AMI risk at least until about age 85, as shown in Figure 2. We expect positive signs of $α_{1}$ and $α_{2}$ given that AMI risk is higher for older people. (Note that analysis time is not reset after an AMI event.)

We capture history dependence partly by distinguishing between $h_{1}$ and $h_{2}$ and partly by including the time-varying covariate Recent. The latter allows for elevated risk proportional to exp (γ) within 1 year following the most recent event. The cut-off between the two regimes for the risk of subsequent events is somewhat arbitrary but follows the literature (e.g. Vlaar et al., 2008). There is no sound basis for assuming an abrupt change in risk after 1 year, but this specification allows us to distinguish short-term and long-term risks in a simple way.

Since we estimate separate models for each group, effectively all parameters are interacted with gender and ethnicity. In particular, gender and ethnicity are not assumed to have a simple proportional effect on risk.

As mentioned, the main problem for estimating the models is left censoring. For people whose histories are left censored, we cannot tell whether the first observed AMI is the first experienced AMI or a subsequent AMI, nor do we know the value of Recent for the first 1 year of the observation period.

Left censoring and history dependence mean that the likelihood function for the observed data is analytically intractable. Therefore, we estimate the models using recent maximum simulated likelihood (MSL) methods (Lee & Gørgens, 2021). To discuss this method some additional notation is needed. Let $c_{i} = 0$ indicate that the history for individual i is not left censored, let $(0, b_{i 1}^{o})$ denote the analysis time of their observation period, and let $b_{i 1} = (b_{i 11}, \dots, b_{i 1 k_{i 1}})$ denote their event history, where $b_{i 1 k}$ is the analysis time when person i had event k and $k_{i 1}$ is their total number of events (possibly 0). The subscript 1 on these variables may seem redundant, but it simplifies the discussion later. Persons with $c_{i} = 0$ are under age 40 on 1 July 2002, and $b_{i 1}$ represents their event history from the date they turn 40 until 30 June 2012 or until the date of their death, whichever is earlier. In terms of Figure 1, the events in $b_{i 1}$ happen within the triangle bounded by the lines for age 40, for 30 June 2012, and for the person born in 1962. Let $c_{i} = 1$ indicate that the history for individual i is left censored, let $(b_{i 1}^{o}, b_{i 2}^{o})$ denote the analysis time of their observation period, and let $(b_{i 1}^{u}, b_{i 2}) = (b_{i 11}^{u}, \dots, b_{i 1 k_{i 1}}^{u}, b_{i 21}, \dots, b_{i 2 k_{i 2}})$ denote their event history, where $b_{i 2}$ is observed and $b_{i 1}^{u}$ is unobserved. Persons with $c_{i} = 1$ are those who are over age 40 on 1 July 2002, and $b_{i 2}$ represents their event history from 1 July 2002 until 30 June 2012 or until the date of death, while $b_{i 1}^{u}$ represents their AMIs from age 40 to 1 July 2002. In terms of Figure 1, events in $b_{i 2}$ happen within the trapezium bounded by the lines for 1 July 2002, for 30 June 2012, for the person born in 1962 and for age 85. Events in $b_{i 1}^{u}$ happen within the triangle bounded by the lines for age 40, for 1 July 2002, and (not indicated) for a person born in 1917.

Let $g_{1}$ denote density function of $b_{i 1}$ ⁠, and let $g_{2}$ denote density function of $b_{i 2}$ given $b_{i 1}^{u}$ ⁠. (The density for $b_{i 1}^{u}$ is the same as that for $b_{i 1}$ ⁠.) They can be derived from the hazard functions given in Equations (1) and (2). Let $H_{1} (t | t^{-}, θ) = \int_{t^{-}}^{t} h_{1} (y | θ) d y$ for $t > t^{-}$ denote the value of the cumulative hazard function from time $t^{-}$ until time t. Similarly, define $H_{2} (t | t^{-}, θ) = \int_{t^{-}}^{t} h_{2} (y | t^{-}, θ) d y$ for $t > t^{-}$ ⁠. Then the density $g_{1}$ of $b_{i 1}$ evaluated at $b_{1}$ when $k_{1} > 1$ is

\begin{matrix} g_{1} (b_{1} | θ) = & h_{1} (b_{11} | θ) \exp (- H_{1} (b_{11} | 0, θ)) \\ \times (\prod_{k = 2}^{k_{1}} h_{2} (b_{1 k} | b_{1, k - 1}, θ) \exp (- H_{2} (b_{1 k} | b_{1, k - 1}, θ))) \exp (- H_{2} (b_{1}^{o} | b_{1 k_{1}}, θ)) . \end{matrix}

(3)

If $k_{1} = 1$ ⁠, the product over k in the middle is void, and if $k_{1} = 0$ ⁠, then only the very last exponential term is present with $H_{1}$ replacing $H_{2}$ ⁠. When $k_{2} > 1$ ⁠, the conditional density $g_{2}$ of $b_{i 2}$ given $b_{i 1}^{u} = b_{1}$ evaluated at $b_{2}$ is

\begin{matrix} g_{2} (b_{2} | b_{1}, θ) = & {[h_{1} (b_{21} | θ) \exp (- H_{1} (b_{21} | b_{1}^{o}, θ))]}^{1 (k_{1} = 0)} \\ \times {[h_{2} (b_{21} | θ) \exp (- H_{2} (b_{21} | b_{1 k_{1}}, θ)) / \exp (- H_{2} (b_{20} | b_{1 k_{1}}, θ))]}^{1 (k_{1} > 0)} \\ \times (\prod_{k = 2}^{k_{2}} h_{2} (b_{2 k} | b_{2, k - 1}, θ) \exp (- H_{2} (b_{2 k} | b_{2, k - 1}, θ))) \exp (- H_{2} (b_{2}^{o} | b_{2 k_{2}}, θ)) . \end{matrix}

(4)

The modifications for individuals with other values of $k_{2}$ are relatively straightforward; see Lee and Gørgens (2021) for details.

With these definitions, the log-likelihood function for N observed histories can be written

L (θ) = \sum_{i = 1}^{N} [(1 - c_{i}) \ln g_{1} (b_{i 1} | θ) + c_{i} \ln \int_{supp (b_{1})} g_{2} (b_{i 2} | b_{1}, θ) g_{1} (b_{1} | θ) d b_{1}] .

(5)

The first term in the sum on the right-hand side of Equation (5) is the likelihood contribution if individual i is non-left censored. The second term is the likelihood contribution if individual i is left censored. Here the integral is over the unobserved history.

In Equation (5), we assume that right censoring and AMI events are independent, and we do not model the process of right censoring explicitly. As mentioned, most individuals are right censored because the study period ends on 30 June 2012, but a small number are right censored when they die before 30 June 2012. If mortality risk and AMI risk are not independent, then the likelihood function is misspecified (see Section 4.4 below). However, the misspecification bias is likely to be small since the death rate is small.

4.2 MSL estimation

There are no closed-form solutions to the integral in Equation (5), and analytical evaluation of the likelihood function is not possible. Lee and Gørgens (2021) show that L(θ) can be evaluated using simulation methods. They discuss two importance sampling simulation methods, unnormalized (ISU) and normalized (ISN). For the ISU method, the simulated log-likelihood function is

L (θ) \approx \sum_{i = 1}^{N} [(1 - c_{i}) \ln (g_{1} (b_{i 1} | θ)) + c_{i} \ln (\frac{1}{R} \sum_{r = 1}^{R} g_{2} (b_{i 2} | b_{i 1}^{r}, θ) \frac{g_{1} (b_{i 1}^{r} | θ)}{g_{1} (b_{i 1}^{r} | θ^{*})})],

(6)

where the $b_{i 1}^{r}$ s are simulated pseudo-event histories. The idea of importance sampling is to draw $b_{i 1}^{r}$ from $g_{1} (\cdot | θ^{*})$ using a fixed $θ^{*}$ instead of drawing from $g_{1} (\cdot | θ)$ using the θ at which the likelihood function is evaluated, and correct the ‘mismatch’ through the adjustment factor $g_{1} (b_{i 1}^{r} | θ) / g_{1} (b_{i 1}^{r} | θ^{*})$ ⁠. One of the advantages of using importance sampling is that the simulated likelihood function is continuous in θ, so gradient-based algorithms can be used to find the maximum. Since event timings depend on prior history, it is not possible to draw an entire history $b_{i 1}^{r}$ from $g_{1} (\cdot | θ^{*})$ in a single step. Instead, it is necessary to draw the individual pseudo-event timings sequentially (i.e. first $b_{i 11}^{r}$ ⁠, then $b_{i 12}^{r}$ conditional on $b_{i 11}^{r}$ etc., stopping when the draw falls beyond the unobserved period); see Lee and Gørgens (2021) for details.

For the ISN method, the simulated log-likelihood function is essentially the same; the only difference is that the adjustment factors are normalized so that they sum to R. That is, $g_{1} (b_{i 1}^{r} | θ) / g_{1} (b_{i 1}^{r} | θ^{*})$ in Equation (6) is replaced by

\frac{g_{1} (b_{i 1}^{r} | θ)}{g_{1} (b_{i 1}^{r} | θ^{*})} /\frac{1}{R} \sum_{r = 1}^{R} \frac{g_{1} (b_{i 1}^{r} | θ)}{g_{1} (b_{i 1}^{r} | θ^{*})} .

(7)

Since ISU and ISN are just different ways of approximating the exact likelihood function in Equation (5), both methods should provide similar results. However, there may be some differences in practice.

In general, the MSL estimation method is computationally burdensome, because a large number of draws is required in order to obtain a satisfactory approximation to the exact likelihood function for the observed data. In the present application, however, many people have the same unobserved periods and can share the same $b_{i 1}^{r}$ draws. Since birthdays are rounded at the monthly level, the total number of different unobserved periods is relatively small. Drawing $b_{i 1}^{r}$ for each of about 540 distinct unobserved periods rather than for each of the 96,724 to 885,183 individuals in the estimation samples (see Table 1) means that it is computationally feasible to set R very large.

There is little formal advice in the literature on how to choose $θ^{*}$ and R in general. In some simple settings, it is possible to improve computational efficiency greatly by choosing $θ^{*}$ such that rare outcomes occur either more or less often in the simulations than in the data (e.g. Hesterberg, 1995). However, it is difficult to relate these specific results to the present context. The informal advice is to choose R large enough that further increases have no practical effect on the estimates. We report our choices of $θ^{*}$ and R in the empirical section and provide sensitivity analysis in the supplementary materials.

As far as we are aware, there is no clear theoretical basis for choosing between the ISU and ISN estimators. Hesterberg (1995) simulates the properties of several normalized and unnormalized importance sampling estimators in a range of simple settings and found no estimator to be uniformly best. Lee and Gørgens (2021) carry out Monte Carlo experiments for designs closely related to the present application and find that the ISU and ISN estimates have similar distributional properties. For simplicity, we present only ISN estimates in this paper. In the supplementary materials, we benchmark the ISN likelihood functions against simulated likelihood functions that do not depend on importance sampling parameters and verify that the location of the maxima are not unduly influenced by the choice of $θ^{*}$ ⁠.

4.3 MSL standard errors

The MSL estimates are computed from a simulated likelihood function, and they are therefore affected by simulation noise in addition to the usual sampling noise. The simulation error in the likelihood function, and hence in the MSL estimates, is a decreasing function of R. The estimates presented in Section 5 are computed with R very high (R = 1000), and the contribution of simulation noise to the standard errors is therefore negligible. Accordingly, the reported standard errors are simply computed from the outer product of the simulated score functions without adjustments.

To show that the simulation noise is negligible, we report on four Monte Carlo experiments in Table 2. The data-generating processes (DGPs) are based on the empirical estimates presented later in the paper. Specifically, the sample sizes and the distribution of birthdays are the same as in each of the four subpopulations, and the DGP parameters are the estimated coefficients presented in Table 3 below (with AFAR = 40 and τ = 0.1). For simplicity, we set $θ^{*}$ equal to the DGP parameters.

TABLE 2

Open in new tab

Monte Carlo experiments

		Estimator		Coverage probability^a
	DGP	Bias	RSME	90%	95%	99%
Male Maoris (N = 96724)
$α_{1}$	0.543	0.001	0.016	0.916	0.953	0.985
$μ_{1}$	−3.700	0.000	0.030	0.907	0.963	0.990
$α_{2}$	0.171	0.001	0.020	0.912	0.958	0.992
γ	1.026	0.002	0.045	0.897	0.957	0.986
$μ_{2}$	−0.995	−0.005	0.056	0.917	0.965	0.991
Male Europeans (N = 817967)
$α_{1}$	0.654	0.001	0.006	0.896	0.937	0.992
$μ_{1}$	−4.352	−0.001	0.014	0.905	0.952	0.991
$α_{2}$	0.549	0.000	0.009	0.895	0.961	0.992
γ	1.132	0.001	0.017	0.911	0.947	0.989
$μ_{2}$	−2.462	−0.001	0.034	0.890	0.945	0.992
Female Maoris (N = 108256)
$α_{1}$	0.720	0.001	0.017	0.892	0.944	0.990
$μ_{1}$	−4.576	0.000	0.039	0.900	0.950	0.985
$α_{2}$	0.214	0.001	0.022	0.906	0.956	0.990
γ	1.076	0.003	0.053	0.879	0.934	0.983
$μ_{2}$	−1.024	−0.007	0.073	0.904	0.945	0.990
Female Europeans (N = 885183)
$α_{1}$	1.011	0.001	0.008	0.887	0.948	0.986
$μ_{1}$	−6.114	−0.003	0.024	0.896	0.944	0.987
$α_{2}$	0.568	0.001	0.014	0.901	0.950	0.992
γ	1.082	0.001	0.021	0.904	0.948	0.992
$μ_{2}$	−2.556	−0.008	0.056	0.908	0.949	0.990

		Estimator		Coverage probability^a
	DGP	Bias	RSME	90%	95%	99%
Male Maoris (N = 96724)
$α_{1}$	0.543	0.001	0.016	0.916	0.953	0.985
$μ_{1}$	−3.700	0.000	0.030	0.907	0.963	0.990
$α_{2}$	0.171	0.001	0.020	0.912	0.958	0.992
γ	1.026	0.002	0.045	0.897	0.957	0.986
$μ_{2}$	−0.995	−0.005	0.056	0.917	0.965	0.991
Male Europeans (N = 817967)
$α_{1}$	0.654	0.001	0.006	0.896	0.937	0.992
$μ_{1}$	−4.352	−0.001	0.014	0.905	0.952	0.991
$α_{2}$	0.549	0.000	0.009	0.895	0.961	0.992
γ	1.132	0.001	0.017	0.911	0.947	0.989
$μ_{2}$	−2.462	−0.001	0.034	0.890	0.945	0.992
Female Maoris (N = 108256)
$α_{1}$	0.720	0.001	0.017	0.892	0.944	0.990
$μ_{1}$	−4.576	0.000	0.039	0.900	0.950	0.985
$α_{2}$	0.214	0.001	0.022	0.906	0.956	0.990
γ	1.076	0.003	0.053	0.879	0.934	0.983
$μ_{2}$	−1.024	−0.007	0.073	0.904	0.945	0.990
Female Europeans (N = 885183)
$α_{1}$	1.011	0.001	0.008	0.887	0.948	0.986
$μ_{1}$	−6.114	−0.003	0.024	0.896	0.944	0.987
$α_{2}$	0.568	0.001	0.014	0.901	0.950	0.992
γ	1.082	0.001	0.021	0.904	0.948	0.992
$μ_{2}$	−2.556	−0.008	0.056	0.908	0.949	0.990

a

Actual coverage rates for confidence intervals with nominal rates 90%, 95%, 99%. The DGP parameters are based on the estimates in Table 3 and AFAR = 40, τ = 0.1, $θ^{*} = DGP$ ⁠, R = 1000. Each experiment has 1000 replications, except convergence failed in 2 cases for male Europeans and 2 cases for female Europeans.

TABLE 2

Open in new tab

Monte Carlo experiments

		Estimator		Coverage probability^a
	DGP	Bias	RSME	90%	95%	99%
Male Maoris (N = 96724)
$α_{1}$	0.543	0.001	0.016	0.916	0.953	0.985
$μ_{1}$	−3.700	0.000	0.030	0.907	0.963	0.990
$α_{2}$	0.171	0.001	0.020	0.912	0.958	0.992
γ	1.026	0.002	0.045	0.897	0.957	0.986
$μ_{2}$	−0.995	−0.005	0.056	0.917	0.965	0.991
Male Europeans (N = 817967)
$α_{1}$	0.654	0.001	0.006	0.896	0.937	0.992
$μ_{1}$	−4.352	−0.001	0.014	0.905	0.952	0.991
$α_{2}$	0.549	0.000	0.009	0.895	0.961	0.992
γ	1.132	0.001	0.017	0.911	0.947	0.989
$μ_{2}$	−2.462	−0.001	0.034	0.890	0.945	0.992
Female Maoris (N = 108256)
$α_{1}$	0.720	0.001	0.017	0.892	0.944	0.990
$μ_{1}$	−4.576	0.000	0.039	0.900	0.950	0.985
$α_{2}$	0.214	0.001	0.022	0.906	0.956	0.990
γ	1.076	0.003	0.053	0.879	0.934	0.983
$μ_{2}$	−1.024	−0.007	0.073	0.904	0.945	0.990
Female Europeans (N = 885183)
$α_{1}$	1.011	0.001	0.008	0.887	0.948	0.986
$μ_{1}$	−6.114	−0.003	0.024	0.896	0.944	0.987
$α_{2}$	0.568	0.001	0.014	0.901	0.950	0.992
γ	1.082	0.001	0.021	0.904	0.948	0.992
$μ_{2}$	−2.556	−0.008	0.056	0.908	0.949	0.990

		Estimator		Coverage probability^a
	DGP	Bias	RSME	90%	95%	99%
Male Maoris (N = 96724)
$α_{1}$	0.543	0.001	0.016	0.916	0.953	0.985
$μ_{1}$	−3.700	0.000	0.030	0.907	0.963	0.990
$α_{2}$	0.171	0.001	0.020	0.912	0.958	0.992
γ	1.026	0.002	0.045	0.897	0.957	0.986
$μ_{2}$	−0.995	−0.005	0.056	0.917	0.965	0.991
Male Europeans (N = 817967)
$α_{1}$	0.654	0.001	0.006	0.896	0.937	0.992
$μ_{1}$	−4.352	−0.001	0.014	0.905	0.952	0.991
$α_{2}$	0.549	0.000	0.009	0.895	0.961	0.992
γ	1.132	0.001	0.017	0.911	0.947	0.989
$μ_{2}$	−2.462	−0.001	0.034	0.890	0.945	0.992
Female Maoris (N = 108256)
$α_{1}$	0.720	0.001	0.017	0.892	0.944	0.990
$μ_{1}$	−4.576	0.000	0.039	0.900	0.950	0.985
$α_{2}$	0.214	0.001	0.022	0.906	0.956	0.990
γ	1.076	0.003	0.053	0.879	0.934	0.983
$μ_{2}$	−1.024	−0.007	0.073	0.904	0.945	0.990
Female Europeans (N = 885183)
$α_{1}$	1.011	0.001	0.008	0.887	0.948	0.986
$μ_{1}$	−6.114	−0.003	0.024	0.896	0.944	0.987
$α_{2}$	0.568	0.001	0.014	0.901	0.950	0.992
γ	1.082	0.001	0.021	0.904	0.948	0.992
$μ_{2}$	−2.556	−0.008	0.056	0.908	0.949	0.990

a

Actual coverage rates for confidence intervals with nominal rates 90%, 95%, 99%. The DGP parameters are based on the estimates in Table 3 and AFAR = 40, τ = 0.1, $θ^{*} = DGP$ ⁠, R = 1000. Each experiment has 1000 replications, except convergence failed in 2 cases for male Europeans and 2 cases for female Europeans.

Table 2 shows that the MSL estimators are nearly unbiased and the root mean square errors (RMSEs) are small, which is not surprising given the large sample sizes. Crucially, the empirical coverage probabilities of the confidence intervals are very close to the nominal targets. Thus simulation noise is not an important issue here.

In other applications, the influence of simulation noise can be examined in Monte Carlo studies like the ones we have presented here. If simulation noise is found to be an issue, perhaps the simplest solution is to increase R until the influence is deemed acceptable. This can be implemented as an asymptotically efficient two-step estimator, where R need only be huge in the second step (Hajivassiliou, 2000; Kristensen & Salanié, 2017).

4.4 Extension: Non-independent mortality

As mentioned, in our main analysis we assume that right censoring due to death and AMIs events is independent, and we do not model mortality. This is a reasonable simplification, given that the death rates are low. In principle, however, AMI-related mortality could affect the distribution of pseudo-histories for left-censored observations if people with more events are less likely to survive. To investigate whether our main results are robust to the simplifying assumption, we consider extended models that allow for AMI-related mortality. The extended models distinguish between deaths that are directly caused by an AMI event and other deaths. We refer to these deaths as AMI related and AMI unrelated. We maintain the assumption that AMI events and AMI-unrelated deaths (and other causes of right censoring) are independent.

Let $d_{i}$ be an indicator variable with $d_{i} = 1$ for an AMI-related death, and $d_{i} = 0$ for an AMI-unrelated death and for right censoring without death. For simplicity, we allow AMI-related mortality risk to depend on time but assume it does not depend on AMI history prior to the event that caused death. Let f denote the conditional probability of dying given that an AMI event occurs:

f (t, ζ) = {(1 + \exp (- ζ_{0} - ζ_{1} t - ζ_{2} t^{2}))}^{- 1},

(8)

where t is the time of the event and $ζ = {(ζ_{0}, ζ_{1}, ζ_{2})}^{'}$ ⁠.

Let $b_{j}$ be a history with $k_{j}$ events, either $b_{1}$ or $b_{2}$ ⁠. The probability m evaluated at $(b_{j}, d)$ of surviving $k_{j} - 1$ AMI events and surviving (d ≠ 1) or not surviving (d = 1) the $k_{j}$ th event is

m (b_{j}, d | ζ) = (\prod_{k = 1}^{k_{j} - 1} [1 - f (b_{jk}, ζ)]) {[1 - f (b_{j k_{j}}, ζ)]}^{1 (d \neq 1)} f {(b_{j k_{j}}, ζ)}^{1 (d = 1)} .

(9)

The log-likelihood function for the sample of people alive on 1 July 2002 can now be written as

\begin{matrix} \tilde{L} (θ, ζ) & = \sum_{i = 1}^{N} [(1 - c_{i}) \ln [g_{1} (b_{i 1} | θ) m (b_{i 1}, d_{i} | ζ)] \\ + c_{i} \ln \frac{\int_{supp (b_{1})} g_{2} (b_{i 2} | b_{1}, θ) m (b_{i 2}, d_{i} | ζ) g_{1} (b_{1} | θ) m (b_{1}, 0 | ζ) d b_{1}}{\int_{supp (b_{1}^{'})} g_{1} (b_{1}^{'} | θ) m (b_{1}^{'}, 0 | ζ) s d b_{1}^{'}}], \end{matrix}

(10)

where the prime on $b_{1}^{'}$ is simply to distinguish it from the other integration dummy $b_{1}$ ⁠. For non-left-censored histories, the likelihood contribution is the density of $(b_{i 1}, d_{i})$ ⁠. For left-censored histories, the likelihood contribution is the conditional density of $(b_{i 2}, d_{i})$ given survival until the start of the observation period $b_{1}^{o}$ (i.e. the analysis time between AFAR and 1 July 2002). In particular, the integral in the denominator is the probability of surviving all AMI events until $b_{1}^{o}$ ⁠.

If AMI events and AMI-related mortality are independent, then $m (b_{j}, d | ζ)$ does not depend on $b_{j}$ and the likelihood function becomes separable in θ and ζ. Therefore, Equation (5) is a special case of Equation (10) where independence is assumed and the terms involving ζ are omitted.

To evaluate the log-likelihood function in Equation (10), the unobserved histories can be integrated out using importance sampling simulation methods as before. For the ISU method, the simulated log-likelihood function is

\begin{matrix} \tilde{L} (θ) & \approx \sum_{i = 1}^{N} [(1 - c_{i}) \ln [g_{1} (b_{i 1} | θ) m (b_{i 1}, d_{i} | θ)] \\ + c_{i} \ln (\frac{1}{\sum_{r = 1}^{R_{i}} 1 (d_{i}^{r} = 0)} \{\sum_{r = 1}^{R_{i}} g_{2} (b_{i 2} | b_{i 1}^{r}, θ) m (b_{i 2}, d_{i} | θ) \frac{1 (d_{i}^{r} = 0) w (b_{i 1}^{r}, d_{i}^{r} | θ)}{w (b_{i 1}^{r}, d_{i}^{r} | θ^{*})}\})] \end{matrix}

(11)

with

w (b_{1}, d | θ) = \frac{g_{1} (b_{1} | θ) m (b_{1}, d | θ)}{\int_{supp (b_{1}^{'})} g_{1} (b_{1}^{'} | θ) m (b_{1}^{'}, d | θ) d b_{1}^{'}},

(12)

where the $(b_{i 1}^{r}, d_{i}^{r})$ s are simulated pseudo-event histories. We increase the number of pseudo-histories for each i such that $\sum_{r = 1}^{R_{i}} 1 (d_{i}^{r} = 0) = R$ ⁠. The ISN estimator is constructed by normalizing the ratio term in Equation (11) to sum to R across r, similar to Equation (7). The ISN estimator is simpler to compute than the ISU estimator, because the denominator in Equation (12) does not depend on r, so this term drops out of the normalized importance sampling adjustment factors and does not have to be computed.

4.5 Extension: Unobserved heterogeneity

Heterogeneity in risk can be considerable. Unfortunately, we do not have other risk markers in our data, be they biological, socioeconomic or behavioural factors. (See Lee and Gørgens, 2021, for a brief discussion of how to incorporate observed heterogeneity.) To investigate the role of unobserved individual heterogeneity, we estimate models with so-called random effects (frailty). Specifically, we include an unobserved random variable v ∼ N(0, 1) in the models.

To keep the notation simple, we reuse the names $h_{1}$ ⁠, $h_{2}$ ⁠, $g_{1}$ ⁠, $g_{2}$ and L for analogous functions that depend on the random effect. We include the random effect linearly in the hazard functions as follows:

h_{1} (t | v, θ) = \exp (t α_{1} + μ_{1} + v σ_{1}),

(13)

h_{2} (t | t^{-}, v, θ) = \exp (t α_{2} + R e c e n t γ + μ_{2} + v σ_{2}) for t > t^{-},

(14)

where the new parameters $σ_{1}$ and $σ_{2}$ represent the influence of the random effect and where θ has been augmented to include $σ_{1}$ and $σ_{2}$ ⁠. For identification, we normalize $σ_{1} \geq 0$ and $σ_{2} \geq 0$ ⁠.

To state the likelihood function, redefine the density $g_{1}$ of $b_{i 1}$ in Equation (3) and the conditional density $g_{2}$ of $b_{i 2}$ given $b_{i 1}^{u} = b_{1}$ in Equation (4) to depend on the random effect via the new $h_{1}$ and $h_{2}$ ⁠. Let Φ denote the standard normal cumulative distribution function. Assuming right censoring is independent of AMI events and of random effects, the log-likelihood function for N observed histories can be written

\begin{matrix} L (θ) = & \sum_{i = 1}^{N} [(1 - c_{i}) \ln \int_{R} g_{1} (b_{i 1} | v, θ) d Φ (v) \\ + c_{i} \ln \int_{R} \int_{supp (b_{1})} g_{2} (b_{i 2} | b_{1}, v, θ) g_{1} (b_{1} | v, θ) d b_{1} d Φ (v)] . \end{matrix}

(15)

This is similar to Equation (5), except here the random effect v is integrated out for each history in the sample. In fact, Equation (5) is the special case of Equation (15) where $σ_{1} = 0$ and $σ_{2} = 0$ ⁠.

To evaluate the log-likelihood function in Equation (15), the integrals over the random effects are one-dimensional and can be handled by, for example, Gaussian quadrature, while the integral over the unobserved history can be handled using importance sampling simulation methods as before. For the ISU method, the simulated log-likelihood function is

\begin{matrix} L (θ) & \approx \sum_{i = 1}^{N} [(1 - c_{i}) \ln (\sum_{q = 1}^{Q} w_{q} g_{1} (b_{i 1} | p_{q}, θ)) \\ + c_{i} \ln (\sum_{q = 1}^{Q} w_{q} \frac{1}{R} \{\sum_{r = 1}^{R} g_{2} (b_{i 2} | b_{i 1}^{qr}, p_{q}, θ) \frac{g_{1} (b_{i 1}^{qr} | p_{q}, θ)}{g_{1} (b_{i 1}^{qr} | p_{q}, θ^{*})}\})], \end{matrix}

(16)

where the $p_{q}$ s are Gauss–Hermite quadrature points and the $w_{q}$ s are the corresponding weights, and the $b_{i 1}^{qr}$ s are simulated pseudo-event histories conditional on $v = p_{q}$ ⁠. The $b_{i 1}^{qr}$ s are created as before, but separately for each $p_{q}$ value. The ISN estimator is constructed by normalizing the ratio term in Equation (16) to sum to R across r, as in Equation (7).

4.6 Alternative estimators

As mentioned in Section 1, common practice is simply to drop all left-censored histories from the analysis and base estimation on the subset of observations with $c_{i} = 0$ ⁠. This non-left-censored (NLC) estimator leads to valid inference as long as censoring is independent and there are enough non-left-censored observations.

Another Heckman (HKM) estimator is based on Heckman’s (1981) approach to a related ‘initial conditions’ problem. By plugging in for $g_{1}$ and $g_{2}$ ⁠, the likelihood function in Equation (5) can be broken into terms involving one of $h_{1} (t | θ)$ ⁠, $H_{1} (t | t^{-}, θ)$ ⁠, $h_{2} (t | t^{-}, θ)$ ⁠, or $H_{2} (t | t^{-}, θ)$ for different values of t and $t^{-}$ ⁠. Suppose we classify such terms of the likelihood function according to whether they are ‘computable’ or ‘uncomputable’, where uncomputable terms are those that depend on events that might have happened before 1 July 2002. Specifically, for left-censored histories we do not know whether the first observed event is the individual’s first actual event or a subsequent event, so we do not know whether $(h_{1}, H_{1})$ or $(h_{2}, H_{2})$ applies. Also, if the first observed event is within the first 1 year of the observation period, we do not know whether the person has had a recent event, so we do not know the value of Recent.

Adapting Heckman’s idea means specifying an auxiliary model for the uncomputable terms that does not require unobserved information. The auxiliary model can use whatever information is available about the person at time t. A simple specification is

h_{3} (t | ξ) = \exp (t α_{3} + μ_{3}),

(17)

where $ξ = (α_{3}, μ_{3})$ ⁠. Let $H_{3} (t | t^{-}, ξ) = \int_{t^{-}}^{t} h_{3} (y | ξ) d y$ denote the value of the cumulative hazard function. The likelihood function is constructed by substituting $h_{3} (t | ξ)$ or $H_{3} (t | ξ)$ for all uncomputable terms and incorporating ξ into θ. The hope is that the resulting likelihood function is a good approximation to the true likelihood function in the dimensions of the parameters of interest.

In their Monte Carlo simulations, Lee and Gørgens (2021) find that the RMSEs can be several times higher for the NLC and HKM estimators compared to the ISN estimator in experiments where the degree of censoring is high and events are relatively rare. In particular, the NLC estimator tends to provide unreliable estimates of parameters related to the risk subsequent events, although the HKM estimator can be in the ballpark even if the RMSEs are high. (Neither the NLC nor the HKM estimator provide good estimates for models with random effects, with the estimation algorithm often failing to converge within a generous time limit.)

We provide NLC and HKM estimates for our models in the supplementary materials for comparison. As expected, they have much larger standard errors than do the ISN estimates and they tend to fit the data poorly. The HKM estimates are used as the importance sampling parameter $θ^{*}$ for our main ISN estimates. We verify in robustness analysis that this works well.

5 RESULTS

5.1 Time-specific risk

We begin by discussing estimates using basic models that ignore AMI-related mortality and unobserved heterogeneity. These models fit the overall patterns in the data very well, as shown below and in the supplementary materials. For the ISN estimation, we use HKM estimates as the importance sampling parameter $θ^{*}$ and take R = 1000. When computing ISN estimates, we generally try at least two different starting values for the optimization algorithm (e.g. $θ^{*} \pm 0.2$ ⁠) in order to help discover multiple local maxima.

Table 3 reports ISN parameter estimates. Recall that $μ_{1}$ captures the levels of AMI risk at age 40 for the 1962–1972 birth cohort. The estimates of $μ_{1}$ in Table 3 are small (large negative), reflecting the low but not quite zero incidence rates around age 40 (c.f. Figure 2). The estimates of $μ_{1}$ are not the same, however, and imply that male Maoris have the highest risk of experiencing the first event, followed by male Europeans and female Maoris whose risk is similar, while female Europeans have the lowest risk. Recall that $α_{1}$ captures changes in the risk of experiencing the first AMI event across time, which here represent the effects of age as well as cohort and calendar time effects. The estimates of $α_{1}$ in Table 3 are all positive, implying that the risk of having the first AMI event increases with time. The risk increase is lowest for male Maoris, followed by male Europeans and female Maoris, while female Europeans have the steepest profile. Since the estimate of $μ_{1}$ is smallest and the estimate of $α_{1}$ is largest for female Europeans, it is possible that the risk gap between female Europeans and the other groups vanishes for older people. However, as shown graphically below, the differences in the estimates of $μ_{1}$ are too large and the differences in $α_{1}$ are too small for this to happen within ordinary human lifetimes.

TABLE 3

Open in new tab

Parameter estimates

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.54	0.65	0.72	1.01	272.24
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.35	−4.58	−6.11	1445.42
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.17	0.55	0.22	0.57	564.75
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.08	1.08	4.11
	(0.06)	(0.02)	(0.06)	(0.02)	[0.25]
$μ_{2}$	−1.00	−2.46	−1.03	−2.55	1012.44
	(0.06)	(0.03)	(0.07)	(0.05)	[0.00]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.54	0.65	0.72	1.01	272.24
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.35	−4.58	−6.11	1445.42
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.17	0.55	0.22	0.57	564.75
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.08	1.08	4.11
	(0.06)	(0.02)	(0.06)	(0.02)	[0.25]
$μ_{2}$	−1.00	−2.46	−1.03	−2.55	1012.44
	(0.06)	(0.03)	(0.07)	(0.05)	[0.00]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

TABLE 3

Open in new tab

Parameter estimates

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.54	0.65	0.72	1.01	272.24
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.35	−4.58	−6.11	1445.42
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.17	0.55	0.22	0.57	564.75
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.08	1.08	4.11
	(0.06)	(0.02)	(0.06)	(0.02)	[0.25]
$μ_{2}$	−1.00	−2.46	−1.03	−2.55	1012.44
	(0.06)	(0.03)	(0.07)	(0.05)	[0.00]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.54	0.65	0.72	1.01	272.24
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.35	−4.58	−6.11	1445.42
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.17	0.55	0.22	0.57	564.75
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.08	1.08	4.11
	(0.06)	(0.02)	(0.06)	(0.02)	[0.25]
$μ_{2}$	−1.00	−2.46	−1.03	−2.55	1012.44
	(0.06)	(0.03)	(0.07)	(0.05)	[0.00]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

The estimates of γ are similar for all four groups and suggest that the risk during the first year after an event is nearly three times as large as the risk more than a year after the event (exp(1) ≈ 2.7). The parameter $μ_{2}$ essentially reflects the risk of a subsequent event at age 40 for people who had an event at or before age 38 (i.e. >1 year before turning 40). Although this is out of sample, note that the estimates tend to be higher than the estimates for $μ_{1}$ ⁠. The estimates of $α_{2}$ indicate that the risk for Europeans increases with time much faster than the risk for Maoris.

Table 3 also reports Wald test statistics for the joint null hypotheses that the respective parameters are the same across all four groups. For example, the Wald statistic for testing equality of $α_{1}$ across the four groups is

\frac{{({\hat{α}}_{1}^{MM} - {\hat{α}}_{1}^{ME})}^{2}}{\hat{Var} ({\hat{α}}_{1}^{MM}) + \hat{Var} ({\hat{α}}_{1}^{ME})} + \frac{{({\hat{α}}_{1}^{ME} - {\hat{α}}_{1}^{FM})}^{2}}{\hat{Var} ({\hat{α}}_{1}^{ME}) + \hat{Var} ({\hat{α}}_{1}^{FM})} + \frac{{({\hat{α}}_{1}^{FM} - {\hat{α}}_{1}^{FE})}^{2}}{\hat{Var} ({\hat{α}}_{1}^{FM}) + \hat{Var} ({\hat{α}}_{1}^{FE})},

where ${\hat{α}}_{1}^{MM}$ is the estimate for male Maoris etc. By standard maximum likelihood theory, the asymptotic distribution is $χ^{2} (3)$ ⁠. As the table shows, the null is rejected for $α_{1}$ ⁠, $α_{2}$ ⁠, $μ_{1}$ and $μ_{2}$ ⁠, but not for γ.

Since the model is non-linear and it is difficult to interpret some of the parameters, we translate the estimates in Table 3 into hazard rates. Figure 3 shows the estimated hazards for the four groups. Panel A shows the hazards of experiencing the first AMI event. Male Maoris have the highest risk, followed by male Europeans and female Maoris who have similar intermediate-level risk, while female Europeans have the lowest risk. Panel B shows the hazard of experiencing an event within 1 year of another, while panels C and C’ show the hazard of subsequent events after 1 year. (The only difference between C and C’ is the scale of the vertical axis.) It is clear that the risk of subsequent events is much larger than the risk of the first event, especially within 1 year of a previous event. This pattern suggests that either having an event makes the body more susceptible to further events (causal effect) or that having an event is a marker for being a high-risk person (unobserved heterogeneity), or both. Note that male Europeans and female Maoris do not have similar risk of subsequent events. Instead, the risk of subsequent events is similar for male and female Maoris on the one hand and male and female Europeans on the other hand.

FIGURE 3

Hazard functions (per 10 years) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

A different view is provided in Figure 4, which shows the risk of subsequent events relative to the risk of the first event for the four groups. In our flexible two-equation systems, the effect of having the first event is not restricted to be proportional to a given baseline risk. The relative change in risk before and after the first event depends on time and history:

\frac{h_{2} (t | t^{-}, θ)}{h_{1} (t | θ)} = \exp (t {α_{2} - α_{1}} + R e c e n t γ + {μ_{2} - μ_{1}}) .

(18)

Figure 4 shows that the relative risk of subsequent events is extremely large, but falls steeply with time. This is especially true for younger females, whose relative risk is literally off the scale. The relative risk is also large for male Maoris, and falls with time at a slower rate. By comparison, relative risk appears much flatter for male Europeans. The large values for younger people arise because the risk of having the first event is very low, so the denominator is small (⁠ $μ_{1} < μ_{2}$ ⁠). The effect of time is negative, because the estimates imply $α_{2} < α_{1}$ ⁠. A formal Wald tests of the proportional hazards restriction $α_{2} = α_{1}$ is rejected for all groups (p = 0.00). Clearly, the risk of the first and subsequent event should not be modelled in a single equation with proportional hazards.

FIGURE 4

Ratio of hazard functions for subsequent acute myocardial infarctions (AMIs) over first AMI [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

5.2 Lifetime cumulative risk

So far we have compared outcomes across gender and ethnic groups in terms of time-specific risk. The estimated models allow us to simulate and compare synthetic lifetime cumulative outcomes.

Such an exercise raises two questions. First, the model is estimated on a cross-section and may not be representative of any particular birth cohort. As mentioned, there may have been significant changes in both life styles that affect risk and in the medical know-how in treating symptoms and preventing disease. However, as in other areas of demographic and epidemiological analysis, synthetic cohort analysis provides a useful summary of outcomes during a given period of time. Besides, we are comparing four groups which have lived in similar environments and experienced similar changes.

The second question is what to do about differences in mortality across the four groups. In reality, people do not live forever and mortality is likely to be related to both a person’s innate frailty and to their previous medical history. As discussed earlier, it is possible to augment the basic model with equations representing mortality risk. However, here we proceed by assuming no one dies until age 85, because this provides a better foundation for isolating and comparing the risk of AMI events. Were we to implement group-specific non-independent mortality, the internal composition of the four groups in terms of high-risk and low-risk people would vary over time at differential rates. As a result, the predicted outcomes would partly reflect differences in AMI risk and partly differences in mortality.

We compute lifetime cumulative outcomes by dynamically simulating AMI events for each gender and ethnic group over the age range from 40 to 85 using the ISN estimates in Table 3 as the data-generating process. Technically, the simulation method is similar to that used internally for computing the likelihood function when we estimate the models. We generate 100,000 individual pseudo-histories for each group.

Table 4 shows the frequency distributions of the lifetime cumulative number of AMI events for the four groups. The majority of people do not experience any events, but there are significant differences across groups. Male Maoris are most likely to ever have an AMI event (38%), followed by female Maoris and male Europeans whose probabilities are similar (30%), while female Europeans are least likely to experience any events (18%). All four distributions are heavily skewed to the right, with a small fraction of people experiencing a large number of events. The skewness implies that the mean number of events are much larger than the medians (which are all 0). Specifically, the average number of AMIs before age 85 is about 0.86 for male Maoris, 0.67 for female Maoris, 0.62 for male Europeans and 0.34 for female Europeans. When looking at the subsets of people with events, female Maoris experience the same number of events as male Maoris (2.3), higher than male Europeans (2.1) and female Europeans (1.9).

TABLE 4

Open in new tab

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Number of AMIs frequency distribution
0	61.83	70.15	70.34	81.64
1	14.94	12.67	11.80	9.18
2	10.33	8.58	7.96	5.16
3	6.36	4.81	4.83	2.42
4	3.47	2.23	2.56	1.02
5+	3.07	1.56	2.51	0.58
Number of AMIs
Mean	0.86	0.62	0.67	0.34
Number of AMIs for those with at least one
Mean	2.25	2.07	2.26	1.85

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Number of AMIs frequency distribution
0	61.83	70.15	70.34	81.64
1	14.94	12.67	11.80	9.18
2	10.33	8.58	7.96	5.16
3	6.36	4.81	4.83	2.42
4	3.47	2.23	2.56	1.02
5+	3.07	1.56	2.51	0.58
Number of AMIs
Mean	0.86	0.62	0.67	0.34
Number of AMIs for those with at least one
Mean	2.25	2.07	2.26	1.85

Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 3 as the data-generating process.

TABLE 4

Open in new tab

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Number of AMIs frequency distribution
0	61.83	70.15	70.34	81.64
1	14.94	12.67	11.80	9.18
2	10.33	8.58	7.96	5.16
3	6.36	4.81	4.83	2.42
4	3.47	2.23	2.56	1.02
5+	3.07	1.56	2.51	0.58
Number of AMIs
Mean	0.86	0.62	0.67	0.34
Number of AMIs for those with at least one
Mean	2.25	2.07	2.26	1.85

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Number of AMIs frequency distribution
0	61.83	70.15	70.34	81.64
1	14.94	12.67	11.80	9.18
2	10.33	8.58	7.96	5.16
3	6.36	4.81	4.83	2.42
4	3.47	2.23	2.56	1.02
5+	3.07	1.56	2.51	0.58
Number of AMIs
Mean	0.86	0.62	0.67	0.34
Number of AMIs for those with at least one
Mean	2.25	2.07	2.26	1.85

Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 3 as the data-generating process.

Since the estimated models fit the data very well, the incidence rates and cumulative hazard rates for the synthetic populations (not shown) are similar to those for the study populations presented in Figure 2. More detailed views of the average outcomes are provided in Figure 5. The left panel follows people as they age and shows the proportion who have ever had an AMI event (i.e. the extensive margin). The proportions are highest for male Maoris, since they have the highest time-specific AMI risk of the first event. Then follow female Maoris and male Europeans whose outcomes are similar, and finally female Europeans are least likely to have had any AMI events. The right panel shows the average number of AMI events people have had when they reach a certain age, for those who have experienced at least one AMI at that age (i.e. the intensive margin). Interestingly, male Europeans and female Maoris do not have similar experiences when conditioning on having had an event. Instead, the outcomes for female Maoris are very similar to those for male Maoris, while the outcomes for male Europeans are similar to those for female Europeans.

FIGURE 5

Synthetic lifetime cumulative outcomes [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

5.3 Non-independent mortality

This section briefly consider estimates from models that allow for non-independent mortality as discussed in Section 4.4. Again, we use HKM estimates as the importance sampling parameter $θ^{*}$ and take R = 1000.

Table 5 shows estimates for the four groups. The probability of death conditional on an event occurring is specified as the logistic function of time. Not surprisingly, the probability of dying of a heart attack is higher for older people, as indicated by the positive estimates of $ζ_{1}$ and $ζ_{2}$ ⁠. As expected, the estimates of parameters related to AMI events are similar to those presented in Table 3. Figure 6 shows the hazard rates corresponding to the estimates in Table 3. The estimated hazard rates are nearly indistinguishable from those in Figure 3, although close inspection reveals that the estimated hazard rates are slightly higher for older people in Figure 6. Since their death rates are higher (see Table 1), it is perhaps not surprising that these differences are relatively higher for Maoris than for Europeans. In any case, the differences are very small and negligible for purposes of this paper.

FIGURE 6

Hazard functions (per 10 years) (models with acute myocardial infarction-related mortality) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

TABLE 5

Open in new tab

Parameter estimates for models with acute myocardial infarction (AMI)-related mortality

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.57	0.68	0.74	1.03	298.85
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.37	−4.58	−6.14	1548.01
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.19	0.56	0.23	0.58	523.97
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.07	1.08	4.51
	(0.06)	(0.02)	(0.06)	(0.02)	[0.21]
$μ_{2}$	−1.03	−2.49	−1.05	−2.58	994.63
	(0.06)	(0.03)	(0.08)	(0.05)	[0.00]
$ζ_{0}$	−1.80	−2.84	−2.43	−2.93	62.42
	(0.11)	(0.08)	(0.20)	(0.15)	[0.00]
$ζ_{1}$	0.11	0.32	0.02	0.14	5.99
	(0.11)	(0.06)	(0.17)	(0.10)	[0.11]
$ζ_{2}$	0.04	0.02	0.08	0.06	4.34
	(0.02)	(0.01)	(0.03)	(0.02)	[0.23]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.57	0.68	0.74	1.03	298.85
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.37	−4.58	−6.14	1548.01
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.19	0.56	0.23	0.58	523.97
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.07	1.08	4.51
	(0.06)	(0.02)	(0.06)	(0.02)	[0.21]
$μ_{2}$	−1.03	−2.49	−1.05	−2.58	994.63
	(0.06)	(0.03)	(0.08)	(0.05)	[0.00]
$ζ_{0}$	−1.80	−2.84	−2.43	−2.93	62.42
	(0.11)	(0.08)	(0.20)	(0.15)	[0.00]
$ζ_{1}$	0.11	0.32	0.02	0.14	5.99
	(0.11)	(0.06)	(0.17)	(0.10)	[0.11]
$ζ_{2}$	0.04	0.02	0.08	0.06	4.34
	(0.02)	(0.01)	(0.03)	(0.02)	[0.23]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

TABLE 5

Open in new tab

Parameter estimates for models with acute myocardial infarction (AMI)-related mortality

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.57	0.68	0.74	1.03	298.85
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.37	−4.58	−6.14	1548.01
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.19	0.56	0.23	0.58	523.97
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.07	1.08	4.51
	(0.06)	(0.02)	(0.06)	(0.02)	[0.21]
$μ_{2}$	−1.03	−2.49	−1.05	−2.58	994.63
	(0.06)	(0.03)	(0.08)	(0.05)	[0.00]
$ζ_{0}$	−1.80	−2.84	−2.43	−2.93	62.42
	(0.11)	(0.08)	(0.20)	(0.15)	[0.00]
$ζ_{1}$	0.11	0.32	0.02	0.14	5.99
	(0.11)	(0.06)	(0.17)	(0.10)	[0.11]
$ζ_{2}$	0.04	0.02	0.08	0.06	4.34
	(0.02)	(0.01)	(0.03)	(0.02)	[0.23]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.57	0.68	0.74	1.03	298.85
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.70	−4.37	−4.58	−6.14	1548.01
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.19	0.56	0.23	0.58	523.97
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
γ	1.02	1.13	1.07	1.08	4.51
	(0.06)	(0.02)	(0.06)	(0.02)	[0.21]
$μ_{2}$	−1.03	−2.49	−1.05	−2.58	994.63
	(0.06)	(0.03)	(0.08)	(0.05)	[0.00]
$ζ_{0}$	−1.80	−2.84	−2.43	−2.93	62.42
	(0.11)	(0.08)	(0.20)	(0.15)	[0.00]
$ζ_{1}$	0.11	0.32	0.02	0.14	5.99
	(0.11)	(0.06)	(0.17)	(0.10)	[0.11]
$ζ_{2}$	0.04	0.02	0.08	0.06	4.34
	(0.02)	(0.01)	(0.03)	(0.02)	[0.23]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

5.4 Unobserved heterogeneity

In this section we discuss extended models that include random effects in an attempt to decompose risk into time-related and dynamic factors on the one hand and unobserved heterogeneity on the other. This analysis is more speculative, because the decomposition is identified through ‘functional form’ assumptions. It is well known that random effect distributions are difficult to estimate, and estimates of $σ_{1}$ and $σ_{2}$ may not be very reliable in general. Nevertheless, the estimates may provide a ballpark measure of the influence of unobserved heterogeneity.

The ISN estimates of models with random effects use the estimates from the corresponding models without random effects as importance sampling parameter $θ^{*}$ ⁠, with the modification that $\ln σ_{1}^{*} = \ln σ_{2}^{*} = 0$ ⁠, and take R = 1000 and Q = 10.

In our case, the likelihood function is clearly maximized on the boundary of the parameter space where $σ_{1} = 0$ for both male and female Europeans. For male and female Maoris, the likelihood function has two local maxima, one with $σ_{1} = 0$ and one with $σ_{1} > 0$ ⁠. The highest likelihood value is associated with the former for male Maoris and with the latter for female Maoris. On the other hand, the model fit is better with $σ_{1} = 0$ for both groups. For that reason, and to maintain comparability, we present the estimates with $σ_{1} = 0$ for all four groups. Estimates for models with $σ_{1} > 0$ are provided in the supplementary materials.

Table 6 reports ISN parameter estimates for the four groups. The estimates of $\ln σ_{2}$ imply $σ_{2} \approx 1.5$ ⁠, which means that the hazard rate at the third quartile of the distribution of random effects is >7 times the hazard rate at the first quartile. This suggests that unobserved heterogeneity is important and some people are much more at risk of having AMI events than others. In other words, the risk of having events is not evenly distributed within the groups.

TABLE 6

Open in new tab

Parameter estimates for models with random effects (⁠ $σ_{1} = 0$ ⁠)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.59	0.70	0.76	1.04	269.71
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.72	−4.40	−4.60	−6.15	1517.16
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.14	0.54	0.20	0.58	300.72
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
γ	0.73	0.82	0.74	0.72	3.32
	(0.06)	(0.02)	(0.06)	(0.03)	[0.34]
$μ_{2}$	−1.44	−2.99	−1.52	−3.13	460.83
	(0.10)	(0.05)	(0.12)	(0.08)	[0.00]
$\ln σ_{2}$	0.37	0.38	0.40	0.44	0.51
	(0.06)	(0.02)	(0.07)	(0.02)	[0.92]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.59	0.70	0.76	1.04	269.71
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.72	−4.40	−4.60	−6.15	1517.16
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.14	0.54	0.20	0.58	300.72
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
γ	0.73	0.82	0.74	0.72	3.32
	(0.06)	(0.02)	(0.06)	(0.03)	[0.34]
$μ_{2}$	−1.44	−2.99	−1.52	−3.13	460.83
	(0.10)	(0.05)	(0.12)	(0.08)	[0.00]
$\ln σ_{2}$	0.37	0.38	0.40	0.44	0.51
	(0.06)	(0.02)	(0.07)	(0.02)	[0.92]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

TABLE 6

Open in new tab

Parameter estimates for models with random effects (⁠ $σ_{1} = 0$ ⁠)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.59	0.70	0.76	1.04	269.71
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.72	−4.40	−4.60	−6.15	1517.16
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.14	0.54	0.20	0.58	300.72
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
γ	0.73	0.82	0.74	0.72	3.32
	(0.06)	(0.02)	(0.06)	(0.03)	[0.34]
$μ_{2}$	−1.44	−2.99	−1.52	−3.13	460.83
	(0.10)	(0.05)	(0.12)	(0.08)	[0.00]
$\ln σ_{2}$	0.37	0.38	0.40	0.44	0.51
	(0.06)	(0.02)	(0.07)	(0.02)	[0.92]

	Male Maoris	Male Europeans	Female Maoris	Female Europeans	Wald
$α_{1}$	0.59	0.70	0.76	1.04	269.71
	(0.02)	(0.01)	(0.02)	(0.01)	[0.00]
$μ_{1}$	−3.72	−4.40	−4.60	−6.15	1517.16
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
$α_{2}$	0.14	0.54	0.20	0.58	300.72
	(0.03)	(0.01)	(0.04)	(0.02)	[0.00]
γ	0.73	0.82	0.74	0.72	3.32
	(0.06)	(0.02)	(0.06)	(0.03)	[0.34]
$μ_{2}$	−1.44	−2.99	−1.52	−3.13	460.83
	(0.10)	(0.05)	(0.12)	(0.08)	[0.00]
$\ln σ_{2}$	0.37	0.38	0.40	0.44	0.51
	(0.06)	(0.02)	(0.07)	(0.02)	[0.92]

Standard errors in parentheses and p-values in brackets. Wald:

χ^{2}

test statistic for the null hypothesis that the parameter is the same across groups.

Figure 7 shows the corresponding hazard functions evaluated at the median of the random effects. While the risk patterns and rankings are similar, the median hazard rates in Figure 7 are much lower than the overall rates in Figure 3. This reflects the non-linear properties of the hazard models Equations (13) and (14), where the symmetrically distributed random effects v are transformed through the exponential function, leading to a highly skewed distribution of hazard rates. Specifically, the risk for the median person is much lower than the mean risk.

FIGURE 7

Median hazard functions (per 10 years) (models with random effects) [Colour figure can be viewed at https://dbpia.nl.go.kr/]

Open in new tab Download slide

Figures showing hazard rates at the first and third quartile of the distribution of the random effects are provided in the supplementary materials. Since the estimates of $\ln σ^{2}$ are similar for all four groups, the ranking of risk across groups is similar at the 75th percentile of the distribution of unobserved heterogeneity. The main impression from these graphs is that there are enormous differences between high- and low-risk people. Presumably it would be possible to explain or reduce this variation if data were available on individual biology, life style choices, etc.

Another way of summarizing the inequality in risk within groups is to simulate cumulative outcomes at different points in the distribution of random effects. The first panel in Table 7 shows the proportion of (synthetic) people who have ever had an AMI event between ages 40 and 85 (i.e. the extensive margin). Since $σ_{1} = 0$ ⁠, this proportion is the same throughout the random effects distribution. The estimates are similar to those in Table 4. The second panel shows the average total number of AMI events for those (synthetic) people who have at least one AMI in their lifetime (i.e. the intensive margin). Since $σ_{2} > 0$ for all groups, the numbers are different at the three quartiles of the random effects distribution. At the lower quartile, people total about 1.2 events. At the upper quartile, the numbers range from 2.3 to 2.9 across the four groups. Comparison with Table 4 shows that allowing for random effects reduces the estimated differences in the intensive margin at the 25th and 50th percentiles, while the differences are a bit larger between the four groups at the 75th percentile of the distribution of random effects.

TABLE 7

Open in new tab

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs) (models with random effects)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Proportion of people with at least one AMI (%)
v ∼ N(0,1)	41.4	32.1	32.2	19.3
Average number of AMIs for people with at least one AMI
v = -0.674	1.24	1.20	1.23	1.15
v = 0	1.64	1.54	1.65	1.44
v = 0.674	2.81	2.54	2.88	2.31

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Proportion of people with at least one AMI (%)
v ∼ N(0,1)	41.4	32.1	32.2	19.3
Average number of AMIs for people with at least one AMI
v = -0.674	1.24	1.20	1.23	1.15
v = 0	1.64	1.54	1.65	1.44
v = 0.674	2.81	2.54	2.88	2.31

Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 6 as the data-generating process.

TABLE 7

Open in new tab

Summary statistics for synthetic lifetime cumulative acute myocardial infarctions (AMIs) (models with random effects)

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Proportion of people with at least one AMI (%)
v ∼ N(0,1)	41.4	32.1	32.2	19.3
Average number of AMIs for people with at least one AMI
v = -0.674	1.24	1.20	1.23	1.15
v = 0	1.64	1.54	1.65	1.44
v = 0.674	2.81	2.54	2.88	2.31

	Male Maoris	Male Europeans	Female Maoris	Female Europeans
Proportion of people with at least one AMI (%)
v ∼ N(0,1)	41.4	32.1	32.2	19.3
Average number of AMIs for people with at least one AMI
v = -0.674	1.24	1.20	1.23	1.15
v = 0	1.64	1.54	1.65	1.44
v = 0.674	2.81	2.54	2.88	2.31

Based on simulating outcomes over age range 40–85 for 100,000 people using the estimates in Table 6 as the data-generating process.

5.5 Goodness of fit and robustness

Simulating histories from the estimated models in Table 3 can also be used as an informal check of their within-sample fit. For this, we draw five random histories for each of the individuals in the estimation samples, taking their date of birth and their observation period as given (including the date of death if they die before 30 June 2012). We then compute and compare summary statistics for the simulated samples and the estimation samples.

The first panel in Table 8 shows the actual and the simulated distributions of AMI events people experience during their observation period. The fit is very good, although all the estimated models tend to slightly underpredict the proportion of people with a single event and slightly overpredict the proportion with two or more events.

TABLE 8

Open in new tab

Within-sample goodness-of-fit statistics

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	Data	Model	Data	Model	Data	Model	Data	Model
Number of AMIs frequency distribution
0	94.44	94.55	94.42	94.54	96.46	96.54	97.05	97.10
1	4.67	3.98	4.69	4.11	2.90	2.41	2.46	2.13
2	0.67	1.13	0.68	1.01	0.46	0.77	0.37	0.57
3+	0.21	0.34	0.22	0.34	0.19	0.29	0.13	0.20
Incidence rates
Age 40–44	0.03	0.03	0.01	0.02	0.01	0.01	0.00	0.00
Age 45–49	0.05	0.05	0.02	0.02	0.02	0.02	0.01	0.01
Age 50–54	0.07	0.07	0.04	0.03	0.04	0.04	0.01	0.01
Age 55–59	0.10	0.10	0.05	0.05	0.06	0.06	0.02	0.02
Age 60–64	0.14	0.14	0.08	0.08	0.10	0.09	0.03	0.03
Age 65–69	0.19	0.19	0.11	0.11	0.14	0.13	0.05	0.05
Age 70–74	0.22	0.27	0.16	0.18	0.19	0.20	0.09	0.09
Age 75–79	0.30	0.35	0.25	0.28	0.23	0.31	0.16	0.17
Age 80+	0.37	0.48	0.39	0.45	0.34	0.45	0.27	0.31
Cumulative hazard rates
Age 50	0.04	0.04	0.02	0.02	0.02	0.02	0.00	0.00
Age 55	0.07	0.08	0.04	0.04	0.03	0.04	0.01	0.01
Age 60	0.12	0.13	0.06	0.06	0.07	0.07	0.02	0.02
Age 65	0.20	0.20	0.10	0.10	0.11	0.11	0.03	0.03
Age 70	0.29	0.30	0.16	0.16	0.18	0.18	0.05	0.06
Age 75	0.40	0.43	0.24	0.25	0.28	0.29	0.10	0.10
Age 80	0.55	0.61	0.37	0.39	0.40	0.44	0.18	0.19
Age 85	0.75	0.86	0.57	0.62	0.57	0.67	0.31	0.34

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	Data	Model	Data	Model	Data	Model	Data	Model
Number of AMIs frequency distribution
0	94.44	94.55	94.42	94.54	96.46	96.54	97.05	97.10
1	4.67	3.98	4.69	4.11	2.90	2.41	2.46	2.13
2	0.67	1.13	0.68	1.01	0.46	0.77	0.37	0.57
3+	0.21	0.34	0.22	0.34	0.19	0.29	0.13	0.20
Incidence rates
Age 40–44	0.03	0.03	0.01	0.02	0.01	0.01	0.00	0.00
Age 45–49	0.05	0.05	0.02	0.02	0.02	0.02	0.01	0.01
Age 50–54	0.07	0.07	0.04	0.03	0.04	0.04	0.01	0.01
Age 55–59	0.10	0.10	0.05	0.05	0.06	0.06	0.02	0.02
Age 60–64	0.14	0.14	0.08	0.08	0.10	0.09	0.03	0.03
Age 65–69	0.19	0.19	0.11	0.11	0.14	0.13	0.05	0.05
Age 70–74	0.22	0.27	0.16	0.18	0.19	0.20	0.09	0.09
Age 75–79	0.30	0.35	0.25	0.28	0.23	0.31	0.16	0.17
Age 80+	0.37	0.48	0.39	0.45	0.34	0.45	0.27	0.31
Cumulative hazard rates
Age 50	0.04	0.04	0.02	0.02	0.02	0.02	0.00	0.00
Age 55	0.07	0.08	0.04	0.04	0.03	0.04	0.01	0.01
Age 60	0.12	0.13	0.06	0.06	0.07	0.07	0.02	0.02
Age 65	0.20	0.20	0.10	0.10	0.11	0.11	0.03	0.03
Age 70	0.29	0.30	0.16	0.16	0.18	0.18	0.05	0.06
Age 75	0.40	0.43	0.24	0.25	0.28	0.29	0.10	0.10
Age 80	0.55	0.61	0.37	0.39	0.40	0.44	0.18	0.19
Age 85	0.75	0.86	0.57	0.62	0.57	0.67	0.31	0.34

Data: Computed directly from the estimation sample. Model: Computed from simulating outcomes using the estimates in Table 3 as the data-generating process and taking the individual observation periods in the estimation sample as given.

TABLE 8

Open in new tab

Within-sample goodness-of-fit statistics

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	Data	Model	Data	Model	Data	Model	Data	Model
Number of AMIs frequency distribution
0	94.44	94.55	94.42	94.54	96.46	96.54	97.05	97.10
1	4.67	3.98	4.69	4.11	2.90	2.41	2.46	2.13
2	0.67	1.13	0.68	1.01	0.46	0.77	0.37	0.57
3+	0.21	0.34	0.22	0.34	0.19	0.29	0.13	0.20
Incidence rates
Age 40–44	0.03	0.03	0.01	0.02	0.01	0.01	0.00	0.00
Age 45–49	0.05	0.05	0.02	0.02	0.02	0.02	0.01	0.01
Age 50–54	0.07	0.07	0.04	0.03	0.04	0.04	0.01	0.01
Age 55–59	0.10	0.10	0.05	0.05	0.06	0.06	0.02	0.02
Age 60–64	0.14	0.14	0.08	0.08	0.10	0.09	0.03	0.03
Age 65–69	0.19	0.19	0.11	0.11	0.14	0.13	0.05	0.05
Age 70–74	0.22	0.27	0.16	0.18	0.19	0.20	0.09	0.09
Age 75–79	0.30	0.35	0.25	0.28	0.23	0.31	0.16	0.17
Age 80+	0.37	0.48	0.39	0.45	0.34	0.45	0.27	0.31
Cumulative hazard rates
Age 50	0.04	0.04	0.02	0.02	0.02	0.02	0.00	0.00
Age 55	0.07	0.08	0.04	0.04	0.03	0.04	0.01	0.01
Age 60	0.12	0.13	0.06	0.06	0.07	0.07	0.02	0.02
Age 65	0.20	0.20	0.10	0.10	0.11	0.11	0.03	0.03
Age 70	0.29	0.30	0.16	0.16	0.18	0.18	0.05	0.06
Age 75	0.40	0.43	0.24	0.25	0.28	0.29	0.10	0.10
Age 80	0.55	0.61	0.37	0.39	0.40	0.44	0.18	0.19
Age 85	0.75	0.86	0.57	0.62	0.57	0.67	0.31	0.34

	Male Maoris		Male Europeans		Female Maoris		Female Europeans
	Data	Model	Data	Model	Data	Model	Data	Model
Number of AMIs frequency distribution
0	94.44	94.55	94.42	94.54	96.46	96.54	97.05	97.10
1	4.67	3.98	4.69	4.11	2.90	2.41	2.46	2.13
2	0.67	1.13	0.68	1.01	0.46	0.77	0.37	0.57
3+	0.21	0.34	0.22	0.34	0.19	0.29	0.13	0.20
Incidence rates
Age 40–44	0.03	0.03	0.01	0.02	0.01	0.01	0.00	0.00
Age 45–49	0.05	0.05	0.02	0.02	0.02	0.02	0.01	0.01
Age 50–54	0.07	0.07	0.04	0.03	0.04	0.04	0.01	0.01
Age 55–59	0.10	0.10	0.05	0.05	0.06	0.06	0.02	0.02
Age 60–64	0.14	0.14	0.08	0.08	0.10	0.09	0.03	0.03
Age 65–69	0.19	0.19	0.11	0.11	0.14	0.13	0.05	0.05
Age 70–74	0.22	0.27	0.16	0.18	0.19	0.20	0.09	0.09
Age 75–79	0.30	0.35	0.25	0.28	0.23	0.31	0.16	0.17
Age 80+	0.37	0.48	0.39	0.45	0.34	0.45	0.27	0.31
Cumulative hazard rates
Age 50	0.04	0.04	0.02	0.02	0.02	0.02	0.00	0.00
Age 55	0.07	0.08	0.04	0.04	0.03	0.04	0.01	0.01
Age 60	0.12	0.13	0.06	0.06	0.07	0.07	0.02	0.02
Age 65	0.20	0.20	0.10	0.10	0.11	0.11	0.03	0.03
Age 70	0.29	0.30	0.16	0.16	0.18	0.18	0.05	0.06
Age 75	0.40	0.43	0.24	0.25	0.28	0.29	0.10	0.10
Age 80	0.55	0.61	0.37	0.39	0.40	0.44	0.18	0.19
Age 85	0.75	0.86	0.57	0.62	0.57	0.67	0.31	0.34

Data: Computed directly from the estimation sample. Model: Computed from simulating outcomes using the estimates in Table 3 as the data-generating process and taking the individual observation periods in the estimation sample as given.

The second and third panels in Table 8 show the incidence rates and cumulative hazard rates for the estimation sample and for simulated data. (The supplementary materials provide graphical comparisons of these incidence rates and cumulative hazard rates.) Again, the fit is remarkably good, although the models slightly overpredict the number of events for older people. The proportions of older people in the estimation samples are small, especially for Maoris, so a slightly higher estimation uncertainty and a slightly poorer fit for them is expected.

Naturally, allowing for AMI-related mortality (see Table 5) reduces the predicted number of people with many events. However, we show in the supplementary materials that the effect is small. In terms of the age-specific unconditional probability of dying of AMI and the age-specific conditional probability of dying given that an AMI event occurs, these models fit the data very well.

The supplementary materials also include a discussion of goodness of fit for the estimated models with random effects in Table 6. Despite their much higher likelihood values and the statistical significance of the estimate of $\ln σ_{2}$ ⁠, the models with random effects do not fit the data as well as the models without random effects. In particular, while they get the proportion of people with no events about right, they overestimate risk and overpredict the total number of events in the population to a far greater extent.

The ISN estimates rely on the choice of $θ^{*}$ ⁠, R, and Q. The supplementary materials present estimation results for using first-stage importance sampling estimates as importance sampling parameters $θ^{*}$ in a second stage or taking R = 3000 or taking Q = 12 yields similar estimates. We also consider lowering AFAR to 35 years. None of these changes have any practical influence on results presented here.

As mentioned, the 1 year cut-off for Recent is arbitrary but in accordance with the literature. In the supplementary materials, we show that lowering the cut-off to 6 months (τ = 0.05) leads to an increase in the estimates of γ and $μ_{2}$ ⁠, whereas raising the cut-off to 2 years (τ = 0.2) leads to a decrease. This pattern is consistent with a world where the risk of further events increases sharply immediately after an event and gradually tapers off thereafter. However, the practical implications and the overall fit of the models are similar for the three different value of τ.

The time needed to compute the MSL estimate varies greatly depending on the sample sizes, the choice of $θ^{*}$ ⁠, R and Q, the starting values for the optimization algorithm, and the computing resources available. As an indication, the estimates in Table 3 take 1–7 min each to compute while those in Table 6 take 70–266 min on a 2016 workstation with 8 dual core CPUs and 64 GB of RAM. Some of the estimates shown in the supplementary materials take 6–7 h.

6 CONCLUSION

In this study, we investigate gender and ethnic disparities in several aspects of AMI risk in New Zealand. Our study complements the literature on gender and ethnic disparities in other areas of public health and health economics. AMIs, commonly known as heart attacks, are one of the leading causes of disability and mortality and therefore an important issue in public health.

We estimate hazard models of AMI risk separately for male and female people of Maori and European descent using high-quality administrative data on hospital admissions and death registrations combined with census data. The main challenge with analysing these data is the high degree of left censoring. For left-censored observations, the early part of the event histories is missing. Left censoring therefore poses a problem for studying history dependence and dynamic effects. Ad hoc approaches to handling left censoring do not yield reliable estimates. Recent methodological advances allow us to overcome the left-censoring problem effectively, by using MSL estimation and importance sampling methods.

Using these hazard models, we examine how AMI risk is distributed within each subpopulation and we compare patterns across subpopulations. We find, as expected, that older people have much higher risk than younger people. This partly reflects biological effects and partly time effects as different cohorts have been exposed to different environments, made different life style choices, and had access to different medical technologies. We find that there are important dynamic effects in that the risk of experiencing the first AMI event is much lower than the risk of having subsequent events, and the risk is particularly high in the first year following an event. The risks of the first and subsequent events are not proportional across time. In addition, we find that there is considerable within-group variation in risk, as measured by the influence of random effects. These two aspects, history dependence and unobserved heterogeneity, pull in the same direction of concentrating risk among relatively fewer people within the subpopulations.

Comparing the four gender and ethnic groups, we discover large disparities in AMI time-specific risk between male and female people of Maori and European descent. For the first event, the ranking is that male Maoris tend to have the highest risk, followed by female Maoris and male Europeans whose risk levels are similar, and finally female Europeans have the lowest risk levels. The pattern is somewhat different for subsequent events, where male and female Maoris have similar and high time-specific risk levels on the one hand while male and female Europeans have similar and comparatively lower risk levels on the other hand.

Using the estimated hazard models, we compute lifetime cumulative outcomes for the synthetic cohorts. Here, the gender and ethnic disparities are also clear, and the ranking between groups reflects the ranking of time-specific risk. The proportion of people who experience any event (the extensive margin) is highest for male Maoris, followed by female Maoris and male Europeans who have similar proportions, while the proportion of female Europeans who experience at least one event is the smallest. The average number of events for those who experience at least one (the intensive margin) is highest for male and female Maoris and lowest for male and female Europeans.

Our findings motivate further research on gender and ethnic disparities. For example, if the data are found or become available, it would be interesting and useful to investigate potential biological, socioeconomic or environmental factors which may explain some of the heterogeneity in risk.

Supporting Information

Additional supporting information may be found in the online version of the article at the publisher’s website.

ACKNOWLEDGEMENTS

This research was supported in part by Australian Research Council Grant DP1096862 and BK21 FOUR (Fostering Outstanding Universities for Research) funded by the Ministry of Education (MOE) of Korea and National Research Foundation (NRF) of Korea. We thank joint editor James Carpenter and three anonymous referees for comments and suggestions that greatly improved the paper.

REFERENCES

Alderman

,

M.H.

,

Cohen

,

H.W.

&

Madhavan

,

S.

(

2000

)

Myocardial infarction in treated hypertensive patients: the paradox of lower incidence but higher mortality in young blacks compared with whites

.

Circulation

,

101

(

10

),

1109

–

1114

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Anand

,

S.S.

,

Yusuf

,

S.

,

Vuksan

,

V.

,

Devanesen

,

S.

,

Teo

,

K.K.

,

Montague

,

P.A.

et al. (

2000

)

Differences in risk factors, atherosclerosis, and cardiovascular disease between ethnic groups in Canada: the study of health assessment and risk in ethnic groups (SHARE)

.

Lancet

,

356

(

9226

),

279

–

284

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Avendano

,

M.

&

Soerjomataram

,

I.

(

2008

)

Monitoring trends in acute coronary syndromes: can we use hospital admission registries

?

Heart

,

94

(

12

),

1524

–

1525

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Bhuller

,

M.

,

Brinch

,

C.N.

&

Königs

,

S.

(

2017

)

Time aggregation and state dependence in welfare receipt

.

Economic Journal

,

127

(

604

),

1833

–

1873

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Chan

,

W.C.

,

Wright

,

C.

,

Riddell

,

T.

,

Wells

,

S.

,

Kerr

,

A.J.

,

Gala

,

G.

et al. (

2008

)

Ethnic and socioeconomic disparities in the prevalence of cardiovascular disease in New Zealand

.

New Zealand Medical Journal

,

121

(

1285

),

11

–

20

.

Google Scholar

OpenURL Placeholder Text

WorldCat

Chan

,

W.C.

,

Wright

,

C.

,

Tobias

,

M.

,

Mann

,

S.

&

Jackson

,

R.

(

2008

)

Explaining trends in coronary heart disease hospitalisations in New Zealand: admissions and incidence can trend in opposite directions

.

Heart

,

94

(

12

),

1589

–

1593

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Chang

,

W.C.

,

Kaul

,

P.

,

Fu

,

Y.

,

Westerhout

,

C.M.

,

Granger

,

C.B.

,

Mahaffey

,

K.W.

et al. (

2006

)

Forecasting mortality: dynamic assessment of risk in ST-segment elevation acute myocardial infarction

.

European Heart Journal

,

27

(

4

),

419

–

426

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Cockx

,

B.

&

Picchio

,

M.

(

2013

)

Scarring effects of remaining unemployed for long-term unemployed school-leavers

.

Journal of the Royal Statistical Society Series A

,

176

(

4

),

951

–

980

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Doiron

,

D.

&

Gørgens

,

T.

(

2008

)

State dependence in youth labor market experiences, and the evaluation of policy interventions

.

Journal of Econometrics

,

145

,

81

–

97

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Gottlieb

,

S.

,

Harpaz

,

D.

,

Shotan

,

A.

,

Boyko

,

V.

,

Leor

,

J.

,

Cohen

,

M.

et al. (

2000

)

Sex differences in management and outcome after acute myocardial infarction in the 1990s: a prospective observational community-based study

.

Circulation

,

102

(

20

),

2484

–

2490

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Hajivassiliou

,

V.A.

(

2000

) Some practical issues in maximum simulated likelihood. In:

Mariano

,

R.

,

Weeks

,

M.

&

Schuermann

,

T.

(Eds),

Simulation-based inference in econometrics: methods and applications

.

Cambridge

:

Cambridge University Press

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Heckman

,

J.J.

(

1981

) The incidental parameter problem and the problem of initial conditions in estimating a discrete time–discrete data stochastic process. In:

Manski

,

C.F.

&

McFadden

,

D.

(Eds),

Structural analysis of discrete data with econometric applications

,

Cambridge, Massachusetts

:

MIT Press

, pp.

179

–

195

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Herman

,

B.

,

Greiser

,

E.

&

Pohiabeln

,

H.

(

1997

)

A sex difference in short-term survival after initial acute myocardial infarction: the MONICA-Bremen acute myocardial infarction register, 1985–1990

.

European Heart Journal

,

18

(

6

),

963

–

970

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Hesterberg

,

T.

(

1995

)

Weighted average importance sampling and defensive mixture distributions

.

Technometrics

,

37

(

2

),

185

–

194

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Hougaard

,

P.

(

1986

)

Survival models for heterogeneous populations derived from stable distributions

.

Biometrika

,

73

(

2

),

387

–

396

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Kauf

,

T.L.

,

Velazquez

,

E.J.

,

Crosslin

,

D.R.

,

Weaver

,

W.D.

,

Diaz

,

R.

,

Granger

,

C.B.

et al. (

2006

)

The cost of acute myocardial infarction in the new millennium: evidence from a multinational registry

.

American Heart Journal

,

151

(

1

),

206

–

212

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Klein

,

J.P.

,

Moeschberger

,

M.

,

Li

,

Y.H.

,

Wang

,

S.T.

&

Flournoy

,

N.

(

1992

) Estimating random effects in the framingham heart study. In:

Survival Analysis: State of the Art

,

Berlin

:

Springer

, pp.

99

–

120

.

Kristensen

,

D.

&

Salanié

,

B.

(

2017

)

Higher-order properties of approximate estimators

.

Journal of Econometrics

,

198

(

2

),

189

–

208

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Kytö

,

V.

,

Sipilä

,

J.

&

Rautava

,

P.

(

2015

)

Gender and in-hospital mortality of ST-segment elevation myocardial infarction (from a multihospital nationwide registry study of 31,689 patients)

.

American Journal of Cardiology

,

115

(

3

),

303

–

306

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Lee

,

S.

&

Gørgens

,

T.

(

2021

)

Estimation of dynamic models of recurrent events with censored data

.

Econometrics Journal

,

24

(

2

),

199

–

224

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

MacIntyre

,

K.

,

Stewart

,

S.

,

Capewell

,

S.

(

2001

)

Gender and survival: a populationbased study of 20,1114 men and women following a first acute myocardial infarction

.

Journal of the American College of Cardiology

,

38

(

3

),

729

–

735

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Madrigano

,

J.

,

Mittleman

,

M.A.

,

Baccarelli

,

A.

,

Goldberg

,

R.

,

Melly

,

S.

,

von Klot

,

S.

et al. (

2013

)

Temperature, myocardial infarction, and mortality: effect modification by individual and area-level characteristics

.

Epidemiology

,

24

(

3

),

439

–

446

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Mak

,

K.H.

,

Chia

,

K.S.

,

Kark

,

J.D.

,

Chua

,

T.

,

Tan

,

C.

,

Foong

,

B.H.

et al. (

2003

)

Ethnic differences in acute myocardial infarction in Singapore

.

European Heart Journal

,

24

(

2

),

151

–

160

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Manhapra

,

A.

,

Canto

,

J.G.

,

Vaccarino

,

V.

,

Parsons

,

L.

,

Kiefe

,

C.I.

,

Barron

,

H.V.

et al. (

2004

)

Relation of age and race with hospital death after acute myocardial infarction

.

American Heart Journal

,

148

(

1

),

92

–

98

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Ministry of Health

. (

2010

)

Health expenditure trends in New Zealand 1997–2007

.

Wellington

.

Morrow

,

D.A.

(

2017

)

Myocardial infarction: a companion to braunwald’s heart disease E-book

.

Amsterdam

:

Elsevier Health Sciences

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Naghavi

,

M.

,

Wang

,

H.

,

Lozano

,

R.

,

Davis

,

A.

,

Liang

,

X.

,

Zhou

,

M.

et al. (

2015

)

Global, regional, and national age-sex specific allcause and cause-specific mortality for 240 causes of death, 1990–2013: A systematic analysis for the Global Burden of Disease Study 2013

.

Lancet

385

(

9963

),

117

–

171

.

Google Scholar

OpenURL Placeholder Text

WorldCat

National Health Board Business Unit

. (

2011

)

National minimum dataset (hospital events) data dictionary

.

Wellington

:

Ministry of Health

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

National Health Committee

. (

2013

)

Strategic overview: cardiovascular disease in New Zealand (working draft)

.

Nguyen

,

H.L.

,

Ha

,

D.A.

,

Phan

,

D.T.

,

Nguyen

,

Q.N.

,

Nguyen

,

V.L.

,

Nguyen

,

N.H.

et al. (

2014

)

Sex differences in clinical characteristics, hospital management practices, and in-hospital outcomes in patients hospitalized in a Vietnamese hospital with a first acute myocardial infarction. e95631

.

PLOS ONE

9

(

4

),

Google Scholar

OpenURL Placeholder Text

WorldCat

Pokorney

,

S.D.

,

Rodriguez

,

J.F.

,

Ortiz

,

J.T.

,

Lee

,

D.C.

,

Bonow

,

R.O.

&

Wu

,

E.

(

2012

)

Infarct healing is a dynamic process following acute myocardial infarction

.

Journal of Cardiovascular Magnetic Resonance

,

14

(

1

),

62

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Smolina

,

K.

,

Wright

,

F.L.

,

Rayner

,

M.

&

Goldacre

,

M.J.

(

2012

)

Incidence and 30-day case fatality for acute myocardial infarction in England in 2010: National-linked database study

.

European Journal of Public Health

,

22

(

6

),

848

–

853

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Vaccarino

,

V.

,

Rathore

,

S.S.

,

Wenger

,

N.K.

,

Frederick

,

P.D.

,

Abramson

,

J.L.

,

Barron

,

H.V.

et al. (

2005

)

Sex and racial differences in the management of acute myocardial infarction, 1994 through 2002

.

New England Journal of Medicine

,

353

(

7

),

671

–

682

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Vlaar

,

P.J.

,

Svilaas

,

T.

,

der Horst

,

I.C.C.

,

Diercks

,

G.F.

,

Fokkema

,

M.L.

,

de Smet

,

B.J.

et al. (

2008

)

Cardiac death and reinfarction after 1 year in the Thrombus Aspiration during Percutaneous coronary intervention in Acute myocardial infarction study (TAPAS): a 1-year follow-up study

.

Lancet

,

371

(

9628

),

1915

–

1920

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Wang

,

O.J.

,

Wang

,

Y.

,

Chen

,

J.

&

Krumholz

,

H.M.

(

2012

)

Recent trends in hospitalization for acute myocardial infarction

.

American Journal of Cardiology

,

109

(

11

),

1589

–

1593

.

Crossref

Google Scholar

OpenURL Placeholder Text

WorldCat

Wienke

,

A.

(

2010

)

Frailty models in survival analysis

.

Boca Raton

:

Chapman and Hall/CRC

.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Download all slides

Month:	Total Views:
March 2023	8
April 2023	19
May 2023	8
June 2023	2
July 2023	13
August 2023	5
September 2023	11
October 2023	8
November 2023	5
December 2023	8
January 2024	18
February 2024	13
March 2024	8
April 2024	12
May 2024	4
June 2024	5
July 2024	20
August 2024	13
September 2024	6
October 2024	9
November 2024	3
December 2024	9
January 2025	4
February 2025	10
March 2025	20
April 2025	9
May 2025	3

Article Contents

Using Maximum Simulated Likelihood Methods to Overcome Left Censoring: Dynamic Event History Models of Heart Attack Risk in New Zealand

Abstract

1 INTRODUCTION

2 RELATED LITERATURE

3 DATA

4 MODEL AND ESTIMATION

4.1 Model specification and complete-data likelihood function

4.2 MSL estimation

4.3 MSL standard errors

4.4 Extension: Non-independent mortality

4.5 Extension: Unobserved heterogeneity

4.6 Alternative estimators

5 RESULTS

5.1 Time-specific risk

5.2 Lifetime cumulative risk

5.3 Non-independent mortality

5.4 Unobserved heterogeneity

5.5 Goodness of fit and robustness

6 CONCLUSION

Supporting Information

ACKNOWLEDGEMENTS

REFERENCES

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Article Contents

Using Maximum Simulated Likelihood Methods to Overcome Left Censoring: Dynamic Event History Models of Heart Attack Risk in New Zealand Free

Abstract

1 INTRODUCTION

2 RELATED LITERATURE

3 DATA

4 MODEL AND ESTIMATION

4.1 Model specification and complete-data likelihood function

4.2 MSL estimation

4.3 MSL standard errors

4.4 Extension: Non-independent mortality

4.5 Extension: Unobserved heterogeneity

4.6 Alternative estimators

5 RESULTS

5.1 Time-specific risk

5.2 Lifetime cumulative risk

5.3 Non-independent mortality

5.4 Unobserved heterogeneity

5.5 Goodness of fit and robustness

6 CONCLUSION

Supporting Information

ACKNOWLEDGEMENTS

REFERENCES

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only

Using Maximum Simulated Likelihood Methods to Overcome Left Censoring: Dynamic Event History Models of Heart Attack Risk in New Zealand