Non-parametric inference about mean functionals of non-ignorable non-response data without identifying the joint distribution


Table 1.

Coverage probability of the 95% confidence interval for the estimators REP-DB and mnarIPW in case (I) under $n = 500$ and $n = 1{,}000$

n	Methods	LL	NL	LN	NN
500	REP-DB	0.939	0.938	0.945	0.943
	mnarIPW	0.930	0.635	0.928	0.491
1,000	REP-DB	0.947	0.939	0.948	0.948
	mnarIPW	0.943	0.381	0.951	0.177


For case (II), we generate data according to Model 1 in Example 1. As in case (I), we consider two sample sizes, $n = 500$ and $n = 1{,}000$. We calculate the bias (Bias), Monte Carlo standard deviation (SD), and 95% coverage probabilities (CPs) based on 1,000 replications in each setting. For comparison, we also apply the estimator mnarIPW with a correct propensity score model to estimate μ. As mentioned earlier, the estimator mnarIPW is obtained by solving estimating equations, and its performance may depend on the initial values used in the optimization, especially in this case, where the full data distribution is not identified. We consider two settings of initial values for the optimization parameters: the true values and random values drawn from $\mathrm{Uniform}(0, 1)$. The results are summarized in Table 2.

Table 2.

Comparisons in case (II) between the estimators REP-DB and mnarIPW under $n = 500$ and $n = 1{,}000$

	REP-DB			mnarIPW-true			mnarIPW-uniform
n	Bias	SD	CP	Bias	SD	CP	Bias	SD	CP
500	0.004	0.031	0.905	$-0.003$	0.050	0.923	$-0.113$	0.206	0.709
1,000	0.001	0.023	0.932	$-0.004$	0.035	0.933	$-0.134$	0.216	0.667


We observe from Table 2 that the proposed estimator REP-DB has negligible bias, small standard deviation, and satisfactory CP even under sample size $n = 500$. As the sample size increases to $n = 1{,}000$, the 95% CP is close to the nominal level. The estimator mnarIPW performs comparably to REP-DB only when the initial values for the optimization parameters are set to the true values. If instead the initial values are randomly drawn from $\mathrm{Uniform}(0, 1)$, mnarIPW has non-negligible bias, large standard deviation, and low CP, and the situation becomes worse as the sample size increases. We also calculate the estimator mnarIPW when the initial values are drawn from other distributions, e.g., the standard normal distribution; the performance is even worse, and we do not report those results here. The simulations in this case demonstrate the superiority of the proposed estimator over existing estimators that require identifiability of the full data distribution.
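The summary measures reported in Table 2 (Bias, SD, and CP) can be computed from a set of replications roughly as follows. This is a minimal sketch rather than the authors' R implementation: the function name `summarize_replications`, the toy sample-mean estimator, and the Wald-type interval with critical value 1.96 are assumptions made purely for illustration.

```python
import math
import random
import statistics


def summarize_replications(estimates, std_errors, truth, z=1.96):
    """Return (bias, Monte Carlo SD, coverage) over simulation replications."""
    bias = statistics.fmean(estimates) - truth
    sd = statistics.stdev(estimates)
    # A replication "covers" when the Wald interval est +/- z * se contains the truth.
    covered = [abs(est - truth) <= z * se for est, se in zip(estimates, std_errors)]
    return bias, sd, sum(covered) / len(covered)


# Toy setup (hypothetical): the estimator is a plain sample mean of N(mu, 1)
# draws, standing in for REP-DB or mnarIPW in a real study.
rng = random.Random(2023)
mu, n, n_rep = 1.0, 500, 1000
estimates, std_errors = [], []
for _ in range(n_rep):
    sample = [rng.gauss(mu, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    estimates.append(mean)
    std_errors.append(math.sqrt(var / n))

bias, sd, cp = summarize_replications(estimates, std_errors, truth=mu)
print(f"Bias={bias:.3f}  SD={sd:.3f}  CP={cp:.3f}")
```

In a real comparison, `estimates` and `std_errors` would hold the point estimates and estimated standard errors returned by each method in each replication.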

4.2 Empirical example

We apply the proposed approach to the China Family Panel Studies, which was previously analysed in Miao et al. (2015). The data set includes 3,126 households in China. The outcome Y is the log of current home price (in $10^{4}$ RMB yuan), and it has missing values due to non-response of house owners and non-availability from the real estate market. The missingness process of home price is likely not at random, because subjects having expensive houses may be less likely to disclose their home prices. The missing data rate of current home price is $21.8\%$. The completely observed covariates X include five continuous variables (travel time to the nearest business center, house building area, family size, house story height, and log of family income) and three discrete variables (province, urban (1 for urban household, 0 for rural), and refurbish status). The shadow variable Z is chosen as the construction price of a house, which is also completely observed. The construction price is related to the current price of a house, and the shadow variable assumption that the non-response is independent of the construction price conditional on the current price and fully observed covariates is reasonable, as the construction price can be viewed as an error-prone proxy for the current home value.

For the representer assumption, we have discussed in Section 2 that it can be justified in principle without extra model assumptions on the missing data distribution, as the assumption involves only observed data. In particular, there are some simple examples where this assumption is guaranteed to hold. For instance, if $E(Z \mid R = 1, X, Y)$ is linear in Y, then this assumption holds immediately according to the discussions below the representer assumption. Motivated by this fact, we fit a linear regression of the shadow variable Z on the covariates X and outcome Y using observed data. We test for significance by performing a t-test for the coefficient of Y, and the corresponding p-value is approximately zero, which indicates that there is a statistically significant relationship between Y and Z. Residual plots in Figure S2 of the online supplementary materials show that the residuals are approximately randomly scattered, which one may interpret as evidence supporting the linearity assumption in this example. To further examine the linear structure in this multidimensional regression example, we also produce component-plus-residual plots, given in Figure S3 of the online supplementary materials. The points in these plots scatter approximately linearly around the fitted line, and no obvious non-linear patterns emerge. All these empirical results provide no compelling evidence of violation of a linear model for $E(Z \mid R = 1, X, Y)$, and hence the representer assumption seems reasonable in the real data example. Further discussions on the required assumptions in this example are given in the supplementary materials.
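The linearity check described above amounts to an ordinary least squares fit of Z on (X, Y) among respondents, followed by a t-statistic for the coefficient of Y. The sketch below illustrates this on simulated data, since the survey data are not reproduced here. It is not the authors' code: the helper names `solve` and `ols`, the data-generating values, and the use of a normal approximation for the two-sided p-value are all assumptions for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [v - factor * w for v, w in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, response):
    """Return OLS coefficients and their classical standard errors."""
    n, p = len(design), len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [y - sum(b * v for b, v in zip(beta, row)) for row, y in zip(design, response)]
    sigma2 = sum(e * e for e in resid) / (n - p)
    se = []
    for j in range(p):
        # The j-th diagonal entry of (X'X)^{-1}, obtained by solving against a unit vector.
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        se.append(math.sqrt(sigma2 * solve(xtx, unit)[j]))
    return beta, se


# Simulated stand-in for the survey data (all values hypothetical).
rng = random.Random(42)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
z = [0.5 + 0.3 * xi + 0.8 * yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]
# Response depends on Y (non-ignorable) but not on Z given (X, Y).
r = [rng.random() < 1 / (1 + math.exp(-(1.0 - 0.5 * yi))) for yi in y]

# Fit Z ~ 1 + X + Y using respondents only.
design = [[1.0, xi, yi] for xi, yi, ri in zip(x, y, r) if ri]
response = [zi for zi, ri in zip(z, r) if ri]
beta, se = ols(design, response)
t_stat = beta[2] / se[2]
p_value = math.erfc(abs(t_stat) / math.sqrt(2))  # two-sided, normal approximation
print(f"coef(Y)={beta[2]:.3f}  t={t_stat:.1f}  p={p_value:.2e}")
```

Because response is independent of Z given (X, Y) in this simulation, the respondents-only regression recovers the same conditional mean as the full data would, which is why the check can be run on observed data alone.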

We apply the proposed estimator REP-DB to estimate the outcome mean and the 95% confidence interval. We also use the competing estimator mnarIPW and two estimators assuming MAR (marREG and marIPW) for comparison. The results are shown in Table 3. We observe that the results from the proposed estimator are similar to those from the estimator mnarIPW, both yielding slightly lower estimates of home price on the log scale than those obtained from the standard MAR estimators. However, when the data are transformed back to the original scale, the deviations are notable and amount to approximately $1.13 \times 10^{4}$ RMB yuan. These analysis results are generally consistent with those in Miao et al. (2015).

Table 3.

Point estimates and 95% confidence intervals of the outcome mean for the home pricing example

Methods	Estimate	95% confidence interval
REP-DB	2.591	(2.520, 2.661)
mnarIPW	2.611	(2.544, 2.678)
marREG	2.714	(2.661, 2.766)
marIPW	2.715	(2.659, 2.772)


5 Discussion

With the aid of a shadow variable, we have established a necessary and sufficient condition for non-parametric identification of mean functionals of non-ignorable missing data even if the joint distribution is not identified. We then strengthen this condition by imposing a representer assumption that is shown to be necessary for $\sqrt{n}$-estimability of the mean functional. The assumption involves the existence of solutions to a representer equation, which is a Fredholm integral equation of the first kind and can be satisfied under mild requirements. Based on the representer equation, we propose a sieve-based estimator of the mean functional, which bypasses the difficulties of correctly specifying and estimating the unknown missingness mechanism and the outcome regression. Although the joint distribution is not identifiable, the proposed estimator is shown to be consistent for the mean functional. In addition, we establish conditions under which the proposed estimator is asymptotically normal. Since the solution to the representer equation is not uniquely determined, one cannot simply apply standard theories for non-parametric sieve estimators to derive the above asymptotic results. Instead, we first need to construct a consistent estimator of the solution set, and then find from the estimated set a consistent estimator of an appropriately chosen solution. We finally show that the proposed estimator attains the semi-parametric efficiency bound for the shadow variable model at a key submodel where the representer is uniquely identified.

The availability of a valid shadow variable is crucial for the proposed approach to adjust for non-ignorable missing data. An interesting feature of this assumption is that it has some observable implications and it can in principle be falsified with observed data, contrary to the usual ignorability assumption (D’Haultfœuille, 2010; Molenberghs et al., 2008). In particular, under the shadow variable assumption, the equation in $\pi(\cdot)$: $E\{R / \pi(X, Y) \mid X, Z\} = 1$ has a solution that is a positive probability, i.e., $\pi(\cdot) \in (0, 1]$, and hence this assumption can be rejected if such a positive solution does not exist. Although refutable, the validity of the shadow variable assumption generally requires domain-specific knowledge of experts and needs to be investigated on a case-by-case basis. The existence of a shadow variable is practically reasonable in the empirical example presented in this paper and similar situations where one or more proxies or surrogates of a variable prone to missing data may be available. In fact, it is not uncommon in survey studies and/or cohort studies in the health and social sciences, that certain outcomes may be sensitive and/or expensive to measure accurately, so that a gold standard measurement is obtained only for a select subset of the sample, while one or more proxies or surrogate measures may be available for the remaining sample. Instead of a standard measurement error model often used in such settings which requires stringent identifying conditions, the more flexible shadow variable approach proposed in this paper provides a more robust alternative to incorporate surrogate measurement in a non-parametric framework, under minimal identification conditions.

Besides, as advocated by Robins et al. (2000), in principle, one can also conduct sensitivity analysis to assess how results would change if the shadow variable assumption were violated by some pre-specified amount. For example, suppose that the parameter of interest μ is the population outcome mean. If the shadow variable assumption does not hold, then $E(Z \mid R, X, Y)$ should depend on R, e.g., $E(Z \mid R, X, Y) = \alpha_0 + \rho R + \alpha_1 X + \alpha_2 Y$. Under this model, the representer assumption holds with $\delta_0(X, Z) = (Z - \alpha_0 - \rho - \alpha_1 X) / \alpha_2$. Following the proof of Corollary 1, one can verify that $\mu = E\{RY + (1 - R)\delta_0(X, Z)\} + (\rho / \alpha_2) E(1 - R)$. To conduct sensitivity analysis, we first use the proposed approach to estimate $\delta_0(X, Z)$ and $\alpha_2$ based on observed data. Then for any given value of the sensitivity parameter $\rho$ ($\rho = 0$ when the shadow variable assumption holds), one can estimate μ using sample analogues of the above identification formula.
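The sensitivity analysis just described can be sketched numerically. Under the linear model above, a regression of Z on (X, Y) among respondents has intercept $\alpha_0 + \rho$ and slopes $\alpha_1$ and $\alpha_2$, so $\delta_0$ can be estimated without knowing $\rho$, and $\rho$ enters only through the correction term $(\rho / \alpha_2) E(1 - R)$. The code below is a simplified illustration, not the paper's procedure: it substitutes ordinary least squares for the sieve-based estimator, and the function names, the simulated data, and the grid of $\rho$ values are all hypothetical.

```python
import math
import random


def ols_coefficients(design, response):
    """OLS via the normal equations, solved by Gauss-Jordan elimination."""
    p = len(design[0])
    m = [[sum(row[i] * row[j] for row in design) for j in range(p)]
         + [sum(row[i] * v for row, v in zip(design, response))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(p):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [v - f * w for v, w in zip(m[k], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]


def sensitivity_curve(x, y, z, r, rho_grid):
    """Estimate mu for each rho in rho_grid; y is only read where r == 1."""
    obs = [i for i, ri in enumerate(r) if ri == 1]
    a, a1, a2 = ols_coefficients([[1.0, x[i], y[i]] for i in obs], [z[i] for i in obs])
    n = len(r)
    # The fitted intercept a estimates alpha_0 + rho, so delta_hat does not need rho.
    base = sum(y[i] if r[i] == 1 else (z[i] - a - a1 * x[i]) / a2 for i in range(n)) / n
    miss_rate = 1.0 - len(obs) / n
    return {rho: base + rho / a2 * miss_rate for rho in rho_grid}


# Simulated data in which the shadow variable assumption fails by rho_true.
rng = random.Random(7)
n, rho_true = 4000, 0.4
x = [rng.gauss(0, 1) for _ in range(n)]
y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
r = [1 if rng.random() < 1 / (1 + math.exp(-(1.0 - 0.5 * yi))) else 0 for yi in y]
z = [0.5 + rho_true * ri + 0.3 * xi + 0.8 * yi + rng.gauss(0, 1)
     for ri, xi, yi in zip(r, x, y)]

curve = sensitivity_curve(x, y, z, r, [0.0, 0.2, 0.4, 0.6])
full_mean = sum(y) / n  # available only because the data are simulated
for rho, mu_hat in curve.items():
    print(f"rho={rho:.1f}  mu_hat={mu_hat:.3f}  (full-data mean {full_mean:.3f})")
```

In practice $\rho$ is not identified from the observed data, so the curve is read as a range of estimates over plausible violations rather than as a single answer.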

This paper may be improved or extended in several directions. Firstly, the proposed identification and estimation framework may be extended to handle non-ignorable missing outcome regression or missing covariate problems. Secondly, one can use modern machine learning techniques to solve the representer equation so that an improved estimator may be achieved that adapts to sparsity structures in the data. Thirdly, it is of great interest to extend our results to handling other problems of coarsened data, for instance, unmeasured confounding problems in causal inference. Proximal causal inference has been recently developed to adjust for unmeasured confounders by using two suitable negative control variables (Cui et al., 2020; Miao et al., 2018), which is different from the current non-ignorable missing setting that requires only a valid shadow variable. Since the identification of causal effects in negative control literature relies on certain confounding bridge functions that are similar to the representer equation, the proposed approach is a promising tool for relaxing the requirement of unique solutions in proximal causal inference. We plan to pursue these and other related issues in future research.

Acknowledgments

We would like to thank the editor, associate editor, and two anonymous reviewers for their very insightful and helpful comments, which led to a significant improvement of our paper.

Funding

W.L.'s and W.M.'s research is supported by the National Natural Science Foundation of China (12101607, 12071015), the National Key R&D Program of China (2022YFA1008100), Beijing Natural Science Foundation (1232008), and the National Statistical Science Research Project (2022LZ13).

Data availability

The R code for the implementation of this paper and the real data set are available at https://gitlab.com/weylpku/nonignorablemissingcode.

Supplementary material

Supplementary material is available online at Journal of the Royal Statistical Society: Series B.

References

Ai C., & Chen X. (2003). Efficient estimation of models with conditional moment restrictions containing unknown functions. Econometrica, 71(6), 1795–1843. https://doi.org/10.1111/1468-0262.00470

Canay I. A., Santos A., & Shaikh A. M. (2013). On the testability of identification in some nonparametric models with endogeneity. Econometrica, 81(6), 2535–2559. https://doi.org/10.3982/ECTA10851

Carrasco M., Florens J.-P., & Renault E. (2007). Linear inverse problems in structural econometrics estimation based on spectral decomposition and regularization. Handbook of Econometrics, 6(Part B), 5633–5751. https://doi.org/10.1016/S1573-4412(07)06077-1

Chen X. (2007). Large sample sieve estimation of semi-nonparametric models. Handbook of Econometrics, 6(Part B), 5549–5632. https://doi.org/10.1016/S1573-4412(07)06076-X

Chen X., & Pouzo D. (2012). Estimation of nonparametric conditional moment models with possibly nonsmooth generalized residuals. Econometrica, 80(1), 277–321. https://doi.org/10.3982/ECTA7888

Chernozhukov V., Hong H., & Tamer E. (2007). Estimation and confidence regions for parameter sets in econometric models. Econometrica, 75(5), 1243–1284. https://doi.org/10.1111/j.1468-0262.2007.00794.x

Cui Y., Pu H., Shi X., Miao W., & Tchetgen Tchetgen E. J. (2020). ‘Semiparametric proximal causal inference’, arXiv, arXiv:2011.08411, preprint: not peer reviewed.

Darolles S., Fan Y., Florens J.-P., & Renault E. (2011). Nonparametric instrumental regression. Econometrica, 79(5), 1541–1565. https://doi.org/10.3982/ECTA6539

Das M., Newey W. K., & Vella F. (2003). Nonparametric estimation of sample selection models. The Review of Economic Studies, 70(1), 33–58. https://doi.org/10.1111/1467-937X.00236

D’Haultfœuille X. (2010). A new instrumental method for dealing with endogenous selection. Journal of Econometrics, 154(1), 1–15. https://doi.org/10.1016/j.jeconom.2009.06.005

D’Haultfœuille X. (2011). On the completeness condition in nonparametric instrumental problems. Econometric Theory, 27(3), 460–471. https://doi.org/10.1017/S0266466610000368

Greenlees J. S., Reece W. S., & Zieschang K. D. (1982). Imputation of missing values when the probability of response depends on the variable being imputed. Journal of the American Statistical Association, 77(378), 251–261. https://doi.org/10.1080/01621459.1982.10477793

Heckman J. J. (1979). Sample selection bias as a specification error. Econometrica, 47(1), 153–161. https://doi.org/10.2307/1912352

Horowitz J. L. (2009). Semiparametric and nonparametric methods in econometrics (Vol. 12). Springer.

Huang J. Z. (2003). Local asymptotics for polynomial spline regression. Annals of Statistics, 31(5), 1600–1635. https://doi.org/10.1214/aos/1065705120

Ibrahim J. G., Lipsitz S. R., & Chen M.-H. (1999). Missing covariates in generalized linear models when the missing data mechanism is non-ignorable. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(1), 173–190. https://doi.org/10.1111/1467-9868.00170

Ibrahim J. G., Lipsitz S. R., & Horton N. (2001). Using auxiliary data for parameter estimation with non-ignorably missing outcomes. Journal of the Royal Statistical Society: Series C (Applied Statistics), 50(3), 361–373. https://doi.org/10.1111/1467-9876.00240

Kennedy E. H., Ma Z., McHugh M. D., & Small D. S. (2017). Non-parametric methods for doubly robust estimation of continuous treatment effects. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(4), 1229–1245. https://doi.org/10.1111/rssb.12212

Kim J. K., & Yu C. L. (2011). A semiparametric estimation of mean functionals with nonignorable missing data. Journal of the American Statistical Association, 106(493), 157–165. https://doi.org/10.1198/jasa.2011.tm10104

Kosorok M. R. (2008). Introduction to empirical processes and semiparametric inference. Springer.

Kott P. S. (2014). Calibration weighting when model and calibration variables can differ. In F. Mecatti, L. P. Conti, & G. M. Ranalli (Eds.), Contributions to sampling statistics (pp. 1–18). Springer.

Li Q., & Racine J. S. (2007). Nonparametric econometrics: Theory and practice. Princeton University Press.

Little R. J., & Rubin D. B. (2002). Statistical analysis with missing data. Wiley-Interscience.

Liu L., Miao W., Sun B., Robins J., & Tchetgen Tchetgen E. J. (2020). Identification and inference for marginal average treatment effect on the treated with an instrumental variable. Statistica Sinica, 30(3), 1517–1541. https://doi.org/10.5705/ss.202017.0196

Miao W., Ding P., & Geng Z. (2016). Identifiability of normal and normal mixture models with nonignorable missing data. Journal of the American Statistical Association, 111(516), 1673–1683. https://doi.org/10.1080/01621459.2015.1105808

Miao W., Geng Z., & Tchetgen Tchetgen E. J. (2018). Identifying causal effects with proxy variables of an unmeasured confounder. Biometrika, 105(4), 987–993. https://doi.org/10.1093/biomet/asy038

Miao W., Liu L., Tchetgen Tchetgen E. J., & Geng Z. (2015). ‘Identification, doubly robust estimation, and semiparametric efficiency theory of nonignorable missing data with a shadow variable’, arXiv, arXiv:1509.02556, preprint: not peer reviewed.

Miao W., & Tchetgen Tchetgen E. J. (2016). On varieties of doubly robust estimators under missingness not at random with a shadow variable. Biometrika, 103(2), 475–482. https://doi.org/10.1093/biomet/asw016

Molenberghs G., Beunckens C., Sotto C., & Kenward M. G. (2008). Every missingness not at random model has a missingness at random counterpart with equal fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(2), 371–388. https://doi.org/10.1111/j.1467-9868.2007.00640.x

Morikawa K., & Kim J. K. (2021). Semiparametric optimal estimation with nonignorable nonresponse data. The Annals of Statistics, 49(5), 2991–3014. https://doi.org/10.1214/21-AOS2070

Newey W. K. (1997). Convergence rates and asymptotic normality for series estimators. Journal of Econometrics, 79(1), 147–168. https://doi.org/10.1016/S0304-4076(97)00011-0

Newey W. K., & Powell J. L. (2003). Instrumental variable estimation of nonparametric models. Econometrica, 71(5), 1565–1578. https://doi.org/10.1111/1468-0262.00459

Qin J., Leung D., & Shao J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193–200. https://doi.org/10.1198/016214502753479338

Robins J. M., Rotnitzky A., & Scharfstein D. O. (2000). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In M. E. Halloran & D. Berry (Eds.), Statistical models in epidemiology, the environment, and clinical trials (pp. 1–94). Springer.

Robins J. M., Rotnitzky A., & Zhao L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427), 846–866. https://doi.org/10.1080/01621459.1994.10476818

Rotnitzky A., & Robins J. (1997). Analysis of semi-parametric regression models with non-ignorable non-response. Statistics in Medicine, 16(1), 81–102. https://doi.org/10.1002/(SICI)1097-0258(19970115)16:1<81::AID-SIM473>3.0.CO;2-0

Rotnitzky A., Robins J. M., & Scharfstein D. O. (1998). Semiparametric regression for repeated outcomes with nonignorable nonresponse. Journal of the American Statistical Association, 93(444), 1321–1339. https://doi.org/10.1080/01621459.1998.10473795

Rubin D. B. (1976). Inference and missing data (with discussion). Biometrika, 63(3), 581–592. https://doi.org/10.1093/biomet/63.3.581

Santos A. (2011). Instrumental variable methods for recovering continuous linear functionals. Journal of Econometrics, 161(2), 129–146. https://doi.org/10.1016/j.jeconom.2010.11.014

Scharfstein D. O., Rotnitzky A., & Robins J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association, 94(448), 1096–1120. https://doi.org/10.1080/01621459.1999.10473862

Severini T. A., & Tripathi G. (2012). Efficiency bounds for estimating linear functionals of nonparametric regression models with endogenous regressors. Journal of Econometrics, 170(2), 491–498. https://doi.org/10.1016/j.jeconom.2012.05.018

Shao J., & Wang L. (2016). Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika, 103(1), 175–187. https://doi.org/10.1093/biomet/asv071

Sun B., Liu L., Miao W., Wirth K., Robins J., & Tchetgen Tchetgen E. J. (2018). Semiparametric estimation with data missing not at random using an instrumental variable. Statistica Sinica, 28(4), 1965–1983. https://doi.org/10.5705/ss.202016.0324

Tang G., Little R. J., & Raghunathan T. E. (2003). Analysis of multivariate missing data with nonignorable nonresponse. Biometrika, 90(4), 747–764. https://doi.org/10.1093/biomet/90.4.747

Tang N., Zhao P., & Zhu H. (2014). Empirical likelihood for estimating equations with nonignorably missing data. Statistica Sinica, 24(2), 723–747. https://doi.org/10.5705/ss.2012.254

Tchetgen Tchetgen E. J., & Wirth K. E. (2017). A general instrumental variable framework for regression analysis with outcome missing not at random. Biometrics, 73(4), 1123–1131. https://doi.org/10.1111/biom.12670

Vansteelandt S., Rotnitzky A., & Robins J. (2007). Estimation of regression models for the mean of repeated outcomes under nonignorable nonmonotone nonresponse. Biometrika, 94(4), 841–860. https://doi.org/10.1093/biomet/asm070

Wang S., Shao J., & Kim J. K. (2014). An instrumental variable approach for identification and estimation with nonignorable nonresponse. Statistica Sinica, 24(3), 1097–1116. https://doi.org/10.5705/ss.2012.074

Zahner G. E., Pawelkiewicz W., DeFrancesco J. J., & Adnopoz J. (1992). Children’s mental health service needs and utilization patterns in an urban community: An epidemiological assessment. Journal of the American Academy of Child & Adolescent Psychiatry, 31(5), 951–960. https://doi.org/10.1097/00004583-199209000-00025

Zhao J., & Shao J. (2015). Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. Journal of the American Statistical Association, 110(512), 1577–1590. https://doi.org/10.1080/01621459.2014.983234