Abstract

Background

Mendelian randomization (MR) is widely used to unravel causal relationships in epidemiological studies. Whereas multiple MR methods have been developed to control for bias due to horizontal pleiotropy, their performance in the presence of other sources of bias, like non-random mating, has been mostly evaluated using simulated data. Empirical comparisons of MR estimators in such scenarios have yet to be conducted. Pleiotropy and non-random mating have been shown to account equally for the genetic correlation between height and educational attainment. Previous studies probing the causal nature of this association have produced conflicting results.

Methods

We estimated the causal effect of height on educational attainment in various MR models, including the MR-Egger and the MR-Direction of Causation (MR-DoC) models that correct for, or explicitly model, horizontal pleiotropy.

Results

We reproduced the weak but positive association between height and education in the Netherlands Twin Register sample (P= 3.9 × 10–6). All MR analyses suggested that height has a robust, albeit small, causal effect on education. We showed via simulations that potential assortment for height and education had no effect on the causal parameter in the MR-DoC model. With the pleiotropic effect freely estimated, MR-DoC yielded a null finding.

Conclusions

Non-random mating may have a bearing on the results of MR studies based on unrelated individuals. Family data enable tests of causal relationships to be conducted more rigorously, and are recommended to triangulate results of MR studies assessing pairs of traits leading to non-random mate selection.

Key Messages
  • We compare empirically the performance of various Mendelian randomization (MR) models in the presence of non-random/assortative mating.

  • We show that non-random mating may generate data that are consistent with a causal relationship between height and educational attainment, and may have a bearing on the results of MR studies.

  • MR models based on family data allow tests of causal hypotheses to be conducted in the presence of non-random mating (and other confounders such as dynastic effects). Summary statistics-based MR methods might be confounded in such scenarios.

Introduction

Causal inference based on observational data is challenging because of genetic and environmental confounding. Mendelian randomization (MR) has tremendous potential in addressing causality in the presence of confounding, and currently enjoys considerable methodological and substantive interest (for example Hemani et al.1, Verbanck et al.2). MR employs genetic variants associated with a given exposure as randomization instruments to detect the causal effect of the exposure on disease/outcome in non-experimental settings.3,4 Specifically, MR exploits the fact that the genetic variants’ meiosis is random with respect to environmental and social confounders,5,6 and they are not modified by disease outcomes (i.e. are not subject to reverse causality). The MR design yields information about causality, under the assumptions that the (genetic) instrument: (i) is well-associated with the exposure; (ii) is independent of confounders; and (iii) affects the outcome exclusively via the exposure variable (i.e. there is no horizontal pleiotropy). Although assumptions (i) and (ii) are likely to be tenable,7,8 assumption (iii) is unlikely to hold on account of the ubiquity of pleiotropy.9–12 As a result, multiple MR methods have been developed to control for bias due to horizontal pleiotropy.8,13–17

Yet another source of bias that may have a bearing on the results of MR studies is assortative mating18 (i.e. ‘the tendency of like to marry like’). MR approaches assume that the genetic variants used to instrument the analysis are randomly distributed in the population. This assumption draws heavily on random mating,8 the absence of inbreeding and the absence of selection on the genetic instrument. There is extensive evidence for assortative mating, specifically for height and education (for example Courtiol et al.,19 Hugh-Jones et al.,20 Yengo et al.21) as well as cross-trait,22 but also within (and between) psychiatric disorders like substance use.23,24 Recently, Hartwig and colleagues18 showed by means of simulations that assortative mating may bias the causal estimate in two-sample methods that are robust to pleiotropy, like MR-Egger or inverse variance weighted regression. Empirical comparisons of MR estimators in such scenarios have yet to be conducted.

Here we assess the performance of several methods robust to horizontal pleiotropy in the presence of assortative mating. Pleiotropy and assortative mating have been shown to account equally for the genetic correlation between height and educational attainment.22 We therefore consider these traits for our illustrative analyses. Empirical evidence has shown that height predicts attained education at different ages throughout the lifetime.25–28 There are several explanations for the height-educational attainment association.

The first is that stature has a causal effect on education.22,29–35 The reverse hypothesis that educational attainment causally influences height is arguably too implausible to merit consideration (but see Supplementary material, available as Supplementary data at IJE online, for a sensitivity check). The third ‘underlying factor’ hypothesis is that the association is due to common factors influencing both traits. Twin studies22,32 and studies in unrelated individuals which used measured genetic variants to disentangle the sources of this correlation33 showed that the correlation between the two traits is entirely explained by shared genetic factors.22,32,33 This genetic correlation can arise from vertical pleiotropy (i.e. the genetic effects on height are manifest in educational attainment due to the direct causal effect of height on education). The correlation may also be due to horizontal pleiotropy, i.e. due to overlap in the set of genes affecting brain size, intelligence and height, given the significant bivariate correlations observed between these traits (note that both intelligence and height are genetically correlated with brain volume36–41; see also Figure S1, available as Supplementary data at IJE online).

Alternatively, cross-trait assortative mating,18,22 dynastic effects and population structure42 may also induce genetic correlation between these traits. We consider these competing explanations in several MR models including MR-Egger14 and MR-direction of causation (MR-DoC) models16,17 that correct for, or explicitly model, horizontal pleiotropy. We also use simulations to assess the MR-DoC model’s robustness in the presence of assortative mating and finally, we employ the latent causal variable (LCV) model43 to account for potential bias arising from participant overlap44 in the two-sample MR approaches.

Methods

Ethical approval

The NTR study was approved by the Central Ethics Committee on Research Involving Human Subjects of the VU University Medical Centre, Amsterdam, an Institutional Review Board certified by the US Office of Human Research Protections (IRB number IRB-2991 under Federal-wide Assurance-3703; IRB/institute codes, NTR 03–180). All subjects provided written informed consent.

Sample

We evaluated the association of height and educational attainment (EA) in a Dutch sample from the Netherlands Twin Register (NTR).45,46 For the phenotypic association analysis, we selected 5899 Dutch adults (of whom 3184 were twins) with information on height and EA, relevant covariates (i.e. year of birth and sex) and observed genotypic data. EA was measured using a seven-category variable (1 = primary school only; 2 = lower educational schooling; 3 = lower vocational schooling; 4 = intermediate vocational schooling; 5 = intermediate/higher secondary schooling; 6 = higher vocational schooling; and 7 = university). The availability of repeated measures allowed us to run a series of consistency checks (see Supplementary material, available as Supplementary data at IJE online). Given the choice of the scheme used to code EA (1 through 7), and the fact that the underlying assumption of equivalent distances between adjacent EA categories is unlikely to hold in the case of ordinal data,47 we converted the seven-category variable to sex- and birth-cohort specific ridit scores47,48 as described in van Dongen et al.49 The resulting scores reflect males’ and females’ hierarchy of attained education in accordance with the societal values and historical conditions that define a given birth cohort. In short, we calculated the sex- and cohort-specific ridit scores for each EA category j by summing the proportion of cases below that category and one half the proportion of cases in the category47,48  Supplementary Table S1, available as Supplementary data at IJE online, shows descriptive statistics for the sample used in the analyses. Genome-wide single nucleotide polymorphisms (SNP) data were collected in the NTR sample on various genotyping platforms.45 Missing data were cross-platform imputed with the Genome of the Netherlands as the reference set.50–52 MaCH-Admix53 was used for phasing and imputation. We refer to the Supplementary material, available as Supplementary data at IJE online, for details on quality control checks performed before and after imputation.

Statistical analyses

Phenotypic association

Education was regressed on height and on the following covariates: age, sex and birth cohort (see Supplementary material, available as Supplementary data at IJE online for details). To correct for the effect of family clustering, we employed a generalized estimation equation linear model54 with an exchangeable model for the background familial covariance matrix and robust standard errors.55–57

Causality testing: one- and two-sample MR

We tested the causal effect of height on education in one- and two-sample MR models. The one-sample MR was based on the two-stage least squares (TSLS) procedure (but standard MR can also be performed using the Wald estimator) and on the MR-DoC model.17 As an instrumental variable we used a polygenic score, i.e. a weighed linear combination of SNPs. The weights used in calculating the polygenic score were obtained from the largest to date GWAS of height58 (excluding the NTR participants, see Supplementary material, available as Supplementary data at IJE online). Causality was tested in a subset of the sample comprising 3184 twins born between 1926 and 1989. The covariates used in the causality analysis were the dummy coded birth cohorts, age, sex, and 10 principal components (to correct for potential Dutch population structure). To increase the ethnic homogeneity of the sample, we removed ethnic outliers (see59 and Supplementary material, available as Supplementary data at IJE online).

The two-sample MR was based on MR-Egger,14 the inverse variance weighted approach, mode-60 and median-based MR methods.13,15 We assessed the suitability of summary data for conducting MR-Egger using the I2GX statistics.61 We also employed heterogeneity tests and Rucker’s framework62 to further assess pleiotropy and for model selection. To account for potential sample overlap that may bias the MR results, we used the latent causal variable (LCV) model43 (see Supplementary material, available as Supplementary data at IJE online). The two-sample MR used summary statistics for the SNPs available for both traits, downloaded from Wood et al.58 and from Okbay et al.,41 respectively. We refer to [https://gitlab.com/camish/empiricalcomparisons] for the R-script used in the analyses and to the Supplementary material, available as Supplementary data at IJE online, for more details on the methods employed.

Simulation: checking the robustness of MR-DoC in the presence of assortative mating

To evaluate the effect of assortative mating in the MR-DoC model, we created genotypes based on 150 diallelic loci (all minor allele frequencies = 0.5) in 50 000 parent pairs. Of the 150 loci, 50 affected height, 50 affected education and 50 had pleiotropic effects (i.e. affecting directly both height and EA). All 150 loci were used to compute polygenic scores, which were subsequently sorted ascendingly and used to divide the samples of mothers and fathers into five quintiles. Mate selection was contingent upon genetic similarity: males and females with the same quantile score were allowed to form parent pairs. To control the degree of assortative mating, we used a mover-stayer transition probability matrix to allow the males and females to switch to any other quintiles; the probability of switching to any other quantile equalled 0.05. Based on the resulting matings, we created two offspring per parent pair. The offspring were subsequently randomly assigned to pairs and the assortative mating procedure was repeated. After 10 generations, the familial (co)variances reached the expected equilibrium values63 such that parent-parent genetic correlations equalled about 0.4 and the sibling correlations equaled about 0.7. Half of the resulting sibling pairs were used to create MZ twin pairs, by duplicating the first genotype. Phenotypic data were next created based on the genotypes and environmental variation. Polygenic scores computed using the 20 (horizontally) pleiotropic loci were used as the genetic instrument in the MR-DoC model.

Results

The observational regression analysis in the NTR sample showed that height is associated with educational attainment (Wald χ2(1) = 21.3; P = 3.9 × 10–6; R2 = ∼0.3%). TSLS indicated that height has a causal effect on educational attainment (bcausal = 0.004; standard error (SE) = 0.001; Wald χ2(1) =6.7; P  = 0.0096). Table 1 includes the results obtained using two-sample MR methods. The I2GX statistic equalled 0.84—a value close to the recommended value of 1—providing no evidence of regression dilution, and supporting the use of MR-Egger. Consistent with the TSLS and with the other pleiotropy-robust methods results (see Table 1), MR-Egger implied a causal effect of height on EA. The MR-Egger intercept provided no evidence of detectable directional pleiotropy (resulting in the mean value of the pleiotropy distribution deviating from zero). In line with the MR-Egger intercept result, the funnel plot suggested the ‘no directional pleiotropy’ assumption is likely to be satisfied (Figure 1).

Funnel plot for the height - educational attainment Mendelian Randomization analysis based on summary statistics. The dots represent the instrumental variable effects estimated based on each single nucleotide polymorphism (βIV), plotted against their precision (the standard error of the effect estimate, SEIV). The vertical solid gray line represents the MR-Egger intercept, the vertical gray dashed line represents the inverse variance weighted estimate. The plot resembles a symmetrical funnel suggesting there is no evidence for directional pleiotropy.
Figure 1

Funnel plot for the height - educational attainment Mendelian Randomization analysis based on summary statistics. The dots represent the instrumental variable effects estimated based on each single nucleotide polymorphism (βIV), plotted against their precision (the standard error of the effect estimate, SEIV). The vertical solid gray line represents the MR-Egger intercept, the vertical gray dashed line represents the inverse variance weighted estimate. The plot resembles a symmetrical funnel suggesting there is no evidence for directional pleiotropy.

Table 1.

Two-sample MR results of the height-educational attainment analysis

Analysis methodEstimateSE95% CI-lower95% CI-upperP-value
IVW fixed effects0.040.0020.0360.0433 × 10−53
IVW random effects0.040.00470.0300.0492 × 10−17
MR-Egger regression fixed effects: slope0.030.0060.0180.0419 × 10−6
intercept0.00030.0001−0.000060.00060.10
MR-Egger regression random effects: slope0.030.0120.0060.0531.8 × 10−02
intercept0.00030.0003−0.00040.0010.39
Simple median0.0430.0050.0330.0522 × 10−13
Weighted median0.0370.0050.0270.0461 × 10−9
Penalized weighted median0.0330.0050.0230.0422 × 10−8
Simple mode0.0580.0270.0050.1102 × 10−2
Weighted mode0.0660.0230.0200.1111.8 × 10−3
Analysis methodEstimateSE95% CI-lower95% CI-upperP-value
IVW fixed effects0.040.0020.0360.0433 × 10−53
IVW random effects0.040.00470.0300.0492 × 10−17
MR-Egger regression fixed effects: slope0.030.0060.0180.0419 × 10−6
intercept0.00030.0001−0.000060.00060.10
MR-Egger regression random effects: slope0.030.0120.0060.0531.8 × 10−02
intercept0.00030.0003−0.00040.0010.39
Simple median0.0430.0050.0330.0522 × 10−13
Weighted median0.0370.0050.0270.0461 × 10−9
Penalized weighted median0.0330.0050.0230.0422 × 10−8
Simple mode0.0580.0270.0050.1102 × 10−2
Weighted mode0.0660.0230.0200.1111.8 × 10−3

CI, Confidence intervals; SE, standard error; IVW, inverse variance weighted model; MR, Mendelian randomization.

Table 1.

Two-sample MR results of the height-educational attainment analysis

Analysis methodEstimateSE95% CI-lower95% CI-upperP-value
IVW fixed effects0.040.0020.0360.0433 × 10−53
IVW random effects0.040.00470.0300.0492 × 10−17
MR-Egger regression fixed effects: slope0.030.0060.0180.0419 × 10−6
intercept0.00030.0001−0.000060.00060.10
MR-Egger regression random effects: slope0.030.0120.0060.0531.8 × 10−02
intercept0.00030.0003−0.00040.0010.39
Simple median0.0430.0050.0330.0522 × 10−13
Weighted median0.0370.0050.0270.0461 × 10−9
Penalized weighted median0.0330.0050.0230.0422 × 10−8
Simple mode0.0580.0270.0050.1102 × 10−2
Weighted mode0.0660.0230.0200.1111.8 × 10−3
Analysis methodEstimateSE95% CI-lower95% CI-upperP-value
IVW fixed effects0.040.0020.0360.0433 × 10−53
IVW random effects0.040.00470.0300.0492 × 10−17
MR-Egger regression fixed effects: slope0.030.0060.0180.0419 × 10−6
intercept0.00030.0001−0.000060.00060.10
MR-Egger regression random effects: slope0.030.0120.0060.0531.8 × 10−02
intercept0.00030.0003−0.00040.0010.39
Simple median0.0430.0050.0330.0522 × 10−13
Weighted median0.0370.0050.0270.0461 × 10−9
Penalized weighted median0.0330.0050.0230.0422 × 10−8
Simple mode0.0580.0270.0050.1102 × 10−2
Weighted mode0.0660.0230.0200.1111.8 × 10−3

CI, Confidence intervals; SE, standard error; IVW, inverse variance weighted model; MR, Mendelian randomization.

The Cochran’s Q test of heterogeneity64 detected heterogeneity (P <10–4). Here we assume this is due to balanced pleiotropy (i.e. the pleiotropy distribution has a zero mean); yet we note that heterogeneity may arise due to factors other than pleiotropy.62 The Rucker’s model selection framework (see Supplementary material, available as Supplementary data at IJE online, for details) showed there is no detectable directional pleiotropy, and indicated that an inverse variance weighted (IVW) random effects model adequately accommodated the heterogeneity present in the data (Qdiff(1) = 1.4; P = 0.2; where diff stands for the difference between the Cochran’s Q statistic—that tests for heterogeneity about the IVW model, and Rucker’s Q’ statistic—that tests for heterogeneity about the MR-Egger model). Consistent with these results, the MR-Egger intercept indicates that directional pleiotropy does not bias the estimate, and the Steiger test65 (which removes SNPs with potential reverse-causal effects) supports the conclusion that causation flows from height to EA. It can be seen from Figure 2 that all approaches based on summary statistics yielded consistent results.

Scatterplot of the effect sizes on height and educational attainment of each of the 931 independent single nucleotide polymorphisms. Effect sizes are extracted from the genome-wide association study (GWAS) of height and from the GWAS of educational attainment.
Figure 2

Scatterplot of the effect sizes on height and educational attainment of each of the 931 independent single nucleotide polymorphisms. Effect sizes are extracted from the genome-wide association study (GWAS) of height and from the GWAS of educational attainment.

The LCV model (see Supplementary material, available as Supplementary data at IJE online, for details), which is robust to sample overlap, found a genetic correlation of ρg = 0.15, and a genetic causality parameter GCP = 0.27 (SE = 0.08; z-score = 2.93; one-tailed P = 0.002), but did not reach the GCP value of at least 0.6 to claim robust positive finding.

MR-DoC: simulation and empirical results

Simulation results showed that non-random mate selection had no detectable effect on the causal effect estimate in the MR-DoC model when the pleiotropic effect is freely estimated (i.e. parameter b2 in Figure S2, available as Supplementary data at IJE online). The estimate of the causal effect (g1 in Figure S2, available as Supplementary data at IJE online) was close to its true value of zero (the statistical test of the hypothesis g1 = 0 was not rejected given alpha = 0.05). Interestingly, in the MR-DoC model with a non-pleiotropic polygenic score computed based on the 20 loci unique to height, non-random mating resulted in a pleiotropic effect (the statistical test of parameter b2 = 0 was rejected given alpha = 0.05). This suggests that assortative mating is accommodated by the b2 parameter.

The MR-DoC model17 explicitly estimates the amount of horizontal pleiotropy, but has to assume, for reasons of identification, that the unique environmental component influencing the exposure does not directly influence the outcome (other than through its effect on the exposure, i.e. re =0, see Figure S2, available as Supplementary data at IJE online). With the parameter re constrained to 0, we tested the causal effect of height on EA while explicitly modelling horizontal pleiotropy (estimating the direct effect of the instrumental variable on the outcome). MR-DoC yielded a null finding: the causal effect was not different from zero at alpha = 0.05 (g1 = 0.0019; SE = 0.002; standardized g1 = 0.017; SE = 0.025; likelihood ratio test χ2(1) = 0.46; P = 0.49). When we assumed there is no horizontal pleiotropy (the parameter b2 in Figure S1, available as Supplementary data at IJE online, was fixed to zero), we detected a causal effect of height on EA (g1 = 0.004; SE = 0.001; standardized g1 = 0.039; SE = 0.015; χ2(1) = 6.9, P = 0.0085). The test of the pleiotropic effect failed to reject the null hypothesis of ‘no pleiotropy’ (b2 = 0.01; SE = 0.01; χ2(1) = 1.08; P = 0.29). Constraining b2 to zero and testing whether re =0 suggested that the assumption is tenable (χ2(1) = 1.08; P = 0.29). However, the null findings concerning absence of non-shared environmental confounding and absence of directional pleiotropy may also reflect a lack of power (see Supplementary material, available as Supplementary data at IJE online). Importantly, constraining the parameter b2 to zero is not recommended in the light of ubiquity of pleiotropy across the human genome, and given that this parameter also accommodates potential non-random mating effects.

Discussion

We assessed the performance of several MR approaches in empirical data, in the presence of non-random mating. Recently, Hartwig and colleagues18 showed by means of simulations that non-random mating may bias the two-sample MR results based on summary statistics. Building on this work, the current study presented an empirical comparison of various one- and two-sample MR models when addressing the causal nature of the relationship between height and education (two traits extensively documented to lead to non-random mate selection). Previous studies addressing this relationship produced conflicting results. Tyrell et al.29 suggested that the observational association between height and educational attainment (indexed by a continuous variable representing time spent in full-time education) is causal in nature. Using MR-Egger regression and a TSLS model in the UK Biobank sample, they showed that taller individuals complete a larger number of years of formal education. Conversely, based on the IVW MR,15 Hagenaars et al.66 found no support for a causal effect flowing from height to education (indexed by a four-category variable representing degree level in the UK biobank). However, the authors noted that phenotype definition (i.e. using as the exposure a four-category variable instead of a continuous one as Tyrell et al. did), in conjunction with the use of weak genetic instruments may explain their null findings.66

Assuming all SNPs meet the three standard instrumental variable assumptions (see Introduction), the current one-sample TSLS results implied that height has a small positive causal effect on EA. This conclusion was reinforced by several alternative MR approaches robust to certain forms of pleiotropy, like IVW, median- and mode-based approaches and MR-Egger. Whereas the mode- and median-based approaches can each handle certain proportions of pleiotropic SNPs, MR-Egger explicitly tests for directional pleiotropy. The MR-Egger test showed there is no detectable directional pleiotropy. It is, however, important to note that the power of the MR-Egger test to detect pleiotropy may be low (see Table 1 in Bowden et al.14). We used heterogeneity tests as additional means to assess pleiotropy. Although pleiotropy might be present, we found that this is likely to be balanced, and may be adequately modelled by a random effects IVW model.

Sample overlap may affect the conclusions of summary statistics-based MR approaches, with large overlap biasing the causal effect estimate towards the observed association (because of weak instruments). The LCV model is robust to sample overlap and to other factors that may bias two-sample MR methods, like heritable confounding, differences in polygenicity or unequal sample sizes (see O’Connor and Price43 for details). This model provided weak evidence for the hypothesis that height affects education causally (genetic causality parameter GCP = 0.27; P = 0.002). Height and college completion are two of the traits studied by O'Connor and Price in their empirical applications in the UK Biobank.43 They found a genetic correlation between these traits of ρg = 0.17 and a GCP of 0.33. The positive GCP value lends some support to the hypothesis of height affecting education, but these values are rather too small to claim a robust positive finding. O’Connor and Price recommended absolute GCPs values larger than 0.6 as unlikely to be false-positives, and concluded that this low GCP value likely reflects shared developmental pathways rather than a direct causal effect. Further consideration of the GCP statistic supports this recommendation: a heritable latent confounder affecting height with a path coefficient of 0.55 and education with a path coefficient of 0.3 could yield a genetic correlation of ρg = 0.17 and a partial genetic causality parameter of GCP = 0.33. If the former coefficient were higher (say, at least 0.8), then one could reasonably argue that, although height is not literally a cause of education, it is nevertheless a very good proxy for some other variable that is a true cause.

The MR-DoC model is also useful when the confounding is predominantly genetic. The MR-DoC model has not detected a causal effect of height on EA (χ2(1) = 0.46; P = 0.49), unless we assumed directional pleiotropy was absent. The test of the pleiotropic effect could not reject the null hypothesis that the effect equals zero—a result consistent with the MR-Egger test result. The test of the parameter b2 suggests that the effect resulting from non-random mating is probably too weak to be picked up by the current sample. Notwithstanding this result, we argue against fixing this parameter to zero given the simulation results reported in Minică et al.16 (which showed that the estimate of the causal effect g1 is unbiased provided the pleiotropic effect b2 is freely estimated), and given the current simulation results (which showed assortative mating is accommodated by the b2 parameter). Estimating the b2 parameter might reduce power (if, indeed, pleiotropy is absent), but we rather err on the conservative side, given the ubiquity of pleiotropy across the human genome and given that non-random mating may presumably bias the causal effect if we do not model the directional pleiotropy path.

The usefulness of family data in addressing the non-random mating assumption was also demonstrated by Hartwig and colleagues.18 They showed, using data on trios, that height had no causal effect on education when correcting for possible effects of non-random mating (by using TSLS and adding the parental polygenic scores). However, as the authors acknowledge, it is worth noting that the small sample employed in this analysis may have resulted in imprecise estimates. The structural equation model developed by Warrington et al.,67 to estimate jointly maternal and fetal effects on birthweight, may also be extended to provide another means to address the potential bias induced by non-random mating.

Family data can additionally address bias in MR studies arising from dynastic/genetic nurture effects that may also confound the association between height and education.18,42,68 For instance, when available, trios data enable causality tests robust to dynastic effects;18,68 while providing a straightforward means to address dynastic effects and non-random mating; this approach requires trios data and needs to assume absence of horizontal pleiotropy. The within-family-based approach described by Brumpton et al.42 may also address dynastic effects. These within-family estimates of SNP outcome and SNP exposure effects obtained in separate sibling samples can be used, in a second step, with pleiotropy-robust summary data estimators; yet, as the authors note, these family-based estimates are likely to be underpowered to allow for both family structure and pleiotropy. It is important to note that the MR-DoC model in some scenarios has to make the strong assumption that there is no appreciable non-shared environmental confounding16 However, the model does account for environmental confounding shared by twins. In so doing, MR-DoC also controls for dynastic effects because twins share the environments created by parents. As twins share a womb and a household/neighbourhood, often until age 18, this is non-trivial. Dynastic effects are not controlled for in MR on unrelated individuals.69,70

To conclude, one should remain cautious when addressing the nature of the relationship between height and education as there are other confounders (beside horizontal pleiotropy) that may yield the observed association. This empirical example showed that non-random mating and, possibly, dynastic effects may generate data that are remarkably consistent with a causal model for the relationship between height and education. These factors—and others like fine-scale population structure42,71—may affect this relationship, and pleiotropy-robust methods are not necessarily robust to these forms of confounding. Structural equation models67 and MR models that exploit family data16,18,42 should enable tests of causal relationships to be conducted more rigorously, and are recommended to triangulate72 results of MR studies assessing pairs of complex traits leading to non-random mate selection.

Funding

This work was supported by the National Institute on Drug Abuse (grant number DA-018673).

Acknowledgements

We are grateful to our reviewers for their excellent comments, which substantially improved the paper. We are grateful to the Netherlands Twin Register participants whose data we analysed in this study. We thank the participants in the GIANT and the SSAG consortia for making the GWAS summary statistics for height and educational attainment publicly available.

Conflict of interest

None declared.

References

1

Hemani
G
,
Zheng
J
,
Elsworth
B
 et al.  
The MR-Base platform supports systematic causal inference across the human phenome
.
Elife
 
2018
;
7
:
e34408
.

2

Verbanck
M
,
Chen
C-Y
,
Neale
B
,
Do
R.
 
Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases
.
Nat Genet
 
2018
;
50
:
693
98
.

3

Davey Smith
G
,
Hemani
G.
 
Mendelian randomization: genetic anchors for causal inference in epidemiological studies
.
Hum Mol Genet
 
2014
;
23
:
R89
98
.

4

Evans
DM
,
Davey Smith
G.
 
Mendelian randomization: new applications in the coming age of hypothesis-free causality
.
Annu Rev Genom Hum Genet
 
2015
;
16
:
327
50
.

5

Davey

Smith
G
,
Lawlor
DA
,
Harbord
R
,
Timpson
N
,
Day
I
,
Ebrahim
S.
 
Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology
.
PLoS Med
 
2007
;
4
:
e352
.

6

Davey Smith
G.
 
Use of genetic markers and gene-diet interactions for interrogating population-level causal influences of diet on health
.
Genes Nutr
 
2011
;
6
:
27
.

7

MacArthur
J
,
Bowler
E
,
Cerezo
M
 et al.  
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)
.
Nucleic Acids Res
 
2017
;
45
:
D896
901
.

8

Burgess
S
,
Thompson
SG.
 
Mendelian Randomization: Methods for Using Genetic Variants in Causal Estimation
.
Boca Raton, FL
:
CRC Press
,
2015
.

9

Bulik-Sullivan
B
,
Finucane
HK
,
Anttila
V
 et al.  
An atlas of genetic correlations across human diseases and traits
.
Nat Genet
 
2015
;
47
:
1236
41
.

10

Chesmore
K
,
Bartlett
J
,
Williams
SM.
 
The ubiquity of pleiotropy in human disease
.
Hum Genet
 
2018
;
137
:
1236
41
.

11

Visscher
PM
,
Yang
J.
 
A plethora of pleiotropy across complex traits
.
Nat Genet
 
2016
;
48
:
707
08
.

12

Pickrell
JK
,
Berisa
T
,
Liu
JZ
,
Ségurel
L
,
Tung
JY
,
Hinds
DA.
 
Detection and interpretation of shared genetic influences on 42 human traits
.
Nat Genet
 
2016
;
48
:
709
17
.

13

Burgess
S
,
Dudbridge
F
,
Thompson
SG.
 
Re:“Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects”
.
Am J Epidemiol
 
2015
;
181
:
290
91
.

14

Bowden
J
,
Davey Smith
G
,
Burgess
S.
 
Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression
.
Int J Epidemiol
 
2015
;
44
:
512
25
.

15

Bowden
J
,
Davey Smith
G
,
Haycock
PC
,
Burgess
S.
 
Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator
.
Genet Epidemiol
 
2016
;
40
:
304
14
.

16

Minică
CC
,
Dolan
CV
,
Boomsma
DI
,
de Geus
E
,
Neale
MC.
 
Extending causality tests with genetic instruments: an integration of Mendelian randomization and the classical twin design
.
bioRxiv
 
2017
:
134585
.

17

Minică
CC
,
Dolan
CV
,
Boomsma
DI
,
de Geus
E
,
Neale
MC.
 
Extending causality tests with genetic instruments: an integration of Mendelian randomization with the classical twin design
.
Behav Genet
 
2018
;
48
:
337–
49
.

18

Hartwig
FP
,
Davies
NM
,
Davey Smith
G.
 
Bias in Mendelian randomization due to assortative mating
.
Genet Epidemiol
 
2018
;
42
:
608
20
.

19

Courtiol
A
,
Raymond
M
,
Godelle
B
,
Ferdy
JB.
 
Mate choice and human stature: homogamy as a unified framework for understanding mating preferences
.
Evolution
 
2010
;
64
:
2189
203
.

20

Hugh-Jones
D
,
Verweij
KJ
,
Pourcain
BS
,
Abdellaoui
A.
 
Assortative mating on educational attainment leads to genetic spousal resemblance for polygenic scores
.
Intelligence
 
2016
;
59
:
103
–0
8
.

21

Yengo
L
,
Robinson
MR
,
Keller
MC
 et al.  
Imprint of assortative mating on the human genome
.
Nat Hum Behav
 
2018
;
2
:
948
54
.

22

Keller
MC
,
Garver-Apgar
CE
,
Wright
MJ
 et al.  
The genetic correlation between height and IQ: shared genes or assortative mating?
 
PLoS Genet
 
2013
;
9
:
e1003451
.

23

Agrawal
A
,
Heath
AC
,
Grant
JD
 et al.  
Assortative mating for cigarette smoking and for alcohol consumption in female Australian twins and their spouses
.
Behav Genet
 
2006
;
36
:
553
66
.

24

Merikangas
KR
,
Risch
N.
 
Genomic priorities and public health
.
Science
 
2003
;
302
:
599
601
.

25

Magnusson
PK
,
Rasmussen
F
,
Gyllensten
UB.
 
Height at age 18 years is a strong predictor of attained education later in life: cohort study of over 950 000 Swedish men
.
Int J Epidemiol
 
2006
;
35
:
658
63
.

26

Huang
Y
,
van Poppel
F
,
Lumey
LH.
 
Differences in height by education among 371,105 Dutch military conscripts
.
Econ Hum Biol
 
2015
;
17(Suppl C
):
202
07
.

27

Gorry
D.
 
The influence of height on academic outcomes
.
Econ Educ Rev
 
2017
;
56(Suppl C
):
1
8
.

28

Case
A
,
Paxson
C.
 
Causes and consequences of early-life health
.
Demography
 
2010
;
47
:
S65
85
.

29

Tyrrell
J
,
Jones
SE
,
Beaumont
R
 et al.  
Height, body mass index, and socioeconomic status: Mendelian randomisation study in UK Biobank
.
BMJ
 
2016
;
352
:
i582
.

30

Carvalho
L.
 
Childhood circumstances and the intergenerational transmission of socioeconomic status
.
Demography
 
2012
;
49
:
913
38
.

31

Vogl
TS.
Education and health in developing economies. In: Encyclopedia of Health Economics. Amsterdam: Elsevier, 2012, pp. 246–9.

32

Silventoinen
K
,
Posthuma
D
,
Van Beijsterveldt
T
,
Bartels
M
,
Boomsma
DI.
 
Genetic contributions to the association between height and intelligence: evidence from Dutch twin data from childhood to middle age
.
Genes Brain Behav
 
2006
;
5
:
585
95
.

33

Marioni
RE
,
Batty
GD
,
Hayward
C
 et al.  
Common genetic variants explain the majority of the correlation between height and intelligence: the Generation Scotland study
.
Behav Genet
 
2014
;
44
:
91
–9
6
.

34

Black
RE
,
Allen
LH
,
Bhutta
ZA
 et al.  
Maternal and child undernutrition: global and regional exposures and health consequences
.
Lancet
 
2008
;
371
:
243
60
.

35

Perkins
JM
,
Subramanian
SV
,
Davey Smith
G
,
Özaltin
E.
 
Adult height, nutrition, and population health
.
Nutr Rev
 
2016
;
74
:
149
65
.

36

Posthuma
D
,
De
GE
,
Neale
M
 et al.  
Multivariate genetic analysis of brain structure in an extended twin design
.
Behav Genet
 
2000
;
30
:
311
19
.

37

Posthuma
D
,
De Geus
EJ
,
Baaré
WF
,
Pol
HEH
,
Kahn
RS
,
Boomsma
DI.
 
The association between brain volume and intelligence is of genetic origin
.
Nat Neurosci
 
2002
;
5
:
83
–8
4
.

38

Rushton
JP
,
Ankney
CD.
 
Whole brain size and general mental ability: a review
.
Int J Neurosci
 
2009
;
119
:
692
732
.

39

Deary
IJ
,
Penke
L
,
Johnson
W.
 
The neuroscience of human intelligence differences
.
Nat Rev Neurosci
 
2010
;
11
:
201
11
.

40

McDaniel
MA.
 
Big-brained people are smarter: a meta-analysis of the relationship between in vivo brain volume and intelligence
.
Intelligence
 
2005
;
33
:
337
46
.

41

Okbay
A
,
Beauchamp
JP
,
Fontana
MA
 et al.  
Genome-wide association study identifies 74 loci associated with educational attainment
.
Nature
 
2016
;
533
:
539
42
.

42

Brumpton
B
,
Sanderson
E
,
Hartwig
FP
 et al.  
Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases
.
bioRxiv
 
2019
:
602516
.

43

O’Connor
LJ
,
Price
AL.
 
Distinguishing genetic correlation from causation across 52 diseases and complex traits
. Nat Genet  
2018
;
50
:
1728
34
.

44

Burgess
S
,
Davies
NM
,
Thompson
SG.
 
Bias due to participant overlap in two‐sample Mendelian randomization
.
Genet Epidemiol
 
2016
;
40
:
597
608
.

45

Willemsen
G
,
De Geus
EJ
,
Bartels
M
 et al.  
The Netherlands Twin Register biobank: a resource for genetic epidemiological studies
.
Twin Res Hum Genet
 
2010
;
13
:
231
45
.

46

Willemsen
G
,
Vink
JM
,
Abdellaoui
A
 et al.  
The Adult Netherlands Twin Register: twenty-five years of survey and biological data collection
.
Twin Res Hum Genet
 
2013
;
16
:
271
81
.

47

Donaldson
GW.
 
Ridit scores for analysis and interpretation of ordinal pain data
.
Eur J Pain
 
1998
;
2
:
221
27
.

48

Bross
ID.
 
How to use ridit analysis
.
Biometrics
 
1958
;
14
:
18
38
.

49

van Dongen
J
,
Bonder
MJ
,
Dekkers
KF
 et al.  
DNA methylation signatures of educational attainment
.
NPJ Sci Learn
 
2018
;
3
:
7
.

50

Boomsma
DI
,
Wijmenga
C
,
Slagboom
EP
 et al.  
The Genome of the Netherlands: design, and project goals
.
Eur J Hum Genet
 
2014
;
22
:
221
27
.

51

Fedko
IO
,
Hottenga
J-J
,
Medina-Gomez
C
 et al.  
Estimation of genetic relationships between individuals across cohorts and platforms: application to childhood height
.
Behav Genet
 
2015
;
45
:
514
28
.

52

Francioli
LC
,
Menelaou
A
,
Pulit
SL
 et al.  
Whole-genome sequence variation, population structure and demographic history of the Dutch population
.
Nat Genet
 
2014
;
46
:
818
25
.

53

Liu
EY
,
Li
M
,
Wang
W
,
Li
Y.
 
MaCH‐Admix: genotype imputation for admixed populations
.
Genet Epidemiol
 
2013
;
37
:
25
37
.

54

Carey
VJ
,
Lumley
T
,
Ripley
B.
gee: Generalized Estimation Equation solver, R package version 4.13-18.
2012
. http://CRAN.R-project.org/package=gee (2017, date last accessed).

55

Dobson
A.
 
An Introduction to Generalized Linear Models
(
Chatfield
C
,
Zidek
J
, eds).
London
:
Chapman & Hall/CRC
,
2002
.

56

Minică
C
,
Boomsma
D
,
Vink
J
,
Dolan
C.
 
MZ twin pairs or MZ singletons in population family-based GWAS? More power in pairs
.
Mol Psychiatry
 
2014
;
19
:
1154
55
.

57

Minică
CC
,
Dolan
CV
,
Kampert
MM
,
Boomsma
DI
,
Vink
JM.
 
Sandwich corrected standard errors in family-based genome-wide association studies
.
Eur J Hum Genet
 
2015
;
23
:
388
94
.

58

Wood
AR
,
Esko
T
,
Yang
J
 et al.  
Defining the role of common variation in the genomic and biological architecture of adult human height
.
Nat Genet
 
2014
;
46
:
1173
86
.

59

Abdellaoui
A
,
Hottenga
J-J
,
de Knijff
P
 et al.  
Population structure, migration, and diversifying selection in the Netherlands
.
Eur J Hum Genet
 
2013
;
21
:
1277
285
.

60

Hartwig
FP
,
Davey Smith
G
,
Bowden
J.
 
Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption
.
Int J Epidemiol
 
2017
;
46
:
1285
98
.

61

Bowden
J
,
Del Greco
MF
,
Minelli
C
,
Davey Smith
G
,
Sheehan
NA
,
Thompson
JR.
 
Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I 2 statistic
.
Int J Epidemiol
 
2016
;
45
:
1961
974
.

62

Bowden
J
,
Del Greco
MF
,
Minelli
C
,
Davey Smith
G
,
Sheehan
N
,
Thompson
J.
 
A framework for the investigation of pleiotropy in two‐sample summary data Mendelian randomization
.
Stat Med
 
2017
;
36
:
1783
802
.

63

Frankham
R.
 Introduction to Quantitative Genetics. 4th edn. (
Falconer
DS
,
Trudy
FC
,
Longman
M
, eds).
New York, NY
:
Elsevier Current Trends
,
1996
.

64

Greco M
FD
,
Minelli
C
,
Sheehan
NA
,
Thompson
JR.
 
Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome
.
Stat Med
 
2015
;
34
:
2926
40
.

65

Hemani
G
,
Tilling
K
,
Smith
G.
 
Orienting the causal relationship between imprecisely measured traits using GWAS summary data
.
PLoS Genet
 
2017
;
13
:
e1007081
.

66

Hagenaars
SP
,
Gale
CR
,
Deary
IJ
,
Harris
SE.
 
Cognitive ability and physical health: a Mendelian randomization study
.
Sci Rep
 
2017
;
7
:
2651
.

67

Warrington
NM
,
Freathy
RM
,
Neale
MC
,
Evans
DM.
 
Using structural equation modelling to jointly estimate maternal and fetal effects on birthweight in the UK Biobank
.
Int J Epidemiol
 
2018
;
47
:
1229
41
.

68

Kong
A
,
Thorleifsson
G
,
Frigge
ML
 et al.  
The nature of nurture: Effects of parental genotypes
.
Science
 
2018
;
359
:
424
28
.

69

Fletcher
JM
,
Lehrer
SF.
 
Genetic lotteries within families
.
J Health Econ
 
2011
;
30
:
647
59
.

70

Pingault
J-B
,
Cecil
CA
,
Murray
J
,
Munafò
MR
,
Viding
E.
 
Causal inference in psychopathology: a systematic review of Mendelian randomisation studies aiming to identify environmental risk factors for psychopathology
.
Psychopathol Rev
 
2017
;
4
:
4
25
.

71

Abdellaoui
A
,
Hugh-Jones
D
,
Yengo
L
 et al.  
Genetic correlates of social stratification in Great Britain
.
Nat Hum Behav
 
2019
;
3
:
1332
42
.

72

Munafò
MR
,
Smith
G.
 
Robust research needs many lines of evidence
.
Nature
 
2018
;
553
:
399–401
.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Supplementary data