Abstract

Background

Attention-deficit/hyperactivity disorder (ADHD) is one of the most commonly diagnosed mental disorders in children. For many patients, treatment involves long-term medication in order to reduce symptoms, regulate behaviour, and, hopefully, improve school performance and achievement. However, there is little to no evidence to support a long-term effect on the latter complex outcomes.

Methods

We utilize a target trial framework to emulate a pretest–posttest control group design and estimate the intention-to-treat effect of ADHD medication on national test scores in children diagnosed with ADHD born between 2000 and 2007 in Norway. The data were obtained through linkage of Norwegian registries (NorPD, Norwegian Prescription Database; NPR, Norwegian Patient Registry; KUHR, Database for Control and Payment of Health Reimbursement; SSB, Statistics Norway; MBRN, Medical Birth Registry of Norway).

Results

The resulting analytic sample consisted of 8458 children diagnosed with ADHD, with about 9% missingness in their grade eight national test scores. We find that initiating ADHD medication had a slight positive average effect on national test scores for all three domains: English, numeracy, and reading [standardized mean differences: 0.037 (95% compatibility interval (CI95), −0.003; 0.076), 0.063 (CI95, 0.016; 0.111), 0.071 (CI95, 0.030; 0.111), respectively].

Conclusion

We conclude that the estimated long-term average effect of ADHD medication on learning, as measured by the Norwegian national tests, is not clinically relevant. Study strengths include the use of real-world data on ecologically valid and relevant outcomes and the robustness of results across model specifications. Limitations include the possibility of unobserved confounding and the lack of prescription data.

Key Messages
  • We examine the long-term academic effects of ADHD medication using Norwegian national registry data of all children diagnosed with ADHD born between 2000 and 2007.

  • Children who initiated pharmaceutical treatment within a three-month window two years before the grade eight national test had slightly higher scores, although the improvement represented only a small fraction of the achievement gap between children diagnosed with ADHD and the general child population.

  • These results contribute important evidence to the limited research on long-term ADHD treatment effects using real-world data, informing treatment decisions based on academic expectations.

Introduction

Attention-deficit/hyperactivity disorder (ADHD) is defined as a neurodevelopmental disorder characterized by age-inappropriate inattention or hyperactivity–impulsivity, or both [1]. In the long term, ADHD patients tend to show negative life outcomes including lower academic achievement [2–5], more criminal behaviour [6], financial distress [7], and suicide risk [7, 8]. ADHD medication has been shown to ameliorate inattention and hyperactivity/impulsivity symptoms in the short term [9], so it is hypothesized that it might have positive indirect effects on academic outcomes. To our knowledge, only the multimodal treatment of ADHD (MTA) randomized clinical trial (RCT) [10–13] has considered the long-term effect of medication on standardized achievement tests. Its exploratory analysis found a small improvement in reading achievement scores.

The Cochrane review by Storebø et al. [14] suggested that the possible benefits of ADHD medication involved only teacher-rated behaviour and symptoms as well as parent-reported quality of life, while harms involved mostly sleep problems and decreased appetite. Similar results were found by a later review of non-randomized studies [15]. A recent crossover trial found that, even in the short term, improved classroom behaviour and seat work productivity did not translate into improved learning [16]. A systematic review by Arnold et al. [17] found that, in the case of long-term academic achievement, most studies evaluating pharmacological treatments reported benefits. However, another review considers these benefits to be educationally negligible, given the small effect sizes [18].

Observational studies complement the results of such RCTs, especially given the limited treatment durations, follow-up periods, and outcomes that RCTs can cover. For example, the Swedish national registry study of Jangmo et al. [3] found that dispensations of ADHD medication had effects on grade point average, eligibility for and completion of upper secondary school, and teacher assessment of performance, including in the long term (three years). However, improvements in teacher-assessed performance were not reflected in better standardized test scores. Likewise, the Danish registry study by Keilow et al. [19] leveraged medication non-response as a natural experiment to estimate the effect of full or partial discontinuation of pharmaceutical treatment on grade point average. They found a negative effect of discontinuation of 0.22 SDs.

Recently proposed, the target trial framework helps avoid common ‘self-inflicted injuries’ in the causal analysis of observational data [20–24]. Researchers are required to explicitly state the hypothetical trial they want to emulate with an observational study, specifying eligibility criteria, treatment strategies, follow-up period, and a time-zero, among others. This prevents common pitfalls such as using post-treatment information for assignment, or estimating non-causal contrasts such as always-takers versus never-takers (rather than initiators versus non-initiators). To our knowledge, only a couple of applications to ADHD research exist: a study on ADHD medication initiation and mortality [25] and a study on the effect of diagnosis on quality of life [26].

In summary, the evidence only weakly supports the hypothesis that long-term medication meaningfully improves academic achievement. In contrast, academic improvement might be one of the main drivers of parents’ decision to treat [27]. There is also evidence for adverse effects of medication on children. Better-quality evidence is thus required so that clinicians, parents, and users can make informed decisions about the trade-offs at stake, and carefully analysed observational data can contribute to the body of evidence.

The present article’s main objective is to estimate the long-term causal effect of initiating ADHD medication on national test scores in the domains of English, numeracy, and reading in Norwegian children diagnosed with ADHD. The causal estimand corresponds to the intention-to-treat effect, i.e. an average treatment effect of initiation in a given period.

A secondary objective is to explore heterogeneity of the effects across the different domains of national test scores and subgroups.

Methods

This article follows the STROBE reporting guidelines for observational studies and the RECORD/RECORD-PE extensions for routinely collected health data [28–30]. All code used for analysis is available on the GitHub repository at https://github.com/tvarnetperez/TTE_ADHD.

RCT literature review

To our knowledge, the MTA trial [10–13] was the only pertinent RCT, though our available data were incompatible with its emulation. Thus, an ideal target trial following a pretest–posttest control group design was devised, which was adapted into the feasible target trial that our data allow.

Data source

The study included person-, family-, and school-level variables, linked from multiple data sources: the Norwegian Prescription Database, the Norwegian Patient Registry, the Database for Control and Payment of Health Reimbursement, Statistics Norway, and the Medical Birth Registry of Norway.

Target trial protocol

The protocol is summarized in the design diagram in Fig. 1.

Diagram showing a timeline of events together with milestones, starting with grade five national test, then with time-zero and finishing with grade eight national test. Different subjects with different times for attention-deficit/hyperactivity disorder diagnosis and attention-deficit/hyperactivity disorder pharmaceutical dispensation are shown.
Figure 1.

Design diagram for the Norwegian registries target trial. Time-zero was fixed at one year after the grade five national test (or ‘pretest’). A washout period is a period after which we expect previous interventions to have become ineffective. For the main analysis, a complete washout period was considered, i.e. any medication before time-zero was reason for exclusion. For the sensitivity analysis with no-washout period, estimated previous medication was included in the model as a covariate. ‘Long-term’ cutoff refers to the minimum period after which an effect is considered to be a long-term effect, which we set at one year and nine months before the grade eight national test (or ‘posttest’). This results in a 3-month grace period. The ‘post-exam diagnosis window’ was chosen so as to include a significant portion of children diagnosed with attention-deficit/hyperactivity disorder. In the diagram we can see that: Subject A is included in the study and assigned to the ‘initiated’ or ‘treated’ condition. Subject B is included in the study and assigned to the ‘control’ condition. Subject C is excluded in the main analysis because they had a dispensation within the medication-free washout period before the national test. Subject D is excluded because they were diagnosed with attention-deficit/hyperactivity disorder after time-zero.

Eligibility criteria

Eligibility criteria were evaluated only once for all participants, at time-zero. The study included Norwegian-born children diagnosed with ADHD by a specialist or across four general practitioner (GP) visits (reducing misclassification from symptomatic concerns). Children with diagnoses of severe comorbid disorders such as autism, epilepsy, and Down’s syndrome were excluded, as well as children who did not have any national test score at grade five. Children who were medicated before time-zero were excluded, i.e. a complete washout period. Figure 2 details the exclusion steps.

Flow diagram for selection of included children in the emulated target trial analytic samples. It starts with about 550 000 children living in Norway with birth years between 2000 and 2007 and ends with about 8500 eligible children in the analytic sample (after exclusions based on being born outside Norway, not having an ADHD diagnosis, having a serious comorbid disorder, missing data on both parents, missing scores for all grade five national tests, missing non-imputed covariates, and having been previously medicated). These are then assigned to the emulated treatment arm, with 7159 children, and the emulated control arm, with 1299 children.
Figure 2.

Flow diagram for selection of included children in the emulated target trial analytic samples. ADHD: attention-deficit/hyperactivity disorder.

Treatment strategies

The ‘treatment’ or ‘as-started’ arm [22] corresponds to having been dispensed to their name an ADHD medication from a pharmacy in a grace period [21] between 365 and 455 days after their grade five national test. In the main analysis, this strictly means the intent to initiate a pharmaceutical treatment regime, whereas in the sensitivity analysis (with no-washout period), it can also mean to continue a regime.

The ‘control’ arm corresponds to not having been dispensed any ADHD medication from a pharmacy during the same grace period.
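The arm definitions above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `assign_arm`, the inclusive window boundaries, and the use of plain dispensation dates are assumptions made for illustration, and the diagnosis-timing criterion (Subject D in Fig. 1) is not handled here.

```python
from datetime import date, timedelta
from typing import Iterable, Optional

GRACE_START_DAYS = 365  # time-zero: 365 days after the grade five national test
GRACE_END_DAYS = 455    # end of the grace period (inclusive bound is an assumption)


def assign_arm(pretest: date, dispensations: Iterable[date]) -> Optional[str]:
    """Return 'initiated', 'control', or None if excluded by the complete washout rule.

    Hypothetical helper: `dispensations` are the dates on which an ADHD
    medication was dispensed in the child's name.
    """
    time_zero = pretest + timedelta(days=GRACE_START_DAYS)
    grace_end = pretest + timedelta(days=GRACE_END_DAYS)
    dates = list(dispensations)

    # Main analysis: any dispensation before time-zero is a reason for exclusion.
    if any(d < time_zero for d in dates):
        return None
    # 'Initiated': at least one dispensation inside the grace period.
    if any(time_zero <= d <= grace_end for d in dates):
        return "initiated"
    # 'Control': no dispensation during the grace period
    # (dispensations after the grace period do not change the arm).
    return "control"
```

Under this sketch, a child whose first dispensation occurs after day 455 stays in the ‘control’ arm, which mirrors the intention-to-treat logic of comparing initiators with non-initiators in the grace period.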

Assignment procedures

Randomization is emulated via model-based adjustment for potential confounders as measured until time-zero, i.e. baseline confounders.

Time-zero

We have defined time-zero as 365 days after the grade five national test. As can be seen in Supplementary Fig. S2, 365 days after the grade five national test, 81.11% of children that would receive a diagnosis before the grade eight national test had already received it.
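The coverage figure quoted above is a simple proportion. As a hedged illustration, assuming each child is represented by the number of days between the grade five test and their first ADHD diagnosis (negative if diagnosed earlier), it could be computed as follows; the function name and input format are hypothetical and the example values are made up.

```python
from typing import Sequence


def share_diagnosed_by(days_to_diagnosis: Sequence[int], cutoff_days: int = 365) -> float:
    """Percentage of eventually diagnosed children already diagnosed at the cutoff.

    `days_to_diagnosis` holds, per child, the days from the grade five test to
    the first diagnosis; only children diagnosed before the grade eight test
    are assumed to be included.
    """
    if not days_to_diagnosis:
        raise ValueError("no diagnosis offsets supplied")
    diagnosed = sum(1 for d in days_to_diagnosis if d <= cutoff_days)
    return 100.0 * diagnosed / len(days_to_diagnosis)


# Made-up offsets for five children; four are diagnosed by day 365.
example_share = share_diagnosed_by([-800, -30, 120, 365, 500])
```

Evaluating the same function over a range of candidate cutoffs would give the kind of curve that a cutoff choice such as the one in Supplementary Fig. S2 could be read from.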

Follow-up period

The follow-up period lasted until the (approximated) grade eight national test date, with an additional 30-day margin of error to tally children both diagnosed and medicated within the study period.

Outcomes

The outcomes are the standardized grade eight national test scores for all three domains: numeracy, reading, and English.

Causal contrasts of interest

The causal contrast of interest is the ‘as-started’ effect, the target trial emulation equivalent of an ‘intention-to-treat’ effect [22].

Analysis plan

A more general dataset containing 38 215 children was used to determine the time-zero cutoff (see Supplementary Fig. S2). This included dates of ADHD diagnosis for all children at any time.

Application of exclusion criteria except for washout period results in 12 541 observations. This is the dataset used for determining the grace-period cutoff (see Supplementary Fig. S3).

This eligible sample is further reduced to an analytic sample of 8458 observations by retaining only children with a complete washout period (i.e. no ADHD medication use before time-zero) and by listwise deletion on the unimputed covariates, namely parents’ civil status at grades five and six (due to low cell count) and school ID at grades five and eight. The main analysis results and the descriptive tables are based on this dataset. The sensitivity analysis sample without the washout period exclusion comprises 11 835 observations. Descriptive tables stratified by washout period are included in Supplementary Material, Section C.

Loss to follow-up (i.e. missing data in the posttest) was addressed through multiple imputation. Details are available in the shared analysis code.

For the main analysis, we aimed to estimate three estimands: sample average treatment effect (SATE), sample average treatment effect on the treated (SATT), and sample average treatment effect on the untreated (SATU). Detailed estimand descriptions are available in Supplementary Material, Section D. We employed the g-formula [20] with a flexible semiparametric model using Bayesian Additive Regression Trees (BART [31, 32]), allowing for a variety of functional forms and interactions among candidate confounders and the treatment. The model specification was as follows:
where Y_i is the national test score at grade eight for individual i; A_i is a binary indicator of whether individual i initiated treatment or not; W_i is a vector of measured baseline covariates; γ_jg (g ∈ {5, 8}) are random intercepts at grade g for school j; and δ_kg (g ∈ {5, 8}) are random slopes at grade g for school k.
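To make the g-formula step and the three estimands concrete, the following is a minimal sketch in Python. It deliberately swaps the BART outcome model with school random effects for a much simpler stand-in, namely cell means within each treatment-by-covariate stratum, so it illustrates only the standardization logic and is not the authors' model. The function name and the toy data are assumptions.

```python
from collections import defaultdict
from statistics import mean


def g_formula(records):
    """Standardization with a saturated outcome model.

    `records` is an iterable of (a, w, y): binary treatment a, a discrete
    baseline covariate w, and outcome y. Raises KeyError if some stratum of w
    lacks treated or untreated units (a positivity violation).
    """
    records = list(records)

    # Outcome "model": mean outcome within each (treatment, covariate) cell.
    cells = defaultdict(list)
    for a, w, y in records:
        cells[(a, w)].append(y)
    cell_mean = {key: mean(ys) for key, ys in cells.items()}

    def effect(w):
        # Predicted outcome under initiation minus predicted outcome under control.
        return cell_mean[(1, w)] - cell_mean[(0, w)]

    return {
        "SATE": mean(effect(w) for a, w, y in records),            # whole sample
        "SATT": mean(effect(w) for a, w, y in records if a == 1),  # initiated only
        "SATU": mean(effect(w) for a, w, y in records if a == 0),  # controls only
    }


# Toy data (made up): the effect is 0.1 when w == 0 and 0.3 when w == 1.
toy = [
    (1, 0, 1.0), (1, 0, 1.2), (0, 0, 0.9), (0, 0, 1.1),
    (1, 1, 2.0), (1, 1, 2.4), (1, 1, 2.2), (0, 1, 1.9),
]
estimates = g_formula(toy)
```

The three estimands differ only in whose covariate distribution the predicted contrasts are averaged over, which is why they can diverge when the effect varies across strata and the arms have different covariate mixes.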

Adjustment for observed confounders in the main analysis followed a ‘pretreatment criterion’ [33, 34]: based on domain knowledge, a set of baseline variables measured before medication dispensation initiation was selected. The covariates included in the BART term were: grade five national test scores for English, numeracy, and reading and respective missingness indicators; civil status of both mother and father at grades five and six; mother’s highest completed education; mother’s age at birth, pregnancy length, and parity; pupil’s sex and birth weight; birth year and month; a binary indicator for having at least one registration for ADHD for both parents; number of registered visits for an ADHD diagnosis, internalizing disorders, externalizing disorders, and any diagnosis prior to time-zero for both parents and the child; number of registered visits for sleep disturbances and specific learning problems prior to time-zero for the child; and days medicated before time-zero and a binary indicator for whether they had been medicated prior to time-zero (the latter two for the no-washout sample). The look-back period was at minimum six years and up to 12 years depending on the registry. Further variable descriptions are available in Supplementary Material, Section A.

All analyses were conducted using R version 4.2.3 [35], with the main analysis based on the package stan4bart [36].

Sensitivity analyses

Our main sensitivity analysis includes the pupils that had taken ADHD medication before time-zero (i.e. no-washout period). In this manner, the target population of children diagnosed with ADHD is broadened. Additional sensitivity analyses, including negative controls [37] and proximal causal inference [38, 39] are available in Supplementary Material, Section E.

Due to apparently conflicting results with previous studies, we conducted a post hoc informal replication of the analytic strategy of [19] on our data.

Results

Participants

The flow diagram in Fig. 2 summarizes the number of participants and respective exclusion criteria.

Descriptive data

Descriptive statistics are presented only for the washout sample. Table 1 shows frequencies across emulated treatment arms for different categorical variables. The emulated arms are relatively comparable: only medication variables have standardized mean differences exceeding an absolute value of 0.1.
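As a worked illustration of the balance check, the sketch below computes a standardized mean difference for a binary variable using the sex counts reported in Table 1. The pooled-variance formula for proportions is a common convention and is assumed here; the text does not state which formula the authors used.

```python
from math import sqrt


def smd_binary(p_treated: float, p_control: float) -> float:
    """Standardized mean difference for a binary variable.

    Uses the average of the two Bernoulli variances as the pooled variance
    (an assumed convention, not confirmed by the source).
    """
    pooled_sd = sqrt((p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2)
    return (p_treated - p_control) / pooled_sd


# Share of girls: 1827 of 7159 initiated children, 385 of 1299 control children.
smd_female = smd_binary(1827 / 7159, 385 / 1299)  # about -0.09
```

The result stays below the 0.1 threshold in absolute value, in line with the statement that only medication variables exceed it.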

Table 1.

Frequencies of demographic, clinical, and academic factor variables in the washout analytic sample (n = 8458) stratified by emulated treatment arm

Variable                        Control             Initiated           Total
Sex
  Female                        29.6% (n = 385)     25.5% (n = 1827)    26.2% (n = 2212)
  Male                          70.4% (n = 914)     74.5% (n = 5332)    73.8% (n = 6246)
Birth month
  1                             6.9% (n = 89)       7.3% (n = 521)      7.2% (n = 610)
  2                             7.2% (n = 93)       7.2% (n = 513)      7.2% (n = 606)
  3                             7.3% (n = 95)       7.7% (n = 554)      7.7% (n = 649)
  4                             7.9% (n = 102)      7.4% (n = 527)      7.4% (n = 629)
  5                             7.5% (n = 97)       8.2% (n = 585)      8.1% (n = 682)
  6                             9.2% (n = 119)      8.3% (n = 595)      8.4% (n = 714)
  7                             8.5% (n = 110)      9.1% (n = 652)      9% (n = 762)
  8                             9.2% (n = 119)      9.2% (n = 659)      9.2% (n = 778)
  9                             10% (n = 130)       9.1% (n = 648)      9.2% (n = 778)
  10                            8.8% (n = 114)      9% (n = 642)        8.9% (n = 756)
  11                            8.9% (n = 116)      8.8% (n = 631)      8.8% (n = 747)
  12                            8.9% (n = 115)      8.8% (n = 632)      8.8% (n = 747)
Birth year
  2000                          12.2% (n = 159)     11.6% (n = 832)     11.7% (n = 991)
  2001                          12.6% (n = 164)     11.2% (n = 805)     11.5% (n = 969)
  2002                          11.5% (n = 149)     11.9% (n = 854)     11.9% (n = 1003)
  2003                          13.5% (n = 175)     12.6% (n = 905)     12.8% (n = 1080)
  2004                          13.7% (n = 178)     13% (n = 928)       13.1% (n = 1106)
  2005                          12.9% (n = 168)     12.2% (n = 875)     12.3% (n = 1043)
  2006                          11.2% (n = 146)     13.4% (n = 958)     13.1% (n = 1104)
  2007                          12.3% (n = 160)     14% (n = 1002)      13.7% (n = 1162)
Parity
  0                             47% (n = 610)       45.2% (n = 3234)    45.4% (n = 3844)
  1                             33.6% (n = 436)     34.2% (n = 2450)    34.1% (n = 2886)
  2                             12.8% (n = 166)     15.1% (n = 1082)    14.8% (n = 1248)
  3                             4.9% (n = 64)       4.1% (n = 293)      4.2% (n = 357)
  4                             1.2% (n = 16)       0.9% (n = 63)       0.9% (n = 79)
  5 or higher                   0.5% (n = 7)        0.5% (n = 37)       0.5% (n = 44)
No. of siblings with any health diagnosis
  0                             81.6% (n = 1060)    77.9% (n = 5577)    78.5% (n = 6637)
  1                             16.4% (n = 213)     20% (n = 1434)      19.5% (n = 1647)
  2 or more                     2% (n = 26)         2.1% (n = 148)      2.1% (n = 174)
Mother’s civil status at grade six
  Married or in partnership     46.3% (n = 601)     48.9% (n = 3500)    48.5% (n = 4101)
  Divorced or separated         16.6% (n = 215)     16.8% (n = 1203)    16.8% (n = 1418)
  Unmarried                     37.2% (n = 483)     34.3% (n = 2452)    34.7% (n = 2935)
  Widow or Widower              -                   0.1% (n = 4)        0% (n = 4)
Father’s civil status at grade six
  Married or in partnership     45.9% (n = 596)     48.7% (n = 3490)    48.3% (n = 4086)
  Divorced or separated         17.4% (n = 226)     17% (n = 1217)      17.1% (n = 1443)
  Unmarried                     36.6% (n = 476)     34.2% (n = 2450)    34.6% (n = 2926)
  Widow or Widower              0.1% (n = 1)        0% (n = 2)          0% (n = 3)
Proportion of missing posttests
  0                             81.5% (n = 1059)    85.1% (n = 6089)    84.5% (n = 7148)
  ⅓                             8.9% (n = 116)      7.2% (n = 517)      7.5% (n = 633)
  ⅔                             4% (n = 52)         3% (n = 218)        3.2% (n = 270)
  1                             5.5% (n = 72)       4.7% (n = 335)      4.8% (n = 407)
Missing English (grade five) score
  Non-missing                   77.8% (n = 1011)    79.8% (n = 5712)    79.5% (n = 6723)
Missing reading (grade five) score
  Non-missing                   89.5% (n = 1163)    91% (n = 6517)      90.8% (n = 7680)
Missing numeracy (grade five) score
  Non-missing                   93.9% (n = 1220)    95.3% (n = 6819)    95% (n = 8039)
Missing English (grade eight) score
  Non-missing                   88.1% (n = 1145)    90.4% (n = 6471)    90% (n = 7616)
Missing reading (grade eight) score
  Non-missing                   89.5% (n = 1162)    90.9% (n = 6511)    90.7% (n = 7673)
Missing numeracy (grade eight) score
  Non-missing                   88.8% (n = 1154)    91.3% (n = 6537)    90.9% (n = 7691)

‘Initiated’ refers to having been dispensed to their name an attention-deficit/hyperactivity disorder medication from a pharmacy in a grace period between 365 and 455 days after their grade five national test. ‘Control’ refers to not having been dispensed any attention-deficit/hyperactivity disorder medication from a pharmacy during the same grace period.

Table 2 shows summary statistics for continuous variables. Standardized scores across all domains and for grades five and eight have a mean of around −0.5 SDs. This is expected, as scores are standardized relative to all children and our study sample concerns only children diagnosed with ADHD, which by definition involves negative academic impact [1]. There are few, if any, differences in the selected variables across the emulated arms, including the national test scores or grade point averages. Yet the cutoffs chosen for time-zero and the grace period generated a clear distinction between groups. The average estimated number of medication blocks (periods of treatment with no interruption longer than 30 days) was 1.31 (SD, 0.64) for the ‘initiated’ condition and 0.41 (SD, 0.67) for the ‘control’ condition. This can also be seen in Fig. 3, which shows the estimated percentage of medicated days between time-zero and the grade eight national test. In both the not-previously medicated sample and the no-washout sample, pupils in the ‘initiated’ group were medicated for about 85% of the days between their assignment and the posttest, whereas those in the ‘control’ group were medicated for about 12% of the days. In the ‘control’ group, 67.9% of pupils had zero estimated medicated days (72.1% in the no-washout sample), while in the ‘initiated’ group, 6.1% of pupils had 100% estimated medicated days (5.8% in the no-washout sample).
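The two adherence summaries used here, medication blocks and percentage of medicated days, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: coverage is given as half-open (start_day, end_day) intervals in days since time-zero, all lying within the 731-day period, and the function name is hypothetical. How the authors estimated coverage from dispensations is not described in this section.

```python
from typing import List, Tuple

PERIOD_DAYS = 731  # days between time-zero and the posttest, as stated in the Fig. 3 caption
MAX_GAP_DAYS = 30  # interruptions longer than this start a new medication block


def summarize_medication(
    intervals: List[Tuple[int, int]],
    period_days: int = PERIOD_DAYS,
    max_gap: int = MAX_GAP_DAYS,
) -> Tuple[int, float]:
    """Return (number of medication blocks, percentage of medicated days).

    `intervals` are half-open (start_day, end_day) spans of estimated
    medication coverage, assumed to fall inside the period.
    """
    blocks = 0
    covered = 0
    last_end = None
    for start, end in sorted(intervals):
        # A new block starts at the first interval or after a gap longer than max_gap.
        if last_end is None or start - last_end > max_gap:
            blocks += 1
        # Count only days not already covered by an earlier, overlapping interval.
        new_start = start if last_end is None else max(start, last_end)
        covered += max(0, end - new_start)
        last_end = end if last_end is None else max(last_end, end)
    return blocks, 100.0 * covered / period_days


# Made-up example: a 10-day gap keeps one block, a 110-day gap opens a second one.
n_blocks, pct_days = summarize_medication([(0, 90), (100, 190), (300, 390)])
```

In this made-up example the child has two blocks and 270 covered days out of 731.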

Two-panels histogram for two-different groups each. Top panel (a) shows a concentrated count around zero for the ‘control’ arm, with annotated text indicating a mean of 12.4% medicated days, while for the ‘initiated’ arm it shows a less concentrated count around 100, with annotated text indicating a mean of 85.1% medicated days. Bottom panel (b) shows a similar trend, with a lower total count for the ‘control’ arm, but similar mean values of 12.1% and 84.6%, respectively.
Figure 3.

Histogram of the estimated number of days of attention-deficit/hyperactivity disorder pharmaceutical treatment by emulated treatment arm. The dashed line marks the mean number of medicated days for each treatment arm. Total days in the period are 731 (=100%). Number of bins: 15. (a) Main analysis on the not-previously medicated sample (n = 8458). (b) Sensitivity analysis on the analytic sample (n = 11 835) with no-washout period. ‘Initiated’ refers to individuals who were dispensed in their name an attention-deficit/hyperactivity disorder medication from a pharmacy in a grace period between 365 and 455 days after their grade five national test. ‘Control’ refers to individuals who were not dispensed any attention-deficit/hyperactivity disorder medication from a pharmacy during the same grace period.

Table 2.

Summary statistics of demographic, clinical, and academic continuous variables in the washout analytic sample (n = 8458) stratified by emulated treatment arm

| Variable | Emulated arm | Missing % | Mean (SD) | Median (IQR) | Mode [Min; Max] |
| --- | --- | --- | --- | --- | --- |
| Numeracy (grade five) raw | Initiated | 4.75 | 20.35 (9.11) | 19 (14) | 18 [1; 45] |
| | Control | 6.08 | 20.06 (9.13) | 19 (14) | 20 [1; 44] |
| | Total | 4.95 | 20.3 (9.11) | 19 (14) | 18 [1; 45] |
| English (grade five) raw | Initiated | 20.21 | 22.52 (10.03) | 21 (14) | 15 [1; 50] |
| | Control | 22.17 | 22.74 (10.38) | 21 (15.5) | 14 [1; 49] |
| | Total | 20.51 | 22.56 (10.08) | 21 (14) | 15 [1; 50] |
| Reading (grade five) raw | Initiated | 8.97 | 16.64 (6.69) | 16 (11) | 15 [1; 33] |
| | Control | 10.47 | 16.47 (6.8) | 16 (11) | 14 [1; 32] |
| | Total | 9.2 | 16.62 (6.7) | 16 (11) | 15 [1; 33] |
| Numeracy (grade eight) raw | Initiated | 8.69 | 20.25 (10.16) | 18 (15) | 16 [1; 56] |
| | Control | 11.16 | 19.48 (10.09) | 18 (13) | 15 [1; 57] |
| | Total | 9.07 | 20.14 (10.16) | 18 (15) | 15 [1; 57] |
| English (grade eight) raw | Initiated | 9.61 | 23.11 (12.32) | 21 (20) | 10 [1; 56] |
| | Control | 11.86 | 23.03 (12.48) | 22 (21) | 10 [1; 56] |
| | Total | 9.96 | 23.1 (12.34) | 21 (20) | 10 [1; 56] |
| Reading (grade eight) raw | Initiated | 9.05 | 20.29 (8.68) | 19 (12) | 20 [1; 47] |
| | Control | 10.55 | 19.85 (8.89) | 19 (13) | 15 [1; 46] |
| | Total | 9.28 | 20.22 (8.72) | 19 (13) | 15 [1; 47] |
| Numeracy (grade five) std. | Initiated | 4.75 | −0.47 (0.99) | −0.58 (1.47) | {−0.67; −1.01} [−2.98; 2.25] |
| | Control | 6.08 | −0.5 (0.99) | −0.63 (1.48) | −1.4 [−2.93; 2.25] |
| | Total | 4.95 | −0.47 (0.99) | −0.58 (1.47) | −1.4 [−2.98; 2.25] |
| English (grade five) std. | Initiated | 20.21 | −0.4 (1.01) | −0.55 (1.51) | {−1.19; −1.04} [−3.07; 2.08] |
| | Control | 22.17 | −0.38 (1.05) | −0.52 (1.54) | −0.49 [−3.34; 2.22] |
| | Total | 20.51 | −0.4 (1.02) | −0.55 (1.51) | −1.19 [−3.34; 2.22] |
| Reading (grade five) std. | Initiated | 8.97 | −0.46 (1.03) | −0.48 (1.64) | −0.78 [−3.27; 2.06] |
| | Control | 10.47 | −0.49 (1.05) | −0.62 (1.64) | {−1.13; −0.93; 0.2} [−3.1; 1.92] |
| | Total | 9.2 | −0.46 (1.03) | −0.52 (1.64) | −0.78 [−3.27; 2.06] |
| Numeracy (grade eight) std. | Initiated | 8.69 | −0.58 (0.94) | −0.74 (1.34) | −1.24 [−2.51; 2.43] |
| | Control | 11.16 | −0.66 (0.92) | −0.83 (1.32) | {−1.57; −0.33} [−2.46; 2.4] |
| | Total | 9.07 | −0.59 (0.94) | −0.75 (1.33) | {−1.24; −0.81} [−2.51; 2.43] |
| English (grade eight) std. | Initiated | 9.61 | −0.41 (1.02) | −0.56 (1.68) | −1.59 [−2.35; 2.18] |
| | Control | 11.86 | −0.43 (1.04) | −0.57 (1.73) | −1.49 [−2.34; 2.02] |
| | Total | 9.96 | −0.41 (1.03) | −0.56 (1.69) | {−1.59; −1.49} [−2.35; 2.18] |
| Reading (grade eight) std. | Initiated | 9.05 | −0.61 (0.96) | −0.7 (1.41) | −1 [−3.05; 2.16] |
| | Control | 10.55 | −0.67 (0.98) | −0.77 (1.48) | −1.32 [−2.98; 1.82] |
| | Total | 9.28 | −0.62 (0.97) | −0.7 (1.42) | −1 [−3.05; 2.16] |
| Age at grade five national test | Initiated | | 10.2 (0.29) | 10.17 (0.5) | 10.58 [9.08; 11.67] |
| | Control | | 10.21 (0.3) | 10.17 (0.42) | 10.58 [9.66; 11.67] |
| | Total | | 10.2 (0.29) | 10.17 (0.5) | 10.58 [9.08; 11.67] |
| Mother's age at birth | Initiated | | 27.78 (5.38) | 28 (8) | 28 [15; 48] |
| | Control | | 28.1 (5.54) | 28 (8) | 28 [16; 45] |
| | Total | | 27.83 (5.41) | 28 (8) | 28 [15; 48] |
| Visits learning problems | Initiated | | 2.15 (7.98) | 0 (0) | 0 [0; 157] |
| | Control | | 2.4 (9.18) | 0 (0) | 0 [0; 153] |
| | Total | | 2.19 (8.18) | 0 (0) | 0 [0; 157] |
| Visits sleeping disorders | Initiated | | 0.23 (1.86) | 0 (0) | 0 [0; 53] |
| | Control | | 0.22 (1.83) | 0 (0) | 0 [0; 44] |
| | Total | | 0.23 (1.86) | 0 (0) | 0 [0; 53] |
| Estimated medicated days | Initiated | | 622.18 (162.86) | 700 (123) | 731 [1; 1096] |
| | Control | | 90.76 (171.37) | 0 (92.5) | 0 [0; 639] |
| | Total | | 540.56 (252.34) | 685 (280) | 0 [0; 1096] |
| Estimated number of blocks | Initiated | | 1.31 (0.64) | 1 (0) | 1 [1; 9] |
| | Control | | 0.41 (0.67) | 0 (1) | 0 [0; 3] |
| | Total | | 1.18 (0.72) | 1 (0) | 1 [0; 9] |

IQR, interquartile range; SD, standard deviation.

‘Initiated’ refers to having had an attention-deficit/hyperactivity disorder medication dispensed in their name from a pharmacy in a grace period between 365 and 455 days after their grade five national test. ‘Control’ refers to not having had any attention-deficit/hyperactivity disorder medication dispensed from a pharmacy during the same grace period.

Main analysis

Figure 4 shows the posterior distributions of the estimated causal effects, both for the never-medicated-before sample (panel a, top) and the no-washout sample (panel b, bottom).

Top panel (a) shows three panels for the three national test domains, each showing three bell-shaped curves, one for each estimate's posterior distribution of the average effect of initiating medication on the grade eight national test standardized score. The compatibility intervals shown suggest the effects are mostly between 0 and 0.1 standardized score points, with the numeracy domain showing the largest effect, particularly for the SATU. Bottom panel (b) shows a similar image, with the main difference being that for the numeracy domain the SATE posterior distribution is further away from zero than the SATT distribution.
Figure 4.

Posterior distributions for the expected difference in grade eight national test standardized score between the different emulated arms for attention-deficit/hyperactivity disorder pharmaceutical treatment. Estimates come from a semiparametric outcome model using Bayesian additive regression trees (BART) with random intercepts and slopes for school at grades five and eight. Dotted lines denote 95% compatibility intervals. Domains: English (EN), numeracy or mathematics (MA), reading (RE). Estimands: sample average treatment effect (SATE), sample average treatment effect on the treated (SATT), and sample average treatment effect on the untreated (SATU). (a) Main analysis on the not-previously medicated sample (n = 8458). (b) Sensitivity analysis on the no-washout sample (n = 11 835).
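The three estimands named in the caption differ only in the group of pupils over which predicted differences are averaged. The sketch below illustrates this with a g-formula (standardization) on hypothetical toy data. It is not the authors' implementation: the BART outcome model is swapped for simple cell means within one binary covariate, and all records, names, and values are invented for illustration.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical toy records: (covariate stratum x, treatment a, outcome y).
records = [
    (0, 1, -0.4), (0, 1, -0.2),
    (0, 0, -0.5), (0, 0, -0.3), (0, 0, -0.4),
    (1, 1, 0.1), (1, 1, 0.3), (1, 1, 0.2),
    (1, 0, -0.1),
]

# Stand-in outcome model: mean outcome in each (treatment, stratum) cell.
cells = defaultdict(list)
for x, a, y in records:
    cells[(a, x)].append(y)
cell_mean = {key: fmean(ys) for key, ys in cells.items()}


def average_effect(rows):
    """Average the predicted treated-minus-control difference over rows."""
    return fmean([cell_mean[(1, x)] - cell_mean[(0, x)] for x, _, _ in rows])


sate = average_effect(records)                               # whole sample
satt = average_effect([r for r in records if r[1] == 1])     # treated only
satu = average_effect([r for r in records if r[1] == 0])     # untreated only

print(round(sate, 3), round(satt, 3), round(satu, 3))
```

In this toy example the effect is larger in stratum 1, where most treated pupils are found, so the SATT exceeds the SATE, which in turn exceeds the SATU. When effects are homogeneous across covariates, the three estimands coincide.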

For the never-medicated-before sample, the SATE had a posterior mean (and compatibility interval) of 0.050 (CI95, 0.016; 0.084), 0.078 (CI95, 0.042; 0.117), 0.073 (CI95, 0.035; 0.108) standardized score points for the domains of English, numeracy, and reading, respectively (see Table 3).

Table 3.

Sample average treatment effect estimates of initiating attention-deficit/hyperactivity disorder pharmaceutical treatment on national test scores by domain across different model specifications

| Model specification | Domain | Mean difference [CI95] | N (% of eligible) |
| --- | --- | --- | --- |
| Complete case washout. g-formula w/BART | Numeracy | 0.054 [0.037; 0.07] | 5721 (63.7%) |
| | English | −0.019 [−0.037; −0.002] | 5710 (63.6%) |
| | Reading | 0.017 [−0.002; 0.035] | 5726 (63.8%) |
| Multiple imputation, washout. g-formula w/BART | English | 0.05 [0.016; 0.084] | 8458 (94.2%) |
| | Reading | 0.073 [0.038; 0.107] | 8458 (94.2%) |
| | Numeracy | 0.078 [0.042; 0.114] | 8458 (94.2%) |
| Multiple imputation, no-washout. g-formula w/BART | English | 0.037 [−0.003; 0.076] | 11 835 (94.4%)a |
| | Reading | 0.071 [0.03; 0.111] | 11 835 (94.4%)a |
| | Numeracy | 0.063 [0.016; 0.111] | 11 835 (94.4%)a |

BART, Bayesian additive regression trees; CI95, 95% compatibility interval.

a This reported percentage is relative to the eligible sample without exclusions for being previously medicated.

Regarding heterogeneity, the estimated SATT posterior distributions show a small shift towards negative values for the numeracy domain. An ANOVA indicates that about 31.2% of the variability in effect estimates is accounted for by distinguishing the domain of the national test.
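The share of variability attributed to a grouping factor in a one-way ANOVA is the between-group sum of squares divided by the total sum of squares. The sketch below shows that calculation on hypothetical effect estimates grouped by test domain. The text does not state which values entered the ANOVA, so the toy numbers and the function name are assumptions made purely for illustration and do not reproduce the reported 31.2%.

```python
from statistics import fmean


def share_explained_by_group(groups):
    """Between-group sum of squares over total sum of squares (eta squared)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = fmean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return ss_between / ss_total


# Hypothetical effect estimates per domain (not taken from the study).
toy_estimates = {
    "English": [0.03, 0.05, 0.07],
    "Numeracy": [0.06, 0.08, 0.10],
    "Reading": [0.05, 0.07, 0.09],
}

eta_squared = share_explained_by_group(toy_estimates)
print(round(eta_squared, 3))  # 0.368
```

A value near 0 would mean that knowing the domain says little about the size of the estimated effect, whereas a value near 1 would mean that estimates differ almost only between domains.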

Sensitivity analyses

Regarding results for the no-washout sample, the corresponding SATE and compatibility intervals were 0.037 (CI95, −0.003; 0.076), 0.063 (CI95, 0.016; 0.111), 0.071 (CI95, 0.03; 0.111) standardized score points for the domains of English, numeracy, and reading, respectively (see Fig. 5 and Table 3).

A line range plot with different model specifications and the corresponding difference between posttest and pretest national test standardized scores. There are three models for three different subsets: complete case with full washout period, multiple imputation with full washout period, and multiple imputation without washout period. All estimated effects and intervals are contained between 0 and 0.12, except for English in the complete case subset.
Figure 5.

Sample average treatment effect estimates of initiating attention-deficit/hyperactivity disorder pharmaceutical treatment on national test scores by domain across different model specifications. 95% intervals correspond to compatibility intervals. Domains: English (EN), numeracy or mathematics (MA), reading (RE).

Given the stark difference relative to Ref. [19], we performed an informal replication analysis to see whether the different effect sizes could be explained by the differing populations. After adjustment, we obtained similar point estimates: −0.180 (standard error [SE], 0.07) and −0.176 (SE, 0.203), for Keilow et al.’s population and ours, respectively. We could not replicate their unadjusted effects, suggesting that the two populations are only somewhat comparable. Further details are available in the shared code.

Discussion

Key results

Our main analysis suggests that, on average, initiating medication has a small positive long-term effect on national test scores in the population of children diagnosed with ADHD. Informally, the estimated compatibility intervals suggest that medication initiation bridges between 3.9% and 20.5% of the average gap between the grade eight scores of the ADHD sample relative to the general sample (see right panel of Supplementary Fig. S1). This is a slight overestimation as the achievement gap is calculated with the scores of medicated children. The estimated magnitude is consistent with long-term effect results from reviews [18] and from the MTA trial [11–13].
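The range of 3.9% to 20.5% can be approximately recovered by dividing the compatibility interval bounds by the size of the achievement gap in each domain. The sketch below assumes that the gap is well approximated by the absolute mean standardized grade eight score of the total sample in Table 2. The text itself refers to Supplementary Fig. S1, which is not reproduced here, so this is an illustration of the arithmetic rather than the authors' exact calculation.

```python
# Assumed achievement gap: absolute mean standardized grade eight score
# of the total washout sample (Table 2, 'Total' rows).
gap = {"English": 0.41, "Numeracy": 0.59, "Reading": 0.62}

# 95% compatibility intervals for the SATE, multiple imputation with
# washout (Table 3).
ci95 = {
    "English": (0.016, 0.084),
    "Numeracy": (0.042, 0.114),
    "Reading": (0.038, 0.107),
}

# Percentage of the gap bridged at the lower and upper interval bounds.
bridged = {
    domain: (100 * lower / gap[domain], 100 * upper / gap[domain])
    for domain, (lower, upper) in ci95.items()
}

lowest = round(min(lower for lower, _ in bridged.values()), 1)
highest = round(max(upper for _, upper in bridged.values()), 1)
print(lowest, highest)  # 3.9 20.5
```

Under this assumption both extremes come from the English domain, which has the smallest gap and the lowest interval bounds.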

Effects vary minimally across test domains, with numeracy showing a slightly more positive effect. Detected heterogeneity of the effects via random slopes or interactions was negligible. Unobserved confounding was detected only for the reading domain (see Supplementary Material, Section E), warranting cautious interpretation.

Regarding differences in effect sizes compared with some previous studies, we mostly attribute them to differences in estimand [40]. Under an analytic and identifying strategy comparable to that of Ref. [19], we also estimate an effect that is an order of magnitude larger in the Norwegian population. However, this effect is not a long-term intention-to-treat effect, but, rather, the effect of discontinuation among those who initiated medication (assuming random and sequentially exchangeable non-responsiveness to medication).

Generalizability

Our studied population is restricted to children born in Norway, due to available registry data. Children born outside Norway face significant challenges throughout their education, which is reflected in lower levels of achievement and performance [41]. However, they are also less frequently diagnosed with ADHD [42], limiting their study participation so that their inclusion would not drastically alter our findings.

Moreover, the present study focused only on a long-term ‘as-started’ effect. This estimand informs about general policy effectiveness but does not address whether adhering to ADHD medication improves academic achievement, which may be more relevant to users themselves. However, discontinuation of treatment is not necessarily a treatment protocol violation [43], so ideally we would have data on the reason for discontinuation. Still, if in addition to satisfactory adjustment for baseline confounding we unrealistically assume perfect protocol adherence in the initiated arm, our intention-to-treat analysis would well approximate a per-protocol analysis. The estimated effect would be attenuated because 32.1% of control arm pupils had at least one medicated day.

Limitations

Ideally, we would have intended daily dose, treatment duration, and prescription purpose (titration or stockpiling) data to inform about the actual treatment adherence of children. Instead, we used an algorithm [44] to estimate daily dosages and treatment duration, introducing an additional source of uncertainty in the process.

As with most observational studies, our causal interpretation rests on assuming no unobserved confounding. Any variable that would negatively affect academic achievement and increase the probability of treatment could partly explain why our estimated effect is so small. For example, disruptive behaviour in the classroom influences medication initiation independent of symptom severity [45]. However, such an unobserved confounder would have to go beyond what was already adjusted for in our flexible model with a rich set of covariates, and would also have to have gone undetected by the negative control exposure analysis.

Implications

The main result of the present study suggests that initiating or continuing ADHD medication approximately one year after the grade five national test (and having an adherence pattern as naturally occurring in this subpopulation of children in this period) has a very small positive average effect on the later grade eight national test, for all three domains. This estimated magnitude was robust across a variety of model specifications, population subsets, and identifying strategies, although results for reading are more uncertain. This implies that ADHD pharmaceutical treatment initiation has a negligible average long-term effect on the acquisition of basic academic skills as measured by the national tests.

This does not rule out substantial positive (or negative) effects for specific children. However, individual treatment effects are harder to identify [46], and our model captured little to no heterogeneity of effects across measured covariates.

Our analysis is solely concerned with academic achievement, excluding other outcomes for which children might be medicated, like behavioural regulation.

Future research

Besides replicating the analysis in other national registries with different contexts, we emphasize the need for studies that better quantify possible bias sources [47]. For example, studies on the causes of prescription and dispensation of ADHD pharmaceutical treatment, as well as the causes of exemption from national tests among children diagnosed with ADHD would inform on the extent of confounding and selection bias present, respectively. Likewise, quantitative validation studies into diagnosis practices among general practitioners and specialists in Norway would better inform inclusion criteria and the relevance of our analytic samples in generalizing to the broad population of children diagnosed with ADHD outside Norway.

Conclusion

A methodological conclusion is the importance of differentiating estimands [40] and triangulating identifying strategies for the estimation of causal effects [48, 49]. Moreover, we highlight the use of negative controls in registry-based studies, where often a wide range of variables at different time points is available.

A clinical conclusion is that pharmaceutical treatment—as it was provided and implemented in the period and population of eligible children in this study—is insufficient to, on average, improve academic achievement to a relevant degree. Put differently, the achievement gap between the general population of children and those diagnosed with ADHD is far from being bridged by medication initiation.

Acknowledgements

The authors thank the anonymous reviewers for their valuable suggestions, as well as Jon Michael Gran for his input on the design stage and Johan de Aguas for his helpful comments and suggestions on the article.

Author contributions

Conceptualization: Tomás Varnet Pérez and Guido Biele; Methodology: Tomás Varnet Pérez and Guido Biele; Formal Analysis: Tomás Varnet Pérez; Data curation: Guido Biele and Tomás Varnet Pérez; Writing - Original Draft: Tomás Varnet Pérez; Writing - Review & Editing: Tomás Varnet Pérez, Guido Biele, Kristin Romvig Øvergaard, and Arnoldo Frigessi; Visualization: Tomás Varnet Pérez; Supervision: Guido Biele, Arnoldo Frigessi, and Kristin Romvig Øvergaard; Project Administration: Guido Biele; Funding Acquisition: Guido Biele. All authors read and approved the final article version.

Supplementary data

Supplementary data is available at IJE online.

Conflict of interest: None declared.

Funding

This work was supported by a grant from the Research Council of Norway [grant number 302899].

Data availability

All code used for statistical analysis is available on the GitHub repository at https://github.com/tvarnetperez/TTE ADHD. Code used for data linkage and preparation is available upon request to G.B. The sensitive nature of the data does not allow for data sharing. Data access can be requested from the corresponding organizations.

Ethics approval

This study has been approved by the Regional Committees for Medical and Healthcare Research Ethics (REK, Regionale komiteer for medisinsk og helsefaglig forskningsetikk), approval number 96604.

Use of artificial intelligence (AI) tools

Large language model families ChatGPT (developed by OpenAI) and Claude (developed by Anthropic) were selectively used for improving the readability and English grammar of the original written text.

References

1. American Psychiatric Association (APA). Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision. Washington DC: American Psychiatric Association Publishing, 2022.

2. Dardani C, Riglin L, Leppert B et al. Is genetic liability to ADHD and ASD causally linked to educational attainment? Int J Epidemiol 2022;50:2011–23.

3. Jangmo A, Stålhandske A, Chang Z et al. Attention-deficit/hyperactivity disorder, school performance, and effect of medication. J Am Acad Child Adolesc Psychiatry 2019;58:423–32.

4. de Zeeuw EL, van Beijsterveldt CE, Ehli EA, de Geus EJ, Boomsma DI. Attention deficit hyperactivity disorder symptoms and low educational achievement: evidence supporting a causal hypothesis. Behav Genet 2017;47:278–89.

5. Bussing R, Porter P, Zima BT, Mason D, Garvan C, Reid R. Academic outcome trajectories of students with ADHD: does exceptional education status matter? J Emot Behav Disord 2012;20:131–43.

6. Dalsgaard S, Nielsen HS, Simonsen M. Consequences of ADHD medication use for children’s outcomes. J Health Econ 2014;37:137–51.

7. Beauchaine TP, Ben-David I, Bos M. ADHD, financial distress, and suicide in adulthood: a population study. Sci Adv 2020;6:eaba1551.

8. Garas P, Balazs J. Long-term suicide risk of children and adolescents with attention deficit and hyperactivity disorder - a systematic review. Front Psychiatry 2020;11:557909.

9. Cortese S, Adamo N, Del Giovane C et al. Comparative efficacy and tolerability of medications for attention-deficit hyperactivity disorder in children, adolescents, and adults: a systematic review and network meta-analysis. Lancet Psychiatry 2018;5:727–38.

10. MTA Cooperative Group. A 14-month randomized clinical trial of treatment strategies for attention-deficit/hyperactivity disorder. Arch Gen Psychiatry 1999;56:1073–86.

11. Swanson J, Arnold LE, Kraemer H et al., MTA Cooperative Group. Evidence, interpretation, and qualification from multiple reports of long-term outcomes in the Multimodal Treatment Study of Children with ADHD (MTA) Part I: executive summary. J Atten Disord 2008;12:4–14.

12. Swanson J, Arnold LE, Kraemer H et al., MTA Cooperative Group. Evidence, interpretation, and qualification from multiple reports of long-term outcomes in the Multimodal Treatment Study of Children with ADHD (MTA) Part II: supporting details. J Atten Disord 2008;12:15–43.

13. Molina BS, Hinshaw SP, Swanson JM et al., MTA Cooperative Group. The MTA at 8 years: prospective follow-up of children treated for combined-type ADHD in a multisite study. J Am Acad Child Adolesc Psychiatry 2009;48:484–500.

14. Storebø OJ, Ramstad E, Krogh HB et al. Methylphenidate for children and adolescents with attention deficit hyperactivity disorder (ADHD). Cochrane Db Syst Rev 2015;11:CD009885.

15. Storebø OJ, Pedersen N, Ramstad E et al. Methylphenidate for attention deficit hyperactivity disorder (ADHD) in children and adolescents–assessment of adverse events in non-randomised studies. Cochrane Db Syst Rev 2018;5:CD012069.

16. Pelham WE III, Altszuler AR, Merrill BM et al. The effect of stimulant medication on the learning of academic curricula in children with ADHD: a randomized crossover study. J Consult Clin Psychol 2022;90:367–80.

17. Arnold LE, Hodgkins P, Kahle J, Madhoo M, Kewley G. Long-term outcomes of ADHD: academic achievement and performance. J Atten Disord 2020;24:73–85.

18. Langberg JM, Becker SP. Does long-term medication use improve the academic outcomes of youth with attention-deficit/hyperactivity disorder? Clin Child Fam Psychol Rev 2012;15:215–33.

19. Keilow M, Holm A, Fallesen P. Medical treatment of attention deficit/hyperactivity disorder (ADHD) and children’s academic performance. PLoS One 2018;13:e0207905.

20. Hernán M, Robins J. Causal Inference: What If. Boca Raton: Chapman & Hall/CRC, 2020 (Revised July 2023).

21. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol 2016;183:758–64.

22. Kutcher SA, Brophy JM, Banack HR, Kaufman JS, Samuel M. Emulating a randomised controlled trial with observational data: an introduction to the target trial framework. Can J Cardiol 2021;37:1365–77.

23. Hernán MA, Sauer BC, Hernández-Díaz S, Platt R, Shrier I. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. J Clin Epidemiol 2016;79:70–5.

24. Hernán MA, Wang W, Leaf DE. Target trial emulation: a framework for causal inference from observational data. JAMA 2022;328:2446–7.

25. Li L, Zhu N, Zhang L et al. ADHD pharmacotherapy and mortality in individuals with ADHD. JAMA 2024;331:850–60.

26. Kazda L, McGeechan K, Bell K, Thomas R, Barratt A. Association of attention-deficit/hyperactivity disorder diagnosis with adolescent quality of life. JAMA Netw Open 2022;5:e2236364.

27. Fiks AG, Mayne S, DeBartolo E, Power TJ, Guevara JP. Parental preferences and goals regarding ADHD treatment. Pediatrics 2013;132:692–702.

28. Von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP, STROBE Initiative. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet 2007;370:1453–7.

29. Benchimol EI, Smeeth L, Guttmann A et al., RECORD Working Committee. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 2015;12:e1001885.

30. Langan SM, Schmidt SA, Wing K et al. The reporting of studies conducted using observational routinely collected health data statement for pharmacoepidemiology (RECORD-PE). BMJ 2018;363:k3532.

31. Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Ann Appl Stat 2010;4:266–98.

32. Tan YV, Roy J. Bayesian additive regression trees and the general BART model. Stat Med 2019;38:5048–69.

33. VanderWeele TJ. Principles of confounder selection. Eur J Epidemiol 2019;34:211–9.

34. Rubin DB. Should observational studies be designed to allow lack of balance in covariate distributions across treatment groups? Stat Med 2009;28:1420–3.

35. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria. https://www.R-project.org/ (14 January 2025, date last accessed).

36. Dorie V. stan4bart: Bayesian Additive Regression Trees with Stan-Sampled Parametric Extensions. R package version 0.0-6. https://CRAN.R-project.org/package=stan4bart (14 January 2025, date last accessed).

37. Lipsitch M, Tchetgen ET, Cohen T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology 2010;21:383–8.

38. Zivich PN, Cole SR, Edwards JK, Mulholland GE, Shook-Sa BE, Tchetgen Tchetgen EJ. Introducing proximal causal inference for epidemiologists. Am J Epidemiol 2023;192:1224–7.

39. Tchetgen Tchetgen EJ, Ying A, Cui Y, Shi X, Miao W. An introduction to proximal causal inference. Statist Sci 2024;39:375–90.

40. Lundberg I, Johnson R, Stewart BM. What is your estimand? Defining the target quantity connects statistical evidence to theory. Am Sociol Rev 2021;86:532–65.

41. Taguma M, Shewbridge C, Huttova J, Hoffman N. OECD Reviews of Migrant Education. Norway. OECD Paris. https://www.regjeringen.no/ (14 January 2025, date last accessed).

42. Hansen T, Hauge L, Biele G et al. Developmental disorders among Norwegian born children with immigrant parents. Child Adolesc Psychiatry Ment Health 2023;17:3–11.

43. Hernán MA, Robins JM. Per-protocol analyses of pragmatic trials. N Engl J Med 2017;377:1391–8.

44. Tanskanen A, Taipale H, Koponen M et al. From prescription drug purchases to drug use periods–a second generation method (PRE2DUP). BMC Med Inform Decis Mak 2015;15:1–13.

45. Russell A, Ford T, Russell G. Barriers and predictors of medication use for childhood ADHD: findings from a UK population-representative cohort. Soc Psychiatry Psychiatr Epidemiol 2019;54:1555–64.

46. Vegetabile BG. On the Distinction Between “Conditional Average Treatment Effects” (CATE) and “Individual Treatment Effects” (ITE) Under Ignorability Assumptions. arXiv, 10 August 2021, preprint: not peer reviewed.

47. Greenland S. Multiple-bias modelling for analysis of observational data. J R Stat Soc Ser A Stat Soc 2005;168:267–306.

48. Biele G, de Aguas J, Varnet Pérez T. What can we conclude about the effect of parental income on offspring mental health? Int J Epidemiol 2023;52:641–3.

49. Hammerton G, Munafò MR. Causal inference with observational data: the need for triangulation of evidence. Psychol Med 2021;51:563–78.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact [email protected].

Supplementary data