Early Stimulation and Nutrition: The Impacts of a Scalable Intervention

Table 1.

Costs of the original program and its improvement US |${\$}$| per child per year.

	Original program	Additional intervention costs
Materials	\|${\$}$\|8	\|${\$}$\|27
Other administration costs	\|${\$}$\|2	–
Salary FAMI mother	\|${\$}$\|240	–
Mentoring	0	\|${\$}$\|88
Total without nutrition	\|${\$}$\|250	\|${\$}$\|115
Nutrition	\|${\$}$\|77	\|${\$}$\|209
Total with nutrition	\|${\$}$\|327	\|${\$}$\|322
FAMI training	N.A.	\|${\$}$\|11 one-time cost

The cost of the intervention we are evaluating, which is relevant for both its scalability and its cost-effectiveness, should be distinguished from the cost of the original program. As shown in Table 1, a substantial part of the cost of the original program is the salary of the FAMI mothers, which did not change, as the intervention neither hired additional FAMI mothers nor reduced the number of children served by each FAMI. However, a substantial component of the cost of improving the existing program is the monitoring and mentoring that the FAMI mothers now receive. This amounts to |${\$}$|88 US per child per year, which covers the salaries of the tutors. For comparison, the FAMI mother’s salary corresponds to |${\$}$|240 US per child per year. Including the |${\$}$|27 US for materials yields a total cost of the coaching component of |${\$}$|115 US. Excluding the nutritional component in both the original program and this intervention, the FAMI intervention we are considering increases the cost of the program by about 46% (|${\$}$|115 US on top of |${\$}$|250 US). We consider the initial facilitator training (|${\$}$|11 US) a one-off expense incurred in the first year. As it could benefit subsequent cohorts of children, it should be seen as an investment with some durability.¹⁴ The largest increase in cost comes from the added nutritional package, whose cost is 2.71 times the regular one, rising from |${\$}$|77 US to |${\$}$|209 US per child per year. Overall, the total increase in the cost of the program is |${\$}$|322 US (or |${\$}$|333 US including the one-off initial training), which effectively amounts to doubling the original cost of |${\$}$|327 US per child per year. Online Appendix A offers additional details on the cost of each component, and Online Appendix B includes a more thorough discussion of costs and scalability.
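The ratios quoted in this paragraph can be reproduced directly from Table 1. The short sketch below is only an arithmetic check; the variable names are ours and are not part of the original costing exercise.

```python
# Figures copied from Table 1 (US$ per child per year).
original = {"materials": 8, "other_administration": 2,
            "fami_salary": 240, "mentoring": 0}
added = {"materials": 27, "mentoring": 88}
nutrition_original, nutrition_added = 77, 209

# Totals with nutrition as reported in Table 1. Note that 115 + 209 = 324,
# slightly above the reported 322; the sketch uses the reported total.
total_original_reported, total_added_reported = 327, 322
training_one_off = 11

original_without_nutrition = sum(original.values())  # 250
added_without_nutrition = sum(added.values())        # 115

# Relative increase in cost, excluding nutrition.
increase_without_nutrition = added_without_nutrition / original_without_nutrition
# Cost of the added nutritional package relative to the regular one.
nutrition_ratio = nutrition_added / nutrition_original
# Total added cost relative to the original total (close to 1, i.e. a doubling).
total_ratio = total_added_reported / total_original_reported

print(f"{increase_without_nutrition:.0%}")  # 46%
print(f"{nutrition_ratio:.2f}")             # 2.71
print(f"{total_ratio:.2f}")                 # 0.98
print(total_added_reported + training_one_off)  # 333
```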

3. Sampling Design, Descriptive Statistics, and Implementation

The study took place between September 2014 and July 2016. At the start of the project, we prepared a pre-analysis plan and registered the trial at the ISRCTN registry (Online Appendix H).¹⁵ The intervention was intended to operate for 15 months between the end of 2014 and March 2016. In practice, the total duration varied by community, mainly to accommodate the initial training, and averaged 45 weeks (10.4 months), with a range of 34–58 weeks. The logistics of the rollout, mostly organizational issues, therefore implied considerable variation in the target children’s exposure to the intervention.

The study towns were located in three departments in central Colombia (Cundinamarca, Boyacá, and Santander). They were all chosen to have (i) fewer than 40,000 inhabitants, to avoid large urban centers; (ii) at least two FAMI units;¹⁶ and (iii) no more than one unit of another public parenting program called Modalidad Familiar (MF), to minimize attrition towards this alternative program. MF is similar to FAMI and was introduced during the first half of 2014.¹⁷ The presence of MF is balanced between control and treatment sample towns, so we are de facto estimating the effect of enhancing the FAMI program in the presence of some MF. Importantly for interpreting the results of our evaluation, the presence of MF in the study sample is minimal, with only 7% of the target children leaving FAMI to join MF. We further discuss this issue below.

Out of a universe of 135 such towns in these departments, we randomly drew 49 for the treatment group and 47 for the control. We assigned the remaining 39 towns to a randomly ordered waiting list. Towns in this waiting list were used to replace towns that had completely transitioned to the new MF program (whether in treatment or control). We could successfully replace 10 of the 19 towns that no longer ran the FAMI program, which yielded a final sample of 87 towns: 46 in the treatment group and 41 in the control group.

The average number of children younger than two per FAMI unit in the sample was 9.5 (SD = 2.9), and the average number of pregnant women was 2.1 (SD = 1.7). This implies an average of 11.6 (SD = 2.8) total beneficiaries per FAMI unit. Within each unit, we enrolled in the study all children under 12 months of age at baseline, leading to a sample of N = 1,460 children (4.3 children per FAMI and 17 per town, on average). We chose this subsample of children in order to maximize the potential time of exposure to our intervention, before children outgrow the FAMI program at age two. Overall, a total of 702 children in 171 FAMI units in 46 towns received the treatment (our enhanced version of the FAMI program); and 758 children in 169 FAMI units in 41 towns were in the control group, and therefore continued to receive the FAMI program as usual. At follow-up, we tried to reach all children in the study sample, regardless of whether they were still attending a FAMI or not, and regardless of the length of their exposure to FAMI.
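As a quick consistency check on the sampling figures reported in this section, the sketch below recomputes the town, unit, and child counts. It uses only numbers stated in the text; the variable names are illustrative.

```python
# Town-level sampling frame and random draws.
universe = 135
drawn_treatment, drawn_control, waiting_list = 49, 47, 39
assert drawn_treatment + drawn_control + waiting_list == universe

# 19 towns stopped running FAMI; 10 of them could be replaced.
lost_to_mf, replaced = 19, 10
final_towns = drawn_treatment + drawn_control - lost_to_mf + replaced  # 87
final_treatment_towns, final_control_towns = 46, 41

# Children and FAMI units in the final sample.
children_treatment, children_control = 702, 758
units_treatment, units_control = 171, 169
children = children_treatment + children_control  # 1,460
units = units_treatment + units_control           # 340

children_per_unit = children / units        # about 4.3
children_per_town = children / final_towns  # about 17

print(final_towns, children, units)
print(round(children_per_unit, 1), round(children_per_town))
```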

Online Appendix C provides further details on the study design, including power calculations, the study flow of participants, and the geographic distribution of treatment and control towns.

3.1. Data

As described in the pre-analysis plan, reported in Online Appendix H, we defined a number of primary outcomes. These included nutritional status, measured by externally standardized height-for-age Z-scores constructed following the World Health Organization (WHO) standards; cognitive, language, and motor development, as measured by the Bayley Scales of Infant and Toddler Development, third edition (Bayley-III; Bayley 2006); and socio-emotional development, as measured by the Ages and Stages Questionnaire: Socio-Emotional (ASQ:SE) (Squires, Bricker, and Twombly 2009). We chose developmental tests that have been extensively used in evaluations of early care or education and/or have been recommended for LMICs (Fernald et al. 2017a). These instruments were either available in Spanish or had been previously translated, as they had been used in Colombia before among similar populations. Anthropometric measures were collected in both rounds, whereas developmental measures were only collected at follow-up. At baseline, children were younger than one year of age. Given the limited resources we had and how complex and expensive it is to reliably assess the development of such young children, we decided not to assess child development at baseline.¹⁸

For the analyses, we used internally age-standardized Bayley-III scores, where raw scores were standardized using the sample mean and SD calculated from weighted local smoothing regressions. We also aggregated all Bayley-III subscales using the factor model described in Online Appendix D, which we interpret to reflect the child’s “cognitive” development. Children with extreme values for developmental or nutritional outcomes, according to international standards, were excluded from the analyses.¹⁹
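To illustrate what internal age-standardization involves, the following is a minimal sketch, not the code used in the study. It assumes a Gaussian kernel, a local-constant smoother, and an arbitrary bandwidth of 3 months; the function name `age_standardize` and the simulated data are hypothetical.

```python
import numpy as np

def age_standardize(raw, age_months, bandwidth=3.0):
    """Standardize raw scores by an age-specific mean and SD.

    The mean and SD at each child's age are kernel-weighted averages over
    the whole sample (Gaussian kernel; the bandwidth is an assumption).
    """
    raw = np.asarray(raw, dtype=float)
    age = np.asarray(age_months, dtype=float)
    # n x n matrix of kernel weights between every pair of ages.
    diff = (age[:, None] - age[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    local_mean = weights @ raw                # smoothed mean at each age
    resid = raw - local_mean
    local_sd = np.sqrt(weights @ resid ** 2)  # smoothed SD at each age
    return resid / local_sd

# Simulated example: raw scores that increase with age.
rng = np.random.default_rng(0)
age = rng.uniform(17, 33, size=500)
raw = 40 + 0.5 * age + rng.normal(0, 5, size=500)
z = age_standardize(raw, age)
```

The resulting scores have approximately zero mean and unit variance at every age, so that children of different ages can be compared on a common scale.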

In order to obtain an understanding of the mechanisms at play, we also estimate impacts on intermediate outcomes that could have mediated the effect of the intervention on children’s developmental outcomes. In particular, we collected by maternal report, both at baseline and at follow-up, information on variables that measure the quality of the home environment, maternal self-efficacy, maternal knowledge about child development, and food insecurity.

For the quality of the home environment, we used four variables constructed from items in UNICEF’s Family Care Indicators (FCI, Kariger et al. 2012): the number of magazines, books, or newspapers in the home; the number of toy sources; the number of varieties of play materials in the home; and the number of varieties of play activities the child engaged in with an adult over the 3 days before the interview. These four variables were summarized in a single factor, labelled “parental investment” and estimated using the factor model described in Online Appendix D. We assessed maternal self-efficacy using the self-efficacy in the nurturing role scale in Porter and Hsu (2003). This scale contains 16 items, rated on 7-point scales, that pertain to mothers’ perceptions of their competence in the basic skills required to care for an infant. To measure maternal knowledge about child development, we used 10 items, some selected from the Knowledge of Infant Development Inventory (KIDI, MacPhee 1981) and some developed by the research team.

Food insecurity was measured with the Latin American Scale for the Measurement of Food Insecurity (ELCSA scale), both at baseline and at follow-up. The ELCSA had been previously validated in Colombia and classifies households into four food insecurity levels: secure, mild insecurity, moderate insecurity, and severe insecurity (Álvarez Uribe and Instituto Colombiano de Bienestar Familiar 2008). In the analysis, we use an indicator that equals 1 if the household is food insecure (mild, moderate, or severe) and 0 otherwise.

Detailed socio-economic household information was also collected, including maternal vocabulary scores (a proxy for maternal IQ), which were assessed using the Spanish version of the Peabody Picture Vocabulary Test (PPVT), the Test de Vocabulario en Imagenes Peabody (TVIP) (Padilla, Lugo, and Dunn 1986).

Finally, background information on FAMI mothers was gathered directly from them in both rounds. In addition to basic socio-demographic characteristics, we also collected their vocabulary scores and knowledge of child development using the same tests as for mothers.

3.2. Descriptive Statistics

Table 2 shows baseline characteristics by treatment status. At baseline, children were, on average across the treatment and control groups, 5.6 months of age, and in about 27% of cases the father was absent from the household. Households had two children, on average; average maternal schooling was 8.6 years; and 23% of mothers were teenagers. For comparison, in 2010 the teenage pregnancy rate was 21% nationwide and 30% among girls living in households in the poorest income quintile.

Table 2.

Sociodemographic characteristics of children and their families at baseline.

	Treatment	Control	p-value	RW
Sociodemographic characteristics
Child’s age in months	5.72	5.51	0.353	0.945
	(3.39)	(3.26)
Child’s birth weight (gr)	3189	3156	0.442	0.956
	(572)	(500)
Maternal age (number of years)	26.16	26.47	0.421	0.956
	(6.84)	(6.70)
Maternal years of schooling	8.85	8.41	0.121	0.688
	(3.42)	(3.31)
Household Income (COP thousands)	526.1	477.2	0.232	0.883
	(388.1)	(340.7)
Household size	4.08	4.10	0.932	0.976
	(1.47)	(1.43)
Maternal PPVT (raw score)	22.32	19.76	0.037	0.386
	(8.53)	(8.08)
Child’s gender (% male)	51.9	50.9	0.729	0.976
First born (%)	46.6	45.1	0.655	0.976
Teenage mothers (%)	25.4	20.9	0.059	0.508
Father present (%)	69.7	75.1	0.031	0.386
Owns home (%)	37.1	39.6	0.623	0.976
Household in poverty (%)\|$^{\rm a}$\|	58.7	64	0.298	0.920
Intermediate outcomes
Parental Investment\|$^{\rm b}$\|	-0.03	0.03	0.625	0.948
	(0.96)	(1.02)
Maternal knowledge\|$^{\rm c}$\|	29.26	29.49	0.680	0.948
	(3.61)	(3.44)
Maternal self-efficacy	26.50	26.49	0.974	0.978
	(5.51)	(4.67)
Food insecurity (%)	50.4	41.9	0.219	0.631
No. of observations	700	756

Notes. Standard deviations (clustered by town) in parentheses. RW: p-values adjusted for multiple testing using the Romano–Wolf (Romano and Wolf 2005, 2016) step-down method. In this case all hypotheses in the panel are included in the RW p-value calculation. Household Income is measured in thousands of Colombian Pesos (COP).

a. % of households with total income below the poverty line in 2014 (|${\$}$|50 US per person per month).

b. Factor score of FCI subscales.

c. Only available at follow-up (raw scores presented).

The target population was particularly poor: Average household income was COP 501,000 per month (|${\$}$|178 US), which represents 81% of the legal monthly minimum wage in 2014. Close to 70% of these households had answered the SISBEN survey for screening of social program eligibility, a good proxy for poverty, and 96% of those surveyed were deemed eligible for social programs (i.e. they scored in SISBEN levels 1 and 2). Similarly, 62% of households in the sample had a total income below the poverty line adjusted for household size. For comparison, the poverty rate in semi-urban and rural areas of Colombia was 42% in 2014.

The environment in which the sample children grew up is highly deprived: In terms of the home learning environment (“parental investment”), on average, these households owned 2.6 books, magazines, or newspapers and 1.4 different varieties of play materials for young children in the household, and adults were reported to have engaged in 2.5 different types of play activities with young children over the past 3 days.²⁰ For comparison, among a representative sample of low-middle-income households with children aged 6–12 months in Bogota (Colombia’s capital city), we observed an average of 3.2 different varieties of play materials and 3.4 different types of play activities. Moreover, the median household in this sample only owned three books for adults.

In Table 3, we show averages for the baseline nutritional status of children by treatment status. Specifically, we report weight-for-age, height-for-age, and weight-for-height Z-scores, in addition to a variety of nutritional indicators of deficit or excess as identified by international standards. In our sample, 12% of the children are stunted. For comparison, stunting was about 9.3% for children younger than one year of age in rural areas in Colombia in 2013 and 11.8% in urban areas (as measured in the Colombian Longitudinal Household Survey, CEDE 2013). Table 3 also shows that an additional 15% of children were at risk of stunting, that is, children whose height-for-age was between -2 SD and -1 SD.

Table 3.

Nutritional status of children at baseline by randomization status.

	Treatment	Control	p-value	RW
Weight-for-age z-score	0.26	0.27	0.921	0.988
	(1.39)	(1.42)
Length/height-for-age z-score	-0.01	-0.21	0.241	0.797
	(1.68)	(1.74)
Weight-for-length z-score	0.37	0.55	0.167	0.749
	(1.59)	(1.65)
Underweight (%)	6.4	5.1	0.465	0.918
Risk of underweight (%)	9.1	10.7	0.415	0.918
Wasting (%)	5.9	6.4	0.775	0.988
Risk of wasting (%)	10.9	8.2	0.179	0.749
Stunting (%)	9.2	13.9	0.081	0.501
Risk of stunting (%)	14.7	15.5	0.793	0.988
Overweight (%)	9.9	9.2	0.707	0.988
Obesity (%)	4.8	7.3	0.174	0.749

Notes. Standard deviations (clustered by town) in parentheses. Adjusted p-values using the Romano–Wolf (Romano and Wolf 2005, 2016) procedure (2,000 iterations, clustered by town) are included in the last column. All variables in the table are considered as one group of hypotheses. Underweight: weight-for-age |$<$| -2 SD; risk of underweight: weight-for-age between -2 SD and -1 SD; wasting: weight-for-height |$<$| -2 SD; risk of wasting: weight-for-height between -2 SD and -1 SD; stunting: height-for-age |$<$| -2 SD; risk of stunting: height-for-age between -2 SD and -1 SD; overweight: weight-for-height between 2 SD and 3 SD; and obesity: weight-for-height |$>$| 3 SD.
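The cut-offs listed in the notes to Table 3 can be written as simple classification rules. The sketch below is illustrative only: the function names are ours, and the treatment of values falling exactly on a threshold (for example, a Z-score of exactly -2) is an assumption, since the notes do not specify it.

```python
def classify_deficit(z_score, deficit_label):
    """Classify a Z-score as a deficit, a risk of deficit, or neither.

    Use weight-for-age for 'underweight', weight-for-height for 'wasting',
    and height-for-age for 'stunting'.
    """
    if z_score < -2:
        return deficit_label
    if z_score < -1:  # between -2 SD and -1 SD
        return "risk of " + deficit_label
    return "no deficit"

def classify_excess(weight_for_height_z):
    """Classify a weight-for-height Z-score as overweight, obesity, or neither."""
    if weight_for_height_z > 3:
        return "obesity"
    if weight_for_height_z > 2:  # between 2 SD and 3 SD
        return "overweight"
    return "no excess"

print(classify_deficit(-2.4, "stunting"))  # stunting
print(classify_deficit(-1.3, "stunting"))  # risk of stunting
print(classify_excess(2.5))                # overweight
```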

In Table 4, we report the mean and standard deviation of the cognitive, language, motor, and socio-emotional development measures for the control group at follow-up (ages 17–33 months). The Bayley-III composite scores are scaled so that the US reference population has a mean of 100 and a standard deviation of 15. Subject to all the caveats of such comparisons, this allows us to place our population relative to the expected developmental outcome under favorable conditions. The Bayley-III composite scores were between 0.5 and 0.6 SD below the norming sample mean in both the cognitive and language scales, and 0.4 SD below in the motor scale. We also observed that 18% of children scored between -1 SD and -2 SD with respect to the norming sample in cognition, 23% in language, and 15% in motor development. Only about 2%–3% would be considered at risk of developmental delay, given that their composite scores are below -2 SD.

Table 4.

Developmental outcomes of children in the control group at follow-up.

	Mean (standard deviation)	N
Bayley
Cognitive composite score	91.98 (13.07)	703
Language composite score	91.59 (12.31)	702
Motor composite score	93.97 (12.58)	701
ASQ:SE
Fraction of children at socio-emotional risk	0.38	705

Notes. Standard deviations (clustered by town) in parentheses. Bayley-III composites are computed based on external standardization provided by test developers. The fraction of children at socio-emotional risk by the ASQ:SE is computed using the thresholds provided by the test developers (Squires, Bricker, and Twombly 2009).
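To express the composite scores in Table 4 as distances from the norming sample, one can subtract the reference mean of 100 and divide by the reference SD of 15. The snippet below is a simple arithmetic illustration using the control-group means from Table 4; the names are ours.

```python
NORM_MEAN, NORM_SD = 100.0, 15.0  # scaling of the Bayley-III composite scores

# Control-group means at follow-up, copied from Table 4.
control_means = {"cognitive": 91.98, "language": 91.59, "motor": 93.97}

def sd_gap(mean_score):
    """Distance from the norming-sample mean, in norming-sample SD units."""
    return (mean_score - NORM_MEAN) / NORM_SD

gaps = {scale: round(sd_gap(mean), 2) for scale, mean in control_means.items()}
print(gaps)  # {'cognitive': -0.53, 'language': -0.56, 'motor': -0.4}
```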

In terms of socio-emotional development, 38% of the children were at risk of developmental delay according to thresholds defined by the ASQ:SE using the test norming sample. For comparison, the CEDE 2013 shows that, by the same measure, 22% of children younger than two in low-socioeconomic-status (SES) urban households were at risk of developmental delay, 26% in high-SES urban households, and 19% in rural households in 2013.

Finally, in Online Appendix E, we present the basic characteristics of FAMI mothers by study group. On average, they were 42 years of age, had completed 13 years of education, and had almost 12 years of work experience in the FAMI program. They had an average of 2.5 children of their own. There were no jointly significant differences between FAMI mothers in treatment and control towns.

3.3. Attrition, Compliance, and Dosage

In both treatment and control towns, children in the sample might be “lost” in the follow-up survey and/or might drop out of FAMI. The former is an attrition problem, while the latter is a compliance one. At follow-up, we attempted to reassess all children, including those who dropped out of FAMI, to avoid non-random selection.

We report figures on attrition in Online Appendix F (Table F.1). The overall attrition rate was 8.6%; it was slightly higher in the treatment group (10.6%) than in the control group (6.7%), although the difference is significant only at the 10% level. Children lost at follow-up were older, less likely to have a resident father at home, and more likely to have mothers with lower vocabulary (PPVT) scores. Moreover, as shown by the interactions of the treatment indicator with observables, attrition slightly affected the composition of the treatment and control samples (third column of Online Appendix Table F.1). While the attrition differential between treatment and control towns was not very large, in Online Appendix G we discuss how we deal with the potential bias that it could introduce into our impact estimates, and we show that attrition does not bias our main findings.

Children who dropped out of the FAMI program between baseline and follow-up were, if found, interviewed at follow-up, and their families were asked why they had left FAMI. A total of 47% reported that they had outgrown the program eligibility age, 40% that they had started attending a different ECD public program (12% a parenting program and 28% a childcare program), and 13% that they had moved to another municipality. In Tables F.2 and F.3 in Online Appendix F, we show that the treatment slightly reduces the probability of dropping out of FAMI for an alternative program and is not related to the probability of attending MF.

If age-eligible, a family could have attended a maximum of 44 weekly group sessions and received 11 monthly home visits during the study period. In terms of effective attendance, 77.5% of all children in the treatment group assessed at follow-up participated in at least one FAMI pedagogical activity (group session or home visit), while the rest attended none. Information on participation in specific activities was collected as part of the supervision protocol of the enhanced intervention and is therefore only available for the intervention group. In Figure F.1 in Online Appendix F (graphs (a) and (b)), we show the distribution of children in the intervention group by total exposure to the pedagogical component of the program. Conditional on having attended at least one session, the median number of pedagogical activities attended was 28 out of a total of 55.²¹

As for the main reasons why parents found it difficult to attend group sessions or receive home visits, close to 38% reported child illness, 15% reported maternal illness, and 19% reported conflicts with other commitments. An additional 12% reported difficulties in finding or affording transportation to the meetings, and 10% reported bad weather. The remainder reported other reasons. Children with lower program attendance were older, less likely to live with their fathers, and had younger and more educated mothers. While they had better learning environments at home, they were exposed to more verbal or physical punishment (Table F.4 in Online Appendix F).

Regarding the nutritional component of the intervention, close to 29% of children in the treatment group did not receive any nutritional supplements, and those who received at least one received 9.8 supplements on average (SD = 3.6) out of a maximum of 14 (Online Appendix F, Figure F.1, graph (c)). As the supplements were delivered by the FAMI mother during the first group meeting of each month, non-attendance implied that a beneficiary might not receive the supplement. We cannot verify if and how the nutritional supplement was used at home or the extent to which it was shared within the family.

Compliance with the two components of the program largely involved the same children. In particular, 66% of children in the treatment group received at least one nutritional supplement and attended at least one session, 21% neither received any nutritional supplements nor attended any sessions, 9% attended at least one session but did not receive any nutritional supplements, and 5% received at least one supplement but never attended sessions (Figure F.1, graph (d) in Online Appendix F).

4. Estimating Average Impacts

For each outcome of interest, we estimate Intent to Treat (ITT) effects on children’s development using the regression

$$\begin{equation} y_{isl,1} = \beta _0 + \beta _1 T_{sl} + \delta ^{\prime } X_{isl,0} + F_{l,0} \sigma + D_0 \theta + Z_{isl,1}\rho + \varepsilon _{isl, 1}, \end{equation}$$

(1)

where |$y_{isl,1}$| is an outcome of interest for child i in FAMI unit s in town l at follow-up (t = 1), and |$T_{sl}$| is a dummy equal to 1 if FAMI unit s in town l was in the treatment sample. |$X_{isl,0}$| is a set of baseline child and household characteristics, including the child’s age, gender, and weight-for-age and height-for-age z-scores, the household’s wealth index, maternal PPVT scores (to proxy for maternal IQ), and an indicator for the mother being an adolescent. These are included to improve efficiency and to correct for any minor baseline imbalances caused by attrition.²² |$F_{l,0}$| collects baseline characteristics of town l (the notes to Table 5 list an indicator of high municipality population among the covariates). Finally, |$D_0$| represents a set of department fixed effects, which control for regional differences; |$Z_{isl,1}$| is the vector of tester or interviewer dummies; and |$\varepsilon _{isl,1}$| is the residual term. We cluster standard errors at the town level, which is the unit of randomization.
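For readers who wish to see the mechanics of equation (1), the following is a minimal sketch of an ITT regression with standard errors clustered by town, written with NumPy only. It is not the estimation code used in the paper: the function name `itt_ols_clustered`, the simulated data, the "true" effect of 0.5, and the small-sample correction are all assumptions, and the department and interviewer dummies are omitted for brevity.

```python
import numpy as np

def itt_ols_clustered(y, treat, covariates, cluster):
    """OLS of y on a constant, the treatment dummy, and covariates.

    Returns the treatment coefficient and its cluster-robust standard error.
    """
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    X = np.column_stack([np.ones_like(y), treat, covariates])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sum of outer products of within-cluster scores (the sandwich "meat").
    meat = np.zeros((k, k))
    groups = np.unique(cluster)
    for g in groups:
        idx = cluster == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    G = len(groups)
    # A common finite-sample correction; its use here is an assumption.
    adj = (G / (G - 1)) * ((n - 1) / (n - k))
    V = adj * XtX_inv @ meat @ XtX_inv
    return beta[1], np.sqrt(V[1, 1])

# Simulated data: 87 towns with 17 children each, treatment assigned by town.
rng = np.random.default_rng(1)
n_towns, kids = 87, 17
town = np.repeat(np.arange(n_towns), kids)
treat = (np.arange(n_towns) < 46).astype(float)[town]
x = rng.normal(size=town.size)                  # one baseline covariate
town_shock = rng.normal(0, 0.3, n_towns)[town]  # common town-level shock
y = 0.5 * treat + 0.2 * x + town_shock + rng.normal(size=town.size)

beta1, se1 = itt_ols_clustered(y, treat, x, town)
print(f"ITT estimate: {beta1:.3f} (clustered SE: {se1:.3f})")
```

Clustering matters here because treatment varies only across towns: children in the same town share the town-level shock, so treating them as independent would understate the standard error.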

The presence of the MF program in the town does not bias our impact estimates. MF was in place before randomization, and our sample of children was drawn from those attending the FAMI center at baseline before randomization. Moreover, as documented in Online Appendix F, treatment did not affect the probability of switching to MF, and it only affected that of switching to other alternatives marginally.

In addition to average impacts, we look at impacts across the distribution of outcomes and also analyze the possibility of heterogeneous impacts in two ways. First, we consider the entire distribution of the outcomes of interest in the treatment and control samples and test for differences in these distributions using the Anderson–Darling statistics (Anderson and Darling 1952).²³ Second, we re-estimate equation (1) for subgroups in the evaluation sample. In particular, we divide the sample by wealth, as measured by a household wealth index, by the mother’s education, and by the child’s gender.

5. The Impact of the Improved FAMI

For most outcomes, we measure impacts in terms of SD units of the variable of interest in the control group. We also include the 95% confidence interval, the standard p-value for two-tailed null hypotheses, and the Romano–Wolf stepdown p-values adjusted for multiple hypotheses testing for the specific group of hypotheses presented in each table. The Romano–Wolf procedure was performed using 2,500 bootstrap replications and clustering by town.
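The logic of the step-down adjustment can be sketched as follows. This is an illustrative implementation of the resampling-based step-down algorithm in the spirit of Romano and Wolf (2016), not the code used for the tables. It assumes that the observed t-statistics and a matrix of null-centered bootstrap t-statistics (one row per replication, obtained by resampling towns) have already been computed; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def romano_wolf_pvalues(t_obs, t_boot):
    """Step-down adjusted p-values from bootstrap t-statistics.

    t_obs: observed t-statistics, one per hypothesis (length S).
    t_boot: null-centered bootstrap t-statistics, shape (B, S).
    """
    t_obs = np.abs(np.asarray(t_obs, dtype=float))
    t_boot = np.abs(np.asarray(t_boot, dtype=float))
    n_boot, n_hyp = t_boot.shape
    order = np.argsort(-t_obs)  # most significant hypothesis first
    p_adj = np.empty(n_hyp)
    running_max = 0.0
    for step, j in enumerate(order):
        # Max statistic over the hypotheses not yet dealt with.
        remaining = order[step:]
        max_stat = t_boot[:, remaining].max(axis=1)
        p = (1 + np.sum(max_stat >= t_obs[j])) / (n_boot + 1)
        # Enforce monotonicity of the adjusted p-values.
        running_max = max(running_max, p)
        p_adj[j] = running_max
    return p_adj

# Toy example with three hypotheses and four bootstrap replications.
t_obs = [3.0, 0.5, 1.5]
t_boot = [[1.0, 1.0, 1.0],
          [2.0, 0.0, 0.0],
          [4.0, 0.0, 0.0],
          [0.0, 0.6, 0.0]]
p_values = romano_wolf_pvalues(t_obs, t_boot)
print(p_values)  # [0.4 0.6 0.4]
```

Because each step compares a statistic with the bootstrap maximum over all remaining hypotheses, the adjusted p-value is never smaller than the corresponding unadjusted bootstrap p-value.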

5.1. Main Impacts

In Table 5, we report the average impacts of the intervention on the Bayley-III factor for a summary measure of overall development; the ASQ:SE for socio-emotional development; and the height-for-age Z-score for nutritional status. In subsequent tables, we present results for more disaggregated measures of these outcomes. Impacts are computed regardless of whether children actually attended the program or how many times they attended; that is, these are intention-to-treat (ITT) estimates obtained by Ordinary Least Squares (OLS) estimation of equation (1).

Table 5.

Impact on children’s outcomes.

	Impact (95% CI)	p-value	RW p-value
Bayley-III factor	0.163\|$^{**}$\|	0.015	0.047
	(0.035, 0.290)
ASQ:SE total score	0.021	0.722	0.704
	(-0.096, 0.139)
Height for age Z-score	0.078	0.190	0.317
	(-0.038, 0.195)

Notes. 95% confidence intervals in parentheses for two-tailed tests. Standard errors clustered by town. Covariates included: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, department and interviewer fixed effects, and baseline weight-for-age and height-for-age Z-scores. Bayley-III factor is a factor score of the five age-standardized Bayley-III scales. ASQ:SE total score is the age-standardized ASQ:SE score.

|$^{**}$|p|$<$| 0.05 based on Romano–Wolf adjusted p-values (RW, Romano and Wolf 2005, 2016), as we consider three simultaneous hypotheses for children’s outcomes.

The effect of the program on the Bayley-III factor was 0.163 SD, and it is statistically significant at the 5% level after adjusting for multiple hypotheses testing for the three primary outcomes in the table. We find no significant average impact of the program on socio-emotional development or height-for-age Z-scores. Socio-emotional development is part of the set of potential outcome variables, as the program also aimed at training mothers in sensitive and responsive parenting and appropriate behavior management. However, the curriculum had a stronger focus on cognition and language through the demonstration and practice of specific activities, which might explain the lack of effect on socio-emotional development.²⁴ We discuss the results on nutritional status further below.

As mentioned, the impacts in Table 5 are measured in terms of SD of the outcome of interest in the control group. An alternative meaningful metric would be the fraction of the gap in the outcome of interest that the estimated impact represents in a reference population. To perform such an exercise, we use a subsample of children analyzed by Rubio-Codina et al. (2015). The authors considered a sample of about 1,400 children aged 6–36 months living in families representative of the bottom 85% of the wealth distribution in Bogota and estimated a difference in the Bayley-III cognitive scale of about 0.8 SD between those in the top and the bottom 25% of such a wealth distribution, which corresponds roughly to the 17th and the 68th percentile of the entire population in the city. To make the Bogota and the FAMI samples comparable, we estimated a factor model using both samples simultaneously, but limiting the Bogota sample to children of the same age as the FAMI children. We used the Bayley-III cognitive scale, available in both samples, as an anchor and imposed a loading factor normalized to one. We find that the developmental levels of FAMI children are similar to those of children in the bottom 10% of the Bogota sample, and the impact of the intervention is equivalent to closing the gap between children in the top and bottom wealth decile by 23%.

The size of these effects is not negligible, especially if we take into account that the intervention lasted on average no more than 45 weeks and attendance was incomplete (77.5% attended at least one session). It also compares favorably to the impact of nearly 0.26 SD obtained in Attanasio et al. (2014), which was a one-on-one weekly home visiting program that lasted for 18 months with very high compliance rates.

The Role of Attrition.

As discussed earlier, there has been some attrition, which is differential between the treatment and control groups, even conditional on observables. To assess the possible bias caused by this, we estimate a selection model where attrition is a function of baseline characteristics as well as indicators for the identity of the interviewers assigned to households at baseline and follow-up. The identity of the interviewers explains attrition, presumably because of differing quality among them. Furthermore, as interviewers were allocated randomly across towns, making their identity orthogonal to individual characteristics, their identity is a valid instrument. We also need to assume that the identity of the interviewers is unrelated to children’s outcomes, which is reasonable since those administering the Bayley-III test were different people from the interviewers collecting the household survey. The attrition equation is estimated jointly with the outcome equation. The results are reported in Table G.1 in Online Appendix G and show that our conclusions are not sensitive to correcting for such non-random attrition.

5.2. ToT and Dosage Effects

ToT Effects.

Since non-compliance with the program is one-sided, we can use instrumental variables to identify the effect of treatment on the treated (ToT), using the random assignment to treatment as an instrument. There are, however, many different ways of thinking of the intensity of the program. If we measure effective participation as the fraction of children who attended at least one of the pedagogical activities of the program (i.e. a group session or a home visit), which is 77.5%, then the ToT on the Bayley-III factor is 0.21 SD. If, instead, we measure effective participation as the fraction of children in the treatment group who attended at least the unconditional median number of sessions (i.e. 21 out of 55 total), which is 53.2%, the ToT on the Bayley-III factor is 0.30 SD. Finally, if we define effective participation as the fraction of children who attended the median number of pedagogical activities conditional on having attended at least one (i.e. 28 sessions), which is 38.6%, then the ToT effect is 0.42 SD.²⁵ Thus, the potential effects are large even for a reasonably short intervention delivered in groups. To realize this potential through higher compliance, we would need to improve our understanding of the factors that drive attendance and of whether parents misperceive the returns of the program in terms of child development. This is a key area of further research.
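With one-sided non-compliance and no covariates, the IV estimate reduces to the ITT effect divided by the take-up rate. The sketch below applies that ratio to the figures quoted in the text. It is an approximation for illustration only: the function name `tot_from_itt` is hypothetical, and the reported ToT effects come from IV regressions that include covariates, so they need not coincide with the simple ratio to the second decimal.

```python
def tot_from_itt(itt, take_up):
    """Rescale an ITT effect by the take-up rate (one-sided non-compliance)."""
    if not 0.0 < take_up <= 1.0:
        raise ValueError("take_up must lie in (0, 1]")
    return itt / take_up


ITT_BAYLEY = 0.163  # ITT impact on the Bayley-III factor, Table 5

# Participation definitions and treatment-group shares quoted in the text.
definitions = {
    "attended at least one activity": 0.775,
    "attended at least 21 sessions": 0.532,
    "attended at least 28 sessions": 0.386,
}

for label, share in definitions.items():
    print(f"{label}: ToT = {tot_from_itt(ITT_BAYLEY, share):.2f} SD")
```

The stricter the definition of effective participation, the smaller the share of compliers and the larger the implied effect per participating child.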

Dosage Effects.

By the time follow-up data were collected, the FAMI intervention had been running for about 10 months. This short interval was dictated by budgetary considerations. As discussed in Section 2, the intervention involved training the FAMI mothers for about 3.5 weeks. The trainers, divided into several groups, covered all the treatment towns in about 2 months. The end-line data collection itself extended for about 2 months. The combination of these two factors meant that, by the time the outcomes were measured, the potential intervention dosage that children could be exposed to in the various treatment communities varied considerably, between 34 and 58 weeks. We define the potential dosage of the intervention as the number of sessions that could have been attended between the date on which the training was completed and the date on which the children were assessed at end-line, divided by 100. For the control sample, dosage is fixed at 0. As this measure of dosage was determined by logistical considerations, it is very likely to be uncorrelated with unobserved determinants of child development, and thus we assume it is exogenous.

To corroborate this assumption, we test whether dosage correlates with a number of village variables within the treatment group. The results do not show any discernible correlation (see Table F.4 in Online Appendix F). Furthermore, we add to the observable controls in equation (1) a variable that measures the difference in days between follow-up and baseline data collection rounds. This difference was also driven by similar logistical considerations but does not correlate with our measure of dosage.

Given this evidence, we modify equation (1) in the following fashion:

$$\begin{equation} Y_{isl,1}= \beta _0 + \beta _1{Dos}_{sl} + \delta ^{\prime } X_{isl,0} + F_{l,0}\sigma + D_0\theta + Z_{isl,1} \rho + \varepsilon _{isl,1} , \end{equation}$$

(2)

where |${Dos}_{sl}$| is dosage as defined above. We report the results on the Bayley-III factor as the outcome of interest in Table 6.

Table 6.

Effects of potential dosage on Bayley-III factor.

	Potential dosage (standard error)	Effect of average potential dosage (p-value)
Bayley-III factor	0.209\|$^{**}$\|	0.169\|$^{**}$\|
	(0.079)	(0.010)

Notes. Standard errors clustered by town. Covariates included: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, department and interviewer fixed effects, baseline weight-for-age and height-for-age Z-scores, and the difference in days between baseline and follow-up data collections. In the treatment group, the potential dose varies from 34 to 58 weeks.

|$^{**} p < 0.05$|⁠.

The estimates show a positive and significant effect (with a p-value of 0.010) of dosage equivalent to an increase of 0.209 SD in cognitive development for every 100 additional sessions. In the last column of the table, we report the impact implied by these results for the average dosage received by children in the treatment group, which is estimated at 0.169. This result is consistent with the impact reported in Table 5. We also experimented with a quadratic specification for dosage. We do not find any significant non-linearity. This result is perhaps not surprising given the relatively short amount of time the intervention had been implemented at the time we collected follow-up data.
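Because the specification is linear in dosage, the two numbers in Table 6 are linked by simple arithmetic. The sketch below inverts that relation. The function names are hypothetical, and the implied average dosage is a by-product of the two published coefficients under the linear specification rather than a figure reported in the text.

```python
BETA_PER_100_SESSIONS = 0.209     # Table 6, first column
EFFECT_AT_AVERAGE_DOSAGE = 0.169  # Table 6, last column


def effect_at(sessions, beta=BETA_PER_100_SESSIONS):
    """Predicted effect in SD for a given number of potential sessions."""
    return beta * sessions / 100.0


def implied_sessions(effect, beta=BETA_PER_100_SESSIONS):
    """Invert the linear relation: sessions consistent with a given effect."""
    return 100.0 * effect / beta


average_sessions = implied_sessions(EFFECT_AT_AVERAGE_DOSAGE)
print(f"Implied average potential dosage: {average_sessions:.0f} sessions")
```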

5.3. Heterogeneous Impacts

In this subsection, we look at heterogeneity in impacts. As mentioned in Section 4, we consider both unobserved heterogeneity and heterogeneous impacts by observable variables, such as wealth and maternal education.

Unobserved Heterogeneity.

Figure 1 reports the distribution of the Bayley-III factor and the ASQ:SE (socio-emotional skills) by treatment and control. To obtain each figure, we first regress the respective outcome on the control variables included in equation (1), and then we plot the distribution of the residuals of this regression for the treatment and the control groups separately. In the graph, we also report the p-value of the Anderson–Darling (AD) and the Kolmogorov–Smirnov (KS) tests for the null hypothesis of identical distributions by groups.²⁶

Figure 1.

Distribution of conditional outcomes by treatment status. Plot of the distribution of the residuals resulting from a regression of outcomes on observed characteristics described in equation (1), for the treatment and the control samples separately.

What is apparent from the graphs and the results of these tests is that the program had a significant impact on the Bayley-III factor (p-values = 0.010 and 0.012 for the AD and KS tests, respectively) and affected the distribution over most of its support. The results for the ASQ:SE are weaker; nevertheless, the p-value for the AD test is 0.067, suggesting some impact on the distribution.
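The comparison of residual distributions can be illustrated with the two-sample KS statistic, which is the largest vertical gap between the two empirical distribution functions. The sketch below is a hedged illustration only: the function names are hypothetical, and the permutation p-value is a stand-in chosen for simplicity that treats children as exchangeable and therefore ignores the clustering by town. It is not a description of how the tests reported in Figure 1 were computed.

```python
import random


def ks_two_sample(x, y):
    """Largest gap between the empirical CDFs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    gap = 0.0
    while i < n and j < m:
        value = min(xs[i], ys[j])
        # Advance both samples past every observation <= value.
        while i < n and xs[i] <= value:
            i += 1
        while j < m and ys[j] <= value:
            j += 1
        gap = max(gap, abs(i / n - j / m))
    return gap


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Share of random relabelings with a KS statistic at least as large."""
    rng = random.Random(seed)
    observed = ks_two_sample(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_two_sample(pooled[:len(x)], pooled[len(x):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

In this setting, `x` and `y` would hold the regression residuals of the treatment and control groups, respectively.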

As we saw in the descriptive analysis, 12% of the children in our sample are stunted (height-for-age |$<$| -2 SD) and 15% are at risk of stunting (-2 SD |$<$| height-for-age |$<$| -1 SD). It is well-established that stunting at this age is a good indicator of long-term malnutrition and can have long-run negative impacts on human capital development (Hoddinott et al. 2013). The program included a significant nutritional component, which, given the nature of our sample, could have both a short- and a long-term impact. While Table 5 did not show significant impacts on height-for-age, the third graph in Figure 1 shows a more nuanced picture and significant impacts on the distribution of height-for-age (p-values = 0.050 and 0.075).

We pursue this in Table 7, where we assess the impacts on different parts of the distribution of height-for-age. The results indicate that the fraction of children whose height-for-age was below -1 SD decreased by 6.8 percentage points or 0.15 SD, while the fraction of children with normal height-for-age (between -1 SD and 1 SD) increased by a similar amount (7.6 percentage points). Both results are statistically significant at the 5% level, even after adjusting the p-values for multiple testing, and point to the value of considering the entire distribution. This result is important because it has often proven difficult to affect height-for-age through less intensive interventions (Bernal 2015).

Table 7.

Impacts on height-for-age by ranges of the distribution.

	Impacts (95% CI)	p-value	RW p-value
Pr(Height-for-age between –5 SD and –1 SD)	\|$-0.068^{**}$\|	0.024	0.044
	(⁠\|$-0.126,-$\|0.010)
Pr(Height-for-age between –1 SD and 1 SD)	0.076\|$^{**}$\|	0.013	0.033
	(0.017, 0.134)
Pr(Height-for-age between 1 SD and 5 SD)	-0.001	0.950	0.955
	(-0.025, 0.023)
Observations	Treatment: 559	Control: 632

Notes. Impacts measure the change in the probabilities considered in each row in a linear probability model. Standard errors clustered by town. Covariates: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, department and interviewer fixed effects, and baseline weight-for-age and height-for-age Z-scores.

|$^{**} {p} < 0.05$| based on Romano–Wolf adjusted p-values (RW, Romano and Wolf 2005, 2016), considering all three hypotheses jointly.

Observed Heterogeneity.

We now consider how average impacts differed across key groups. This exercise can help us understand whether the intervention helped the most vulnerable and, from a policy perspective, can help improve targeting. We investigate whether the effects of the intervention on children’s development, as measured by the Bayley-III factor, varied by maternal education, child gender, and household wealth at baseline.

For each of these three baseline variables, we divided the sample into two groups: less than complete high school versus complete high school or more for maternal education; boy versus girl for child’s gender; and household wealth above or below the sample median.²⁷ The results are reported in Table 8. Impacts do not seem to vary substantially by the level of maternal education. Although the point estimate is larger for mothers with complete high school (0.176 SD vs. 0.142 SD), this difference is not significant. Turning to gender, the point estimates suggest that the intervention worked better for boys, but the difference is, again, not significantly different from zero. However, we do find that impacts vary significantly with wealth, even after correcting for multiple testing across all six hypotheses considered jointly. The effects, at 0.285 SD, are estimated to be much stronger for children living in poorer households. Moreover, the difference between the impact on children from poorer households and that on children from the higher wealth group (0.243 SD) is significant at the 10% level, with a RW p-value of 0.060.

Table 8.

Heterogeneous impacts on the Bayley-III factor by child and household characteristics at baseline.

Group (N)	Impact	RW p-value	Estimated difference	RW p-value (difference)
Maternal education \|$\ge$\| complete high school (N = 660)	0.176\|$^{*}$\|	0.072	0.034	0.757
Maternal education \|$<$\| complete high school (N = 632)	0.142	0.234
Male (N = 673)	0.198\|$^{*}$\|	0.077	0.074	0.717
Female (N = 619)	0.125	0.231
Wealth index above the median (N = 657)	0.042	0.592	\|$-0.243^{*}$\|	0.060
Wealth index below the median (N = 635)	0.285\|$^{***}$\|	0.008

Notes. Heterogeneous effects are estimated by subsample. The estimated difference is a cross-model test on the ITT parameter. Covariates: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, department and interviewer fixed effects, and baseline weight-for-age and height-for-age Z-scores. Romano–Wolf stepdown p-values for the six multiple hypotheses for the impact and three hypotheses for the differences in the last column.

|$^{*}$|p|$<$| 0.10, |$^{***}$|p|$<$| 0.01 based on Romano–Wolf adjusted p-values (RW, Romano and Wolf 2005, 2016).

This result is key and contains both a positive and a negative message. The intervention can indeed improve the outcomes of the most deprived group in this already poor population. However, the better-off children in this population are in no way “well-off” or middle class, and their development does not measure up well even against the Bogota middle class, let alone international standards. Hence, the intervention would need to improve for this group. More generally, these results highlight the difficulty of improving ECD programs for broad populations, so targeting interventions to the needs of separate groups is likely to be important. No significant heterogeneous effects were found for socio-emotional or nutritional outcomes.

Lastly, we investigate whether intervention impacts varied by quality of implementation and FAMI mother characteristics. We do not find any significant differences in impacts by any of the measures of implementation fidelity available, nor by FAMI mother’s age or education. The only variable for which we find some marginally significant differences in impact is a measure of FAMI mother’s “motivation”, as assessed by the tutors: children who attended centers run by a FAMI mother reported to be more “motivated” than the median registered a higher impact (0.22 SD vs. 0.07 SD). This 0.15 SD difference is significant at the 10% level, with a p-value of 0.099.

6. Understanding the Impacts

In this section, we study possible mechanisms that could have generated the documented impacts on final outcomes. We start by estimating the impact of the intervention on a number of inputs that are relevant for child development, following Heckman, Pinto, and Savelyev (2013). We then take a structural approach to estimate the causal link between the relevant inputs we consider and child development, taking into account the possible endogeneity of the former, through a production function framework similar to that in Cunha and Heckman (2008), Cunha, Heckman, and Schennach (2010), and Attanasio et al. (2020).

6.1. Effects on Intermediate Outcomes and Mediation Analysis

The intervention we are studying is a transfer in kind of early education and nutritional supplementation. As with other transfers in kind, the intervention can induce parents to change their contributions to their child’s development in other dimensions. The food supplement delivered by the intervention we are evaluating could be clawed back by reducing other food inputs to the target child, or perhaps by sharing it within the family or even selling it; and the additional stimulation received by the target children could cause parents to switch attention to other children or to themselves, therefore mitigating the intervention’s impact. On the other hand, it is also possible that low-income parents are not fully aware of the returns to investing in their children (Cunha, Elo, and Culhane 2013; Attanasio, Cunha, and Jervis 2019), so that the effects of the intervention may have been generated by an increase in investment induced by a change in these beliefs. Therefore, there are also good reasons to believe that, instead of crowding out, the intervention could have led to a crowding in of resources. In this case, adding to the transfer from the intervention may have particularly high returns. Indeed, Attanasio et al. (2020) evaluate another early years stimulation intervention in Colombia and show that, in response to it, parents crowd in resources by increasing investments. Exploring the mediating factors and the mechanisms underlying intervention impacts is a way of obtaining answers to some of these questions. Moreover, understanding these is critical to improving the design and targeting of public policies.

We start by presenting, in Table 9, the effects of the program on the intermediate outcomes described in Section 3.1. The first row reports the impact of the intervention on parental investment, estimated from the FCI index, which captures the quality of the home environment by combining information on books, magazines and newspapers, play activities, and play materials in the home (see Online Appendix D). The following rows assess impacts on maternal knowledge about child development, maternal self-efficacy, and food insecurity. Maternal knowledge and self-efficacy as potential mediators capture the idea that, through the intervention, parents (mothers, in particular) might become more effective in their childrearing practices.

Table 9.

Program impacts on intermediate outcomes.

	Impact as fraction of SD in control group (95% CI)	p-value	RW p-value
Parental investment	0.340\|$^{***}$\|	0.000	0.000
	(0.207, 0.472)
Maternal knowledge (raw score)	-0.016	0.831	0.828
	(-0.160, 0.128)
Maternal self-efficacy (raw score)	0.039	0.604	0.828
	(-0.108, 0.186)
ELCSA food insecurity status	-0.089	0.220	0.496
	(-0.231, 0.052)

Notes. |$^{*} p < 0.10$|, |$^{**} p < 0.05$|, |$^{***} p < 0.01$| based on Romano–Wolf adjusted p-values (RW, Romano and Wolf 2005, 2016), considering all four hypotheses jointly. 95% confidence intervals in parentheses for two-tailed tests. OLS estimation; standard errors clustered by town. Impacts are measured in terms of SD of the control group. Covariates: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, and department and interviewer fixed effects. Parental investment is measured by a factor model estimated using the subscales of FCI Home Environment Quality, as discussed in Online Appendix D.

The impact on the quality of the home environment was 0.34 of a SD in the control group and statistically significant, with a p-value below 0.001. This is a strong result and indicates that the intervention induces parents to invest more in their children. However, we do not find any statistically significant program effects on maternal knowledge about child development, maternal self-efficacy, or food insecurity.²⁸

6.2. A Structural Interpretation of the Impacts: Production Function Estimates

Given the results on intermediate outcomes, we proceed to estimate a model where child development is determined by a production function, which depends on parental investment and other background variables. Both child development and parental inputs are represented by latent variables, which are not observed directly but for which we have informative markers that allow us to estimate them by factor analysis. Given the evidence in Table 9, the sole mediator we consider for child development is parental investment. This approach is similar to that of Heckman, Pinto, and Savelyev (2013). However, here, following Attanasio et al. (2020), we also consider the possible endogeneity of parental investments.

Table 10.

IV estimation of the production function for Bayley-III factor.

	OLS		First stage	IV
	Bayley-III factor		Parental investment	Bayley-III factor
	(1)	(2)	(3)	(4)	(5)
Treatment (T)	0.135\|$^{**}$\|	0.079	0.294\|$^{***}$\|	0.006
	(0.065)	(0.065)	(0.068)	(0.110)
Parental investment (PI)		0.185\|$^{***}$\|		0.467\|$^{*}$\|	0.454\|$^{***}$\|
		(0.036)		(0.249)	(0.171)
Time to town hall	-0.099\|$^{***}$\|	-0.079\|$^{***}$\|	-0.040	-0.048	-0.049
	(0.027)	(0.028)	(0.030)	(0.043)	(0.037)
Time to FAMI			-0.143\|$^{***}$\|
			(0.035)
First stage F-statistics
IV: time to FAMI			16.86
IV: time to FAMI and treatment			19.15
Overidentification p-value					0.956
N	1,292	1,292	1,292	1,292	1,292

Notes. Standard errors, clustered by town, in parentheses. Covariates: child’s gender, an indicator of high household wealth index, maternal PPVT score, teenage mother, an indicator of high municipality population, previous attendance at a childcare center, department and interviewer fixed effects, and baseline weight-for-age and height-for-age Z-scores.

|$^{*} p < 0.10$|⁠, |$^{**} p < 0.05$|⁠, |$^{***} p < 0.01$|⁠.

We estimate a production function for human capital development, which we assume to be a function of parental investment, several other environmental factors, and, potentially, the intervention itself. In particular, we assume that child development can be expressed by the Cobb–Douglas production function:

$$\begin{equation} \ln ({\textit {CD}}_{isl}) = \gamma _0 + \gamma _1 \ln (\textit {PI}_{isl}) + \gamma _2 T_{sl} + \delta ^{\prime } X_{isl} + F_l \sigma + D \theta + Z_{isl} \rho + u_{isl}, \end{equation}$$

(3)

where |${\textit {CD}}_{isl}$| is the child development latent variable and |${\textit {PI}}_{isl}$| represents the parental investments latent variable, both estimated by the factor model described in Online Appendix D and used to estimate the reduced form impacts in Tables 5 and 10. In equation (3), the treatment allocation |$T_{sl}$| can affect child development both directly and through its impact on parental investments (PI). The covariates |$X_{isl}$| include the child’s gender, household wealth, maternal PPVT score, a dummy variable for teenage mothers, and distance to the municipality’s Town Hall to capture unobserved differences in household socio-economic condition. We also control for baseline childcare attendance and municipal population. Earlier studies also controlled for lagged child development. However, as explained, we did not collect baseline developmental outcomes since the children were too young to obtain a precise measure with the resources we had available. Instead, we control for the child’s nutritional status at baseline, namely, height-for-age and weight-for-age. As before, D represents department fixed effects, and |$Z_{isl}$| is the vector of tester fixed effects. Finally, |$u_{isl}$| represents unobservable factors determining child development, including shocks experienced by the child and additional inputs not observed by the researchers but possibly chosen by parents. The Cobb–Douglas assumption is consistent with the evidence in Cunha, Heckman, and Schennach (2010) and in Attanasio et al. (2020), who performed a similar analysis on another early stimulation intervention in Colombia delivered through home visits rather than group sessions.

The main challenge in estimating the parameters in equation (3) is the fact that parental investment |${\textit {PI}}_{isl}$| is likely to be endogenous, as the parents might be reacting to shocks experienced by the child or might choose investment jointly with other inputs. While the treatment is exogenous by construction, since it is assigned randomly across communities, it is not necessarily a valid exclusion restriction because it can have an independent effect on the outcome. Indeed, a question we pose is whether the treatment affects child development directly or whether its impact is mediated by parental investment. To answer this question, we need to establish the causal link between investment and child development. We therefore need an instrument, |$W_{isl}$|⁠, that affects parental investment while not affecting child development directly. For this purpose, we use the travel time from the household residence to the FAMI center. To control for differences between households that are centrally located versus households that live in more outlying areas (that could differ in unobservable dimensions), we control for distance to the Town Hall when estimating equation (3) by instrumental variables (IV). Therefore, we estimate a first-stage investment equation of the form:

$$\begin{equation} \ln (\textit {PI}_{isl})=\pi _0 + \pi _1T_{sl} + \pi _2W_{isl} + \gamma ^{\prime } X_{isl} + v_{isl}, \end{equation}$$

(4)

where the covariates |$X_{isl}$| are the same as those in the production function in equation (3).
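The two-step logic of equations (3) and (4) can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration and is not the authors' code or data: the data-generating process, the coefficient values (loosely patterned on Table 10), and the helper names `solve`, `ols`, `simulate`, and `estimate` are hypothetical. Variables are treated as already in logs, fixed effects and most covariates are omitted, and only point estimates are computed, since a manual second stage does not deliver valid standard errors.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate(n=20000, seed=0):
    """Hypothetical data: investment responds to treatment, travel time, and shocks."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = 1.0 if rng.random() < 0.5 else 0.0   # randomized treatment
        hall = rng.gauss(0, 1)                    # time to town hall (control)
        fami = 0.5 * hall + rng.gauss(0, 1)       # time to FAMI (instrument)
        shock = rng.gauss(0, 1)                   # unobserved shock to the child
        # Assumed behavior: parents compensate, so an adverse shock raises investment.
        pi = 0.3 * t - 0.5 * fami - 0.5 * shock + rng.gauss(0, 1)
        # Assumed production function: no direct treatment effect, true PI effect 0.45.
        cd = 0.45 * pi - 0.05 * hall + shock + rng.gauss(0, 0.5)
        data.append((t, hall, fami, pi, cd))
    return data


def estimate(data):
    t, hall, fami, pi, cd = (list(col) for col in zip(*data))
    # OLS production function: CD on constant, PI, T, time to town hall.
    b_ols = ols([[1.0, p, ti, h] for p, ti, h in zip(pi, t, hall)], cd)
    # First stage: PI on constant, T, time to FAMI, time to town hall.
    z = [[1.0, ti, w, h] for ti, w, h in zip(t, fami, hall)]
    first = ols(z, pi)
    pi_hat = [sum(c * v for c, v in zip(first, row)) for row in z]
    # Second stage: replace PI with its first-stage fitted value.
    b_iv = ols([[1.0, p, ti, h] for p, ti, h in zip(pi_hat, t, hall)], cd)
    return {
        "ols_pi": b_ols[1],
        "first_stage_fami": first[2],
        "iv_pi": b_iv[1],
        "iv_t": b_iv[2],
    }


result = estimate(simulate())
print(result)
```

In this simulated setting, the OLS coefficient on investment is biased toward zero because investment is negatively correlated with the unobserved shock, while the IV coefficient recovers the assumed value. A real application would rely on an econometrics package that also provides cluster-robust standard errors.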

In the first column of Table 10, we report the treatment effect on the Bayley-III factor, estimated by OLS, and in the second column, we introduce parental investment, also using OLS. The coefficient on treatment falls from 0.135 to 0.079 and is no longer statistically different from zero, suggesting that, if the OLS assumptions are valid, the impact is mediated by parental investments (although we cannot necessarily ignore the coefficient on treatment because it remains quite large, albeit imprecisely estimated).

In the third column of Table 10, we report the estimates of the investment equation coefficients |$\pi _1$|, associated with treatment allocation, and |$\pi _2$|, associated with travel time to FAMI, which serves as an instrument when we estimate the production function shown in the subsequent columns. The instrument is strongly significant, even conditional on distance to the Town Hall, which is intended to capture how centrally the household is located. Importantly, the F-statistic is large enough to rule out a weak instrument problem, whether or not treatment is used as an additional exclusion restriction (see bottom of column (3)).

In the fourth column of Table 10, we re-estimate the production function, as in column (2) but using IV. These estimates show a much higher impact of investments and a zero direct effect of treatment: The point estimates imply that the entire effect of treatment is driven by an increase in parental investments through the intervention. The difference between the investment coefficients in columns (2) and (4) from 0.185 to 0.467 is significant at the 10% level and consistent with the results reported in Attanasio et al. (2020), where the coefficient in the production function of child development also increased considerably after accounting for the endogeneity of parental investment. This suggests that parents are compensating for negative shocks when choosing an investment.
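As a rough consistency check on this reading, the point estimates in Table 10 can be combined by substituting equation (4) into equation (3), so that the total effect of treatment is the direct coefficient |$\gamma _2$| plus the investment coefficient |$\gamma _1$| times the first-stage coefficient |$\pi _1$|. This is a back-of-the-envelope calculation that ignores sampling error:

$$\begin{equation*} \hat{\gamma }_2 + \hat{\gamma }_1 \hat{\pi }_1 = 0.006 + 0.467 \times 0.294 \approx 0.006 + 0.137 = 0.143. \end{equation*}$$

The implied total of about 0.143 is close to the OLS treatment effect of 0.135 in column (1), and nearly all of it comes from the investment term.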

Given this last consideration, in the fifth column of Table 10, we remove the intervention from the production function (3). Now the coefficient on investment is 0.454, and it is significant at the 1% level. The model is now overidentified, as we have two instruments (travel time to FAMI and treatment allocation) for the single endogenous variable, |${\textit {PI}}_{isl}$|. When testing the implied overidentifying restriction, we do not reject the null hypothesis of correct specification (p-value of 0.956).
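The overidentification check can be sketched on simulated data as follows. This is an illustrative assumption throughout: it uses Sargan's homoskedastic statistic (the sample size times the R-squared from regressing the IV residuals on the instruments), which may differ from the statistic behind the p-value in Table 10, and the data-generating process and the helper names `solve`, `ols`, `fitted`, and `sargan` are hypothetical. With two instruments and one endogenous variable, the statistic has one degree of freedom.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def fitted(rows, coef):
    return [sum(c * v for c, v in zip(coef, row)) for row in rows]


def sargan(direct_effect, n=20000, seed=1):
    """Return (J, p-value) when treatment is excluded from the outcome equation."""
    rng = random.Random(seed)
    z, pi, cd = [], [], []
    for _ in range(n):
        t = 1.0 if rng.random() < 0.5 else 0.0   # randomized treatment
        fami = rng.gauss(0, 1)                    # time to FAMI
        shock = rng.gauss(0, 1)                   # unobserved shock to the child
        p = 0.3 * t - 0.5 * fami - 0.5 * shock + rng.gauss(0, 1)
        z.append([1.0, t, fami])                  # instruments: constant, T, time to FAMI
        pi.append(p)
        cd.append(0.45 * p + direct_effect * t + shock + rng.gauss(0, 0.5))
    # 2SLS point estimates with both instruments.
    pi_hat = fitted(z, ols(z, pi))
    beta = ols([[1.0, ph] for ph in pi_hat], cd)
    # Structural residuals use observed investment, not its fitted value.
    resid = [y - beta[0] - beta[1] * p for y, p in zip(cd, pi)]
    # Sargan statistic: n times the R-squared of the residuals on the instruments.
    explained = fitted(z, ols(z, resid))
    mean_resid = sum(resid) / n
    r2 = (sum((e - mean_resid) ** 2 for e in explained)
          / sum((e - mean_resid) ** 2 for e in resid))
    j = n * r2
    p_value = math.erfc(math.sqrt(j / 2))  # chi-square survival function, 1 d.f.
    return j, p_value


j_valid, p_valid = sargan(direct_effect=0.0)      # exclusion restriction holds
j_invalid, p_invalid = sargan(direct_effect=0.3)  # treatment also acts directly
print(j_valid, p_valid, j_invalid, p_invalid)
```

When treatment has a direct effect in the simulated data, the two instruments imply different investment coefficients and the test rejects. A failure to reject, as in column (5), is therefore consistent with treatment operating only through parental investment, though it does not prove it.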

7. Discussion and Conclusions

Interventions that promote ECD, starting from birth, may well be the key to successful human capital policies, particularly in poor environments. However, the characteristics and effectiveness of such programs at scale are not yet well understood. In recent years, many early years interventions have been implemented worldwide, but effective and sustainable programs at scale are rare. Furthermore, many institutionalized initiatives are of low quality (Lo, Das, and Horton 2016). Scaling up is not only a question of funds, but also of the available human resources in a variety of different contexts. A possible approach to deploying early years interventions at scale is to determine whether existing large-scale programs (and their infrastructure) can be successfully improved, so as to guarantee the quality required for them to have significant impacts on children.

In this study, we present results from an experiment where we designed and implemented a scalable intervention that was added to an existing government group-based parenting support intervention, combined with nutritional supplementation. Effectively, the intervention we study is an improvement of an existing national program, consisting of incorporating structured content (curriculum of activities) and training and coaching for program facilitators, as well as nutrition education and a larger and higher quality nutritional supplement. As we have discussed, this design offers a directly scalable policy, both in terms of its costs and in its implementability, given the existing infrastructure and human resources. We should stress that we are not evaluating the impact of FAMI as it exists or of our intervention compared to a situation with no program. As we have mentioned, FAMI has existed for many years, and a direct evaluation of it does not exist and would be difficult, if not impossible, to perform. On the other hand, we think that our exercise is useful and relevant for the current policy debate, which is considering improvements and not the abolition of FAMI.

Our curriculum is an adaptation of RU, a home visitation program shown to be effective in altering the long-run cognitive trajectory of children from deprived environments in its original implementation in Jamaica (Walker et al. 2011; Gertler et al. 2014). Adaptations of the curriculum to a variety of contexts and countries have also had positive impacts on developmental outcomes (see Grantham-McGregor and Smith (2016) for a review).

Evaluation of group-based adaptations of RU and other parenting programs is, however, more limited. Yet, they represent a promising and natural low-cost approach to improving outcomes in vulnerable populations in a more efficient manner as delivery is less intensive in human resources. Furthermore, while the delivery of the RU curriculum in groups might imply a reduced focus on the specific needs of an individual child, well-run groups might induce positive effects by improving existing networks and acquaintances and provide role models for some mothers.

The fact that we find reasonably sized positive impacts in the short time span covered by our data collection is important. In practice, the intervention would last longer, and children would hopefully graduate into pre-schools where they could gradually build up their abilities and school readiness, thus addressing one key cause of poverty persistence. The evidence we present also points to potentially large gains where they are most needed, namely, among the poorest. The importance of these results is even more apparent if we consider the fact that compliance with the number of sessions actually attended by children and their caregivers was relatively low and the intervention was relatively short, at least in comparison with the most successful efficacy trials referred to in this study. And yet our intervention had an ITT effect of 16% of a SD and a ToT effect of up to 42% of a SD in development. Moreover, there was a reduction of 5.8 percentage points in the fraction of children whose height-for-age was below -1 SD.

Some features of this particular study make us believe that these estimates are lower bounds of the potential of this intervention. First, the control group had access to the basic program, without the improved intervention, unlike similar studies in the literature in which the control group did not receive any intervention. Second, as stressed, the average impact reflects larger impacts for the children most in need and a small or null impact for the better-off children. Third, and most importantly, it was not possible to fully control and enforce the many relevant implementation aspects that might be needed to ensure fidelity of the intervention and impact development.²⁹ In fact, the implementation of the intervention was far from smooth and faced various challenges. Examples of the problems encountered include the low duration of participant exposure to the program, logistical difficulties for the delivery of pedagogical materials and the nutritional supplements in complicated rural geographies, heterogeneity in the fidelity of program implementation, and initial resistance of program providers to change their behavior. The implementation problems we document in our context are common to many programs implemented at scale.

The focus on scalability is one of the most salient aspects of this study and reflects the difficulties policy makers face when moving from small trials to larger studies with reduced control over what actually happens in the field. As we suggest above, when an intervention is scaled up, one needs to consider not only financial costs but also the possibility of sustaining the quality of implementation given the existing service infrastructure. On the latter, we notice that our intervention was implemented on top of an existing program, with minimal involvement on the part of the research team. Our results indicate that despite a number of implementation problems, which were in part present because we wanted to work with a model that could be reproduced at scale, the enhancement we evaluated had a sizeable effect on the children most in need. However, we do recognize that it is not obvious that a scaled-up intervention could maintain the level and quality of training and mentoring that were achieved during the study, although we stress that the evaluation did not use personnel with special qualifications. In any case, it is clear that proper mentoring should be developed with care.

Regarding the financial cost of the intervention, we notice that the cost of the pedagogical component of the intervention was |${\$}$|115 US per child per year (|${\$}$|27 US for pedagogical materials and |${\$}$|88 US for coaching) plus a one-off cost of |${\$}$|11 US per child for FAMI pre-service training. At scale, there could be important economies of scale in the mentoring system, by far the largest component of the total pedagogical cost, which could reduce these figures substantially. The cost of the additional nutritional supplementation was |${\$}$|209 US per child per year. By the end of this study, the Colombian government adopted the nutritional supplementation evaluated herein nationwide, with an investment of |${\$}$|10 million US. The pedagogical component corresponds to 40% of the operational cost of the unenhanced version of the FAMI program, equivalent to 1.7 monthly minimum wages per year. In contrast, center-based childcare services cost |${\$}$|1,100 US per child per year. The transition to large childcare centers, which has been one of the centerpieces of recent government policy, costs |${\$}$|780 US per child per year, more than twice the cost of the intervention we are studying. Therefore, the cost of our intervention is moderate, especially in comparison to other ECD programs in the country, and financially sustainable.

As we stressed above, the impacts of the intervention we evaluated are relative to a status quo where children of the same age were receiving an unimproved program. To interpret these results, it is useful to put them in the context of the quality of other public early-years services in Colombia. Bernal (2013) presents a diagnostic of public childcare quality by modality, using standard measures. Quality levels are low for all modalities, close to minimum standards. This pattern is also found in other Latin American countries. Part of the problem is precisely the lack of a structured curriculum and supervision/mentoring strategies, which is what the improvement we evaluate introduces to FAMI. What we show is that scaling up services with quality is possible within an existing institutional infrastructure that allows for such coaching and mentoring strategies. The evidence we presented suggests that it is possible to gradually improve the quality of nationwide programs at scale in a way that is affordable. Ours is an enhancement of an existing program that leverages local low-skilled human resources. The intervention specifically aims at improving process quality (such as the integration of a structured curriculum and improved interactions between caregivers and children supported by coaching and mentoring), which the literature has shown to be critically associated with child developmental outcomes (Yoshikawa, Weiland, and Brooks-Gunn 2016).

A key question is whether these short-term impacts are sustained over time. Andrew et al. (2018) report that the effects on child development and parental investment documented in Attanasio et al. (2014) disappear two years after the end of the intervention. The authors mention that this result might be due to a small initial effect (similar to ours) and/or the lack of continued family support for early stimulation. The impact fade-out observed for the intervention studied by Attanasio et al. (2014) is not unique. Several studies have found that medium-term program impacts might vanish but reappear later in the child’s life-cycle (Lawrence et al. 2005).

In Attanasio et al. (2014), intervention activities ended as soon as the study ended. In our case, however, the intervention effectively kept running since an important part of it consisted of the training of the facilitators in the pre-existing program. In particular, most treated FAMI providers continued to use the curriculum even though they were no longer being coached. In addition, participants in public programs are more likely to continue to be enrolled in similar public programs as children grow. For example, children could have moved on to home-based childcare, provided through the Hogares Comunitarios program (Bernal and Fernández 2013), which could help reinforce or maintain these effects over time.

The total number of FAMI beneficiaries has decreased since 2013. However, close to 150,000 children are still part of this program. Crucially, the toolkit developed for this intervention is flexible and easily adaptable to any ECD program facilitated by paraprofessional personnel, as many are in Colombia, as well as in other developing countries. As we discuss in detail in Online Appendix B, it would be straightforward to replicate at scale the training and coaching strategy proposed in this study by leveraging the already existing monitoring and supervision infrastructure for community-based programs, including FAMI. Training professional staff in local ICBF offices would be feasible, and they could easily implement both training and coaching of FAMI and similar programs run by paraprofessional personnel.

While the pre-existing program is present everywhere in Colombia, we implemented and evaluated the improvement in Central Colombia. This choice was motivated by the fact that this region tends to be more culturally and ethnically homogeneous than other parts of Colombia, such as the coastal regions (both Pacific and Atlantic), where Afro-Colombian and indigenous populations are more likely to reside. Scale-up in these regions would likely require additional piloting and adaptation.

To conclude, we show that a scalable program can have substantial effects on child development in highly deprived populations at a low cost and based on government infrastructure. Improving the quality of large-scale programs in developing countries can form a key element of the policy toolkit for fighting poverty.

Acknowledgments

This study was funded by Grand Challenges Canada Grant (GCC) 0462-03-10 and Fundación Éxito (FE). Attanasio was partly funded by a European Research Council Advanced Grant AdG—695300. Meghir was partly funded by National Institutes of Health grant R01HD7210, the Cowles Foundation, and Institution for Social and Policy Studies at Yale. Attanasio, Bernal, Meghir and Rubio-Codina were funded by the Jacobs Foundation Marbach Residence Program in 2017, for a visit that contributed greatly to this study. This trial is registered at the ISRCTN Registry, trial no. ISRCTN93757590. The Universidad de los Andes ethics committee (no. 287/2014) and the University College London research ethics committee (no. 2168/011) approved this study. We thank the Instituto Colombiano de Bienestar Familiar (ICBF), the ICBF program supervisors, and program coordinators at FE for facilitating the intervention; the FAMI program providers, children, and families who willingly participated in the study; all the staff including nine tutors and field manager M. L. Gómez; all research staff: Santiago Lacouture, Alejandro Sánchez, Sara Ramírez, and Diana Pérez; the IQuartil data collection team; and the experts from GCC. The editor and three anonymous referees provided useful comments. The study presents the authors’ views and not those of the institutions they belong to, including the Inter-American Development Bank, its board of directors, or the countries they represent. Attanasio is a Research Associate at the NBER. Meghir is a Research Associate at the NBER and a Research Fellow at the CEPR and IZA.

Notes

The editor in charge of this paper was Paola Giuliano.

Footnotes

1.

Cunha et al. (2006), Heckman (2006), Engle et al. (2007), Doyle et al. (2009), Almond and Currie (2011), Pongcharoen et al. (2012), Shonkoff et al. (2012), and Yoshikawa et al. (2013).

2.

Moran et al. (2004) and Nowak and Heinrichs (2008) reported that the most effective parenting programs included an evidence-based curriculum, systematic training of frontline workers, and opportunities for parents to learn and practice with children.

3.

The p-values we report are adjusted for multiple testing as explained in the main body of the paper.

4.

Such as the integration of a structured curriculum and improved interactions between caregivers and children supported by coaching and mentoring.

5.

Examples from the United States include the Nurse-Family Partnership (NFP) (Olds et al. 1986a,b, 1994; Heckman et al. 2017) and the Promising Practices (Brooks-Gunn et al. 1994; McCormick et al. 2006). In LMICs, there is the well-known Jamaica home visiting model, which provided early stimulation (play-based activities) and nutritional supplementation (powdered milk) to stunted children in slums in Kingston for 24 months and obtained large impacts on ECD outcomes in the short term that translated into improved IQ and mental health (Walker et al. 2011) and higher wages (Gertler et al. 2014) in adulthood.

6.

This approach has applied to all public ECD services in the country to date. The Board for Early Childhood has emphasized the principle of curricular freedom, and national standards are intentionally broad. Program providers are expected to adapt the learning standards to their own programs.

7.

While we received authorization from the ICBF to implement and evaluate the intervention, its deployment was not publicized.

8.

This was done in two stages: an initial stage of 2 weeks and a second stage of 1.5 weeks about 2 months later, on average. More specifically, towns with fewer than five FAMI units received 75 hours of training in 3 weeks, towns with six–nine FAMI units were trained for 100–125 hours in 5–6 weeks, and towns with more than ten FAMI units received 150–175 hours of training over 6–7 weeks.

9.

Specific differences with respect to how they had typically worked were: (i) practicing play activities with mothers and their children; (ii) practicing language activities with babies; (iii) making home-made toys with mothers; (iv) encouraging parents to play with their children at home; and (v) listening to parents about their achievements at home. Almost all of them (99%) reported that they would continue to use the proposed curriculum after the end of the project.

10.

In fact, it included more than that as it allowed for a potential consumption of up to 20% of the supplement’s nutritional content by other household members.

11.

The package contained tuna, sardines, canola oil, iron-fortified whole milk (the only micronutrient included), beans, and lentils.

12.

We could not evaluate the stimulation component alone, that is, without the nutritional component, because both were part of the original program. Dissociating them for evaluation purposes was not feasible, both logistically and ethically.

13.

This is conservative, given an average of 9.5 children younger than two per FAMI unit in the sample (plus 2.1 pregnant women); and a nationwide average of 13 (SD = 1.4; range = [10, 24]), as computed using administrative data for 2013 (before the intervention started).

14.

While a similar argument on durability could be made for the materials, experience has taught us that their depreciation rate is quite high as they are rotated among families. Hence, it is safe to assume that they do need to be replaced, approximately, on a yearly basis.

15.

The trial registration is at http://www.isrctn.com/ISRCTN93757590.

16.

This requirement is associated with the power calculations for the trial and to facilitate the logistics associated with the training and coaching carried out by the tutors, who had to travel across various towns.

17.

MF is similar to FAMI in that it serves beneficiaries through monthly home visits and weekly group meetings, but (1) it serves children 0–5 years of age, while FAMI serves children aged 0–2; (2) it has a set-up infrastructure for group meetings (a center), while FAMI uses other community spaces or FAMI’s own home; (3) it serves, on average, 45 beneficiaries as compared to close to 15 in FAMI; (4) it is led by a professional and an assistant, as compared to a single person who is not required to have a college degree in FAMI; (5) it offers a nutritional supplement five times larger than that of FAMI; and (6) it has access to a group of professionals including a psychologist and a nutritionist who support MF activities.

18.

Child development assessments and anthropometric measures were collected by testers with degrees in psychology and health, respectively. The remaining variables in the household survey were collected by regular enumerators prior to the child assessments.

19.

Specifically, we excluded 12 children who scored more than 3 SD below the mean on the Bayley-III cognitive scale (possible disability) and 15 children whose height-for-age was 6 SD below or 6 SD above the mean (extreme observations).

20.

These variables are not shown in Table 1 but correspond to the components of the FCI “parental investment” factor.

21.

Some other children in the treatment group might have dropped out of the FAMI program between baseline and the beginning of the intervention due to the time elapsed to complete the training of the FAMI mother (up to 4 months). These children, therefore, would not have attended any of the enhanced sessions.

22.

Item non-response in baseline covariates is not correlated with treatment status. Thus, we imputed missing covariate values with the average of the non-missing observations and accounted for this imputation with a dummy variable in equation (1). The exact fraction of imputed observations varies by covariate up to a maximum of 6.8%.

23.

Such a test is considered more powerful to detect differences in the tails of the distribution than the Kolmogorov–Smirnov test (Engmann and Cousineau 2011).

24.

Note also that the measures used to capture socio-emotional development might not be very precise.

25.

There is an additional complication in estimating ToT effects from the ITT impacts we report. As we mentioned above, our estimate represents the impact of the improved FAMI relative to the standard FAMI (status quo), which is attended by the children in the control group. Presumably, there are also compliance problems in the control program on which, unfortunately, we do not have data. The ToT estimate we have discussed should be interpreted as the impact of a fully compliant improved FAMI over the business-as-usual FAMI in which compliance does not change.

26.

The Anderson–Darling test focuses more on the tails of the distribution and has been shown to have greater power than alternative tests, such as the Kolmogorov–Smirnov test (Bennett 2008), which focuses on first-order dominance.

27.

The wealth index is computed as the first principal component of a number of dwelling characteristics (such as the material of walls, floors, and roofs, the number of bathrooms and rooms, access to utilities, etc.), and durable goods ownership.

28.

The effect we found is not as strong as that reported in Attanasio et al. (2014) of 0.5 SD on play materials and play activities with adults at home and resulting from a home visiting intervention in Colombia. We return to this issue in the Discussion section.

29.

FAMI providers continued to be paid and supervised by the government with no legal obligation or additional monetary incentive to participate in our program. They were strongly encouraged to do so, but they could choose not to without any practical consequence.

References

Almond, Douglas and Janet Currie (2011). “Human Capital Development Before Age Five.” In Handbook of Labor Economics, edited by Ashenfelter and Card, 1st ed., chap. 15. Elsevier, pp. 1315–1486.

Álvarez Uribe, Martha Cecilia and Instituto Colombiano de Bienestar Familiar (2008). Adaptación y validación interna y externa de la Escala Latinoamericana y el Caribe para la Medición de Seguridad Alimentaria en el Hogar -ELCSA- Colombia: componente adaptación lingüística de la ELCSA. Proyecto de Fortalecimiento a la Seguridad Alimentaria y Nutricional en Colombia—PROSEAN.

Anderson, T. W. and D. A. Darling (1952). “Asymptotic Theory of Certain “Goodness of Fit” Criteria Based on Stochastic Processes.” The Annals of Mathematical Statistics, 193–212.

Anderson, Theodore W. and Herman Rubin (1956). “Statistical Inference in Factor Analysis.” In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, pp. 111–150.

Andresen, Elena M., Judith A. Malmgren, William B. Carter, and Donald L. Patrick (1994). “Screening for Depression in Well Older Adults: Evaluation of a Short Form of the CES-D.” American Journal of Preventive Medicine.

Andrew, Alison, Orazio Attanasio, Raquel Bernal, Lina Cardona Sosa, Sonya Krutikova, and Marta Rubio-Codina (2019). “Preschool Quality and Child Development.” NBER Working Paper No. 26191.

Andrew, Alison, Orazio Attanasio, Emla Fitzsimons, Sally Grantham-McGregor, Costas Meghir, and Marta Rubio-Codina (2018). “Impacts 2 Years After a Scalable Early Childhood Development Intervention to Increase Psychosocial Stimulation in the Home: A Follow-Up of a Cluster Randomised Controlled Trial in Colombia.” PLoS Medicine, e1002556.

Araujo, M. Caridad, Marta Dormal, Sally Grantham-McGregor, Fabiola Lazarte, Marta Rubio-Codina, and Norbert Schady (2021). “Home Visiting at Scale and Child Development.” Journal of Public Economics Plus, 100003.

Attanasio, Orazio, Sarah Cattan, Emla Fitzsimons, Costas Meghir, and Marta Rubio-Codina (2020). “Estimating the Production Function for Human Capital: Results from a Randomized Controlled Trial in Colombia.” American Economic Review, 110.

Attanasio, Orazio, Flávio Cunha, and Pamela Jervis (2019). “Subjective Parental Beliefs. Their Measurement and Role.” NBER Working Paper No. 26516.

Attanasio, Orazio P., Camila Fernández, Emla O. A. Fitzsimons, Sally M. Grantham-McGregor, Costas Meghir, and Marta Rubio-Codina (2014). “Using the Infrastructure of a Conditional Cash Transfer Program to Deliver a Scalable Integrated Early Child Development Program in Colombia: Cluster Randomized Controlled Trial.” BMJ, 349, g5785.

Bayley, Nancy (2006). Bayley Scales of Infant and Toddler Development: Bayley-III. Working paper, Harcourt Assessment, Psych. Corporation.

Becker, Gary (1994). Human Capital: A Theoretical and Empirical Analysis with Special Reference to Education, 3rd ed. NBER.

Bennett, Christopher J. (2008). “Consistent Integral-Type Tests for Stochastic Dominance.” Working paper, Vanderbilt University.

Bernal, Raquel (2013). “The Cost of Early Childhood Policy in Colombia.” Working paper, Universidad de los Andes.

Bernal, Raquel (2015). “The Impact of a Vocational Education Program for Childcare Providers on Children’s Well-Being.” Economics of Education Review, 165–183.

Bernal, Raquel, Orazio Attanasio, Ximena Peña, and Marcos Vera-Hernández (2019). “The Effects of the Transition from Home-Based Childcare to Childcare Centers on Children’s Health and Development in Colombia.” Early Childhood Research Quarterly, 418–431.

Bernal, Raquel and Camila Fernández (2013). “Subsidized Childcare and Child Development in Colombia: Effects of Hogares Comunitarios de Bienestar as a Function of Timing and Length of Exposure.” Social Science & Medicine, 241–249.

Bitler, Marianne P., Hilary W. Hoynes, and Thurston Domina (2014). “Experimental Evidence on Distributional Effects of Head Start.” NBER Working paper.

Black, Maureen M., Susan P. Walker, Lia C. H. Fernald, Christopher T. Andersen, Ann M. DiGirolamo, Chunling, Dana C. McCoy, Günther Fink, Yusra R. Shawar, Jeremy Shiffman, et al. (2017). “Early Childhood Development Coming of Age: Science Through the Life Course.” The Lancet, 389.

Bradley, Robert H. and Diane L. Putnick (2012). “Housing Quality and Access to Material and Learning Resources within the Home Environment in Developing Countries.” Child Development.

Britto, Pia R., Stephen J. Lye, Kerrie Proulx, Aisha K. Yousafzai, Stephen G. Matthews, Tyler Vaivada, Rafael Perez-Escamilla, Nirmala Rao, Patrick, Lia C. H. Fernald, et al. (2017). “Nurturing Care: Promoting Early Childhood Development.” The Lancet, 389, –102.

Brooks-Gunn, Jeanne, Cecilia M. McCarton, Patrick H. Casey, Marie C. McCormick, Charles R. Bauer, Judy C. Bernbaum, Jon Tyson, Mark Swanson, Forrest C. Bennett, David T. Scott, et al. (1994). “Early Intervention in Low-Birth-Weight Premature Infants: Results through age 5 Years from the Infant Health and Development Program.” JAMA, 272, 1257–1262.

Cattan, Sarah, Gabriella Conti, Christine Farquharson, and Rita Ginja (2019). “The Health Effects of Sure Start.” Working paper, Institute for Fiscal Studies.

Centro de Estudios sobre Desarrollo Económico (CEDE), Facultad de Economía, Universidad de los Andes (2013). “Encuesta Longitudinal Colombiana de la Universidad de los Andes.” Working paper, Public data. Bogotá, Colombia, https://encuestalongitudinal.uniandes.edu.co/es/datos-elca/2013-ronda-2

Chang, Susan M., Sally M. Grantham-McGregor, Christine A. Powell, Marcos Vera-Hernández, Florencia Lopez-Boo, Helen Baker-Henningham, and Susan P. Walker (2015). “Integrating a Parenting Intervention with Routine Primary Health Care: A Cluster Randomized Trial.” Pediatrics, 136, 272–280.

Cunha

Flávio

Elo

Irma

Culhane

Jennifer

(

2013

). “

Eliciting Maternal Expectations about the Technology of Cognitive Skill Formation

.”

NBER Working Paper No. 19144

Cunha

Flavio

Heckman

James J.

(

2008

). “

Formulating, Identifying and Estimating the Technology of Cognitive and Noncognitive Skill Formation

.”

Journal of Human Resources

738

–

782

Cunha

Flavio

Heckman

James J.

Lochner

Lance

Masterov

Dimitriy V.

(

2006

). “

Interpreting the Evidence on Life Cycle Skill Formation

.” In

Handbook of the Economics of Education

, vol.

, edited by

Hanushek

Welch

chap. 12

Elsevier

, pp.

697

–

812

Cunha

Flavio

Heckman

James J.

Schennach

Susanne M.

(

2010

). “

Estimating the Technology of Cognitive and Noncognitive Skill Formation

.”

Econometrica

883

–

931

PubMed

Devercelli

Amanda Epstein

Sayre

Rebecca Kraft

Denboba

Amina Debissa

(

2016

). “

What do We Know About Early Childhood Development Policies in Low and Middle Income Countries?

”

Working paper

The World Bank

Doyle

Orla

Harmon

Colm P.

Heckman

James J.

Tremblay

Richard E.

(

2009

). “

Investing in Early Human Development: Timing and Economic Efficiency

.”

Economics & Human Biology

–

Engle

Patrice L.

Black

Maureen M.

Behrman

Jere R.

Cabral De Mello

Meena

Gertler

Paul J.

Kapiriri

Lydia

Martorell

Reynaldo

Eming Young

Mary

Child Development Steering Group

International

et al. (

2007

). “

Strategies to Avoid the Loss of Developmental Potential in More than 200 Million Children in the Developing World

.”

The Lancet

369

229

–

242

Engmann

Sonja

Cousineau

Denis

(

2011

). “

Comparing Distributions: The Two-Sample Anderson-Darling Test As An Alternative to the Kolmogorov–Smirnoff Test

.”

Journal of Applied Quantitative Methods

–

Fernald

Lia C. H.

Prado

Elizabeth

Kariger

Patricia

Raikes

Abbie

(

2017a

A Toolkit for Measuring Early Childhood Development in Low and Middle-Income Countries

No. 29000 in World Bank Publications - Books, World Bank

Fernald

Lia C. H.

Kagawa

Rose

Knauer

Heather A.

Schnaas

Lourdes

Garcia Guerra

Armando

Neufeld

Lynnette M.

(

2017b

). “

Promoting Child Development Through Group-Based Parent Support Within a Cash Transfer Program: Experimental Effects on Children’s Outcomes

.”

Developmental Psychology

222

Frongillo

Sywulka

Kariger

Patricia

(

2003

). “

UNICEF Psychosocial Care Indicators Project. Final Report to UNICEF

.”

Working paper

División de Ciencias de la Nutrición, Universidad de Cornell

Gertler

Paul

Heckman

James

Pinto

Rodrigo

Zanolini

Arianna

Vermeersch

Christel

Walker

Susan

Chang

Susan M.

Grantham-McGregor

Sally

(

2014

). “

Labor Market Returns to an Early Childhood Stimulation Intervention in Jamaica

.”

Science

344

998

–

1001

Gibaud-Wallston

Wandersman

L. P.

(

1979

). “

Development and Utility of the Parenting Sense of Competence Scale

.”

Working paper

American Psychological Association

Grantham-McGregor

Walker

(

2015

). “

The Jamaican Early Childhood Home Visiting Intervention

.”

Early Childhood Matters

124

–

Grantham-McGregor

Sally

Adya

Akanksha

Attanasio

Orazio

Augsburg

Britta

Behrman

Jere

Caeyers

Bet

Day

Monimalika

Jervis

Pamela

Kochar

Reema

Makkar

Prerna

et al. (

2020

). “

Group Sessions or Home Visits for Early Childhood Development in India: A Cluster RCT

.”

Pediatrics

146

e2020002725

Grantham-McGregor

Sally

Smith

Joanne A.

(

2016

). “

Extending the Jamaican Early Childhood Development Intervention

.”

Journal of Applied Research on Children: Informing Policy for Children at Risk

Hamadani

Jena D.

Mehrin

Syeda F.

Tofail

Fahmida

Hasan

Mohammad I.

Huda

Syed N.

Baker-Henningham

Helen

Ridout

Deborah

Grantham-McGregor

Sally

(

2019

). “

Integrating an Early Childhood Development Programme into Bangladeshi Primary Health-Care Services: An Open-Label, Cluster-Randomised Controlled Trial

.”

The Lancet Global Health

e366

–

e375

Heckman

James

Pinto

Rodrigo

Savelyev

Peter

(

2013

). “

Understanding the Mechanisms Through Which An Influential Early Childhood Program Boosted Adult Outcomes

.”

American Economic Review

103

(6),

2052

–

2086

Heckman

James J.

(

2006

). “

Skill Formation and the Economics of Investing in Disadvantaged Children

.”

Science

312

1900

–

1902

Heckman

James J.

Holland

Margaret L.

Makino

Kevin K.

Pinto

Rodrigo

Rosales-Rueda

Maria

(

2017

). “

An Analysis of the Memphis Nurse-Family Partnership Program

.”

NBER Working Paper No. 23610

Hjort

Jonas

et al. (

2017

). “

Universal Investment in Infants and Long-Run Health: Evidence from Denmark’s 1937 Home Visiting Program

.”

American Economic Journal: Applied Economics

–

104

Hoddinott

John

Behrman

Jere R.

Maluccio

John A.

Melgar

Paul

Quisumbing

Agnes R.

Ramirez-Zea

Manuel

Stein

Aryeh D.

Yount

Kathryn M.

Martorell

Reynaldo

(

2013

). “

Adult Consequences of Growth Failure in Early Childhood

.”

The American Journal of Clinical Nutrition

1170

–

1178

Hoynes

Hilary

Whitmore Schanzenbach

Diane

Almond

Douglas

(

2016

). “

Long-Run Impacts of Childhood Access to the Safety Net

.”

American Economic Review

106

903

–

934

Kariger

Patricia

Frongillo

Edward A.

Engle

Patrice

Rebello Britto

Pia M.

Sywulka

Sara M.

Menon

Purnima

(

2012

). “

Indicators of Family Care for Development for Use in Multicountry Surveys

.”

Journal of Health, Population, and Nutrition

472

PubMed

Kline

Patrick

Walters

Christopher R.

(

2016

). “

Evaluating Public Programs with Close Substitutes: The Case of Head Start

.”

Quarterly Journal of Economics

131

1795

–

1848

Lawrence

Montie Schweinhart

Jeanne

Zongping

William

S. B.

Clive

R. B.

Milagros

(

2005

). “

Lifetime Effects: The High/Scope Perry Preschool Study Through Age 40

.”

The Academy of Experimental Criminology

–

Selina

Das

Pamela

Horton

Richard

(

2016

). “

A Good Start in Life Will Ensure a Sustainable Future for All

.”

Lancet (London, England)

389

–

Love

John M.

Eliason Kisker

Ellen

Ross

Christine

Raikes

Helen

Constantine

Jill

Boller

Kimberly

Brooks-Gunn

Jeanne

Chazan-Cohen

Rachel

Banks Tarullo

Louisa

Brady-Smith

Christy

et al. (

2005

). “

The Effectiveness of Early Head Start for 3-Year-Old Children and Their Parents: Lessons for Policy and Programs

.”

Developmental Psychology

885

MacPhee

David

(

1981

). “

Knowledge of Infant Development Inventory

.”

Educational Testing Service

Princeton, NJ

McCormick

Marie C.

Brooks-Gunn

Jeanne

Buka

Stephen L.

Goldman

Julie

Jennifer

Salganik

Mikhail

Scott

David T.

Bennett

Forrest C.

Kay

Libby L.

Bernbaum

Judy C.

et al. (

2006

). “

Early Intervention in Low Birth Weight Premature Infants: Results at 18 Years of Age for the Infant Health and Development Program

.”

Pediatrics

117

771

–

780

Moran

Patricia

Ghate

Deborah

Van Der Merwe

Amelia

Policy Research Bureau

(

2004

What Works in Parenting Support?: A Review of The International Evidence

DFES Publications

London

Neville

Helen

Pakulak

Eric

Stevens

Courtney

(

2015

). “

Family-Based Training to Improve Cognitive Outcomes for Children from Lower Socioeconomic Status Backgrounds: Emerging Themes and Challenges

.”

Current Opinion in Behavioral Sciences

166

–

170

Nowak

Christoph

Heinrichs

Nina

(

2008

). “

A Comprehensive Meta-Analysis of Triple P-Positive Parenting Program using Hierarchical Linear Modeling: Effectiveness and Moderating Variables

.”

Clinical Child and Family Psychology Review

114

–

144

Olds

David L.

Henderson

Charles R.

Chamberlin

Robert

Tatelbaum

Robert

(

1986a

). “

Preventing Child Abuse and Neglect: A Randomized Trial of Nurse Home Visitation

.”

Pediatrics

–

Olds

David L.

Henderson

Charles R.

Tatelbaum

Robert

Chamberlin

Robert

(

1986b

). “

Improving the Delivery of Prenatal Care and Outcomes of Pregnancy: A Randomized Trial of Nurse Home Visitation

.”

Pediatrics

–

Olds

David L.

Henderson Jr

Charles R.

Kitzman

Harriet

(

1994

). “

Does Prenatal and Infancy Nurse Home Visitation Have Enduring Effects on Qualities

.”

Child Abuse and Neglect

–

Padilla

E. R.

Lugo

D. E.

Dunn

L. M.

(

1986

Test de Vocabulario en Imágenes Peabody (TVIP)

American Guidance Service

Circle Pines, MN

Pedersen

F. A.

Bryan

Y. E.

Huffman

Del Carmen

(

1989

). “

Constructions of Self and Offspring in the Pregnancy and Early Infancy Periods

.”

Working paper

Society for Research in Child Development

Pongcharoen

Tippawan

Ramakrishnan

Usha

DiGirolamo

Ann M.

Winichagoon

Pattanee

Flores

Rafael

Singkhornard

Jintana

Martorell

Reynaldo

(

2012

). “

Influence of Prenatal and Postnatal Growth on Intellectual Functioning in School-Aged Children

.”

Archives of Pediatrics & Adolescent Medicine

166

411

–

416

Porter

Christin L.

Hsu

Hui-Chin

(

2003

). “

First-Time Mothers’ Perceptions of Efficacy During the Transition to Motherhood: Links to Infant Temperament

.”

Journal of Family Psychology

–

Robling

Michael

Bekkers

Marie-Jet

Bell

Kerry

Butler

Christopher C.

Cannings-John

Rebecca

Channon

Sue

Corbacho Martin

Belen

Gregory

John W.

Hood

Kerry

Kemp

Alison

et al. (

2016

). “

Effectiveness of a Nurse-Led Intensive Home-Visitation Programme for First-Time Teenage Mothers (Building Blocks): A Pragmatic Randomised Controlled Trial

.”

The Lancet

387

146

–

155

Romano

Joseph P.

Wolf

Michael

(

2005

). “

Stepwise Multiple Testing as Formalized Data Snooping

.”

Econometrica

1237

–

1282

Romano

Joseph P.

Wolf

Michael

(

2016

). “

Efficient Computation of Adjusted P-Values for Resampling-Based Stepdown Multiple Testing

.”

Statistics & Probability Letters

113

–

Rubio-Codina

Marta

Attanasio

Orazio

Meghir

Costas

Varela

Natalia

Grantham-McGregor

Sally

(

2015

). “

The Socioeconomic Gradient of Child Development: Cross-Sectional Evidence from Children 6–42 Months in Bogota

.”

Journal of Human Resources

464

–

483

Shonkoff

Jack P.

Garner

Andrew S.

Siegel

Benjamin S.

Dobbins

Mary I.

Earls

Marian F.

McGuinn

Laura

Pascoe

John

Wood

David L.

Dependent Care Committee on Psychosocial Aspects of Child, Family Health, Committee on Early Childhood, Adoption

et al. (

2012

). “

The Lifelong Effects of Early Childhood Adversity and Toxic Stress

.”

Pediatrics

129

e232

–

e246

Singla

Daisy R.

Kumbakumba

Elias

Aboud

Frances E.

(

2015

). “

Effects of a Parenting Intervention to Address Maternal Psychological Wellbeing and Child Development and Growth in Rural Uganda: A Community-Based, Cluster-Randomised Trial

.”

The Lancet Global Health

e458

–

e469

Squires

Bricker

Twombly

(

2009

Technical Report on ASQ: SE

Paul H. Brookes Publishing

Baltimore, Co

Walker

Susan P.

Chang

Susan M.

Powell

Christine A.

Grantham-McGregor

Sally M.

(

2005

). “

Effects of Early Childhood Psychosocial Stimulation and Nutritional Supplementation on Cognition and Education in Growth-Stunted Jamaican Children: Prospective Cohort Study

.”

The Lancet

366

1804

–

1807

Walker

Susan P.

Chang

Susan M.

Powell

Christine A.

Simonoff

Emily

Grantham-McGregor

Sally M.

(

2006

). “

Effects of Psychosocial Stimulation and Dietary Supplementation in Early Childhood on Psychosocial Functioning in Late Adolescence: Follow-Up of Randomised Controlled Trial

.”

BMJ

333

472

Walker

Susan P.

Chang

Susan M.

Smith

Joanne A.

Baker-Henningham

Helen

(

2018

). “

The Reach up Early Childhood Parenting Program: Origins, Content, and Implementation

.”

Zero to Three

–

Walker

Susan P.

Chang

Susan M.

Vera-Hernández

Marcos

Grantham-McGregor

Sally

(

2011

). “

Early Childhood Stimulation Benefits Adult Competence and Reduces Violent Behavior

.”

Pediatrics

127

849

–

857

World Health Organization

(

2006

WHO Child Growth Standards: Length/Height-For-Age, Weight-For-Age, Weight-For-Length, Weight-For-Height and Body Mass Index-For-Age: Methods and Development

World Health Organization

World Health Organization

(

2007

WHO Child Growth Standards: Head Circumference-For-Age, Arm Circumference-For-Age, Triceps Skinfold-For-Age and Subscapular Skinfold-For-Age: Methods and Development

World Health Organization

Yoshikawa

Weiland

Brooks-Gunn

Burchinal

M. R.

Espinosa

L. M.

Gormley

W. T.

Zaslow

M. J.

(

2013

Investing in Our Future: The Evidence Base on Preschool Education

Society for Research in Child Development

Yoshikawa

Hirokazu

Weiland

Christina

Brooks-Gunn

Jeanne

(

2016

). “

When does Preschool Matter?

”

The Future of Children

–

Yousafzai

Aisha K.

Obradović

Jelena

Rasheed

Muneera A.

Rizvi

Arjumand

Portilla

Ximena A.

Tirado-Strayer

Nicole

Siyal

Saima

Memon

Uzma

(

2016

). “

Effects of Responsive Stimulation and Nutrition Interventions on Children’s Development and Growth at Age 4 Years in a Disadvantaged Population in Pakistan: A Longitudinal Follow-up of a Cluster-Randomised Factorial Effectiveness Trial

.”

The Lancet Global Health

e548

–

e558

Yousafzai

Aisha K.

Rasheed

Muneera A.

Rizvi

Arjumand

Armstrong

Robert

Bhutta

Zulfiqar A.

(

2014

). “

Effect of Integrated Responsive Stimulation and Nutrition Interventions in the Lady Health Worker programme in Pakistan on Child Development, Growth, and Health Outcomes: A Cluster-Randomised Factorial Effectiveness Trial

.”

The Lancet

384

1282

–

1293