Eva-Maria Egger, Cecilia Poggi, Héctor Rufrancos, Does the depth of informality influence welfare in urban Sub-Saharan Africa?, Oxford Economic Papers, Volume 76, Issue 1, January 2024, Pages 187–206, https://doi.org/10.1093/oep/gpac052
Abstract
We explore the relationship between household welfare and informality, measuring household informality as the share of members’ activities (hours worked or income) without social insurance. We discretize these measures into four bins or portfolios and assess their influence on consumption, as a measure of welfare. Cross-sectional regressions for five urban Sub-Saharan African countries reveal a non-linear relationship between the depth of informality and household welfare. A mixed formality household portfolio has at least the same welfare as a fully formal one. Using panel data for Nigeria, we assess household switches in informality portfolios, accounting for the selection on unobservables, and find it explains most welfare differences. Switching informality portfolios does not change welfare trajectories, with the notable exception of welfare gains for fully informal households becoming fully formal. From a policy perspective, our results suggest that policies incentivizing the formalization of the marginal worker may not result in perceivable welfare effects.
1. Introduction
Recent estimates suggest that around 70% of the workers in developing countries are informal and that the size of the informal workforce is positively correlated to poverty rates (Ohnsorge and Yu, 2021) and negatively correlated to Gross Domestic Product per capita (Slonimczyk, 2014). Efforts are ongoing among researchers and policy-makers to understand better how informality affects the well-being of populations (e.g. Freeman, 2010; ILO, 2018; OECD and ILO, 2019; Ulyssea, 2020; Deléchat and Medina, 2021). While poverty is usually measured at the household level, informality is mainly referred to as an individual worker’s or enterprise’s characteristic. In this article, we argue that in the context of welfare analysis, informality should not be considered solely from an individual but also from the household perspective.
For individual workers, the literature shows that by moving into formal employment, workers can realize income gains (Danquah et al., 2021). Existing research indicates that some policies favour this formalization and thus contribute to income growth (Jessen and Kluve, 2021), but not all policies are effective, like reducing the costs of entering the formal sector (Ulyssea, 2020). A reduction in informality at the country level is not always associated with a reduction in poverty (OECD and ILO, 2019), which raises questions about how the composition of informality influences the welfare of households. Informality is associated with risks for livelihoods, such as lack of insurance against a health shock. The dearth of protective measures in case of need either directly pulls people into poverty or keeps them poor by changing their behaviour to less risky and less profitable activities (Dercon, 2002; Gertler and Gruber, 2002). Moreover, households may rely on members’ employment in specific activities as a risk-coping strategy, such that the income generation portfolio of a household may directly influence its welfare trajectory.
We define two continuous measures of the depth of informality at the household level, one defined as the share of income from informal activities and the other as the proportion of informal labour. We then discretize these household measures into four bins or portfolios and investigate how such informality portfolios relate to household welfare in low-income settings. This analysis does not propose or test a specific theoretical model. However, it assumes an interplay between labour market participation and the choice of a formal or informal occupation that could directly affect the overall welfare of the household. We define a formal wage occupation as wage employment with access to social insurance, such as health insurance or pension coverage.1 Workers in formal wage occupations are more difficult to lay off, so such occupations provide greater job stability and a more stable wage stream to contribute to personal as well as household expenditures. An underlying assumption of our analysis is that social insurance through formal wage employment could serve as a buffer now or in the future and that its selection (beyond availability) may depend on the value that people attach to its benefits. In principle, wage earners may accept somewhat lower wages (net of social security contributions) if the social security benefits are considered worth that sacrifice. Moreover, depending on the national setting, the availability of health insurance could in principle reduce out-of-pocket health expenditures.
Informal jobs are likely to be more readily available (there are fewer barriers to recruitment, they may abound seasonally, and remuneration may be instantaneous, with daily or task-based payments) and are characterized by a greater degree of flexibility than formal ones, being part-time, with variable shifts, allowing shorter distances like home-based work, allowing childcare in the workplace, etc. There thus exists an important inter-dependence between labour market participation and informality, as many of its defining traits may be pivotal to the participation of some groups, like women. Our analysis starts from the premise that, even in urban labour markets where formal employment opportunities are scarce, households choose ‘portfolios’ of formal and informal job combinations. Different combinations of jobs at the household’s disposal could in principle diversify income and thus improve its welfare. Thus, it is an empirical question whether the welfare of a household is affected by its portfolio composition in an urban developing country setting.
We do not model individual choices between formal and informal jobs (Falco, 2014), nor intra-household dynamics on the determinants of expenditure (Hoddinott and Haddad, 1995). We implicitly assume that portfolios are derived from a model of common preferences among household members (Thomas, 1990) jointly diversifying their production and/or pooling resources (Levy, 2008, Ch. 3). Thus, a household’s portfolio characterizes the income generation capacity among formal and informal wage-employment activities. An example of modelling such a setting can be found in Anderberg (2003).
Our analysis consists of two stages. First, we perform a piece-wise non-linear regression analysis relating the depth of household informality to household consumption in urban areas of five low-income countries. We compare this relationship to that of using a simple dummy of informality of the household head. This measure is commonly used to control for employment characteristics in welfare analysis at the household level. We draw on cross-sectional data of the nationally representative Living Standards Measurement Study—Integrated Surveys on Agriculture (LSMS-ISA) from Ethiopia, Malawi, Niger, Nigeria, and Tanzania. All countries have both sizeable informal sector participation and some form of social insurance that, albeit fragmented, is an option to some occupational arrangements within and outside of public sector wage employment. It is widely documented that informality beyond self-employed farm work is pervasive in the African region, also among wage workers and especially in urban areas (see e.g. OECD and ILO, 2019; Danquah et al., 2021; Azunre et al., 2022).
Secondly, we assess the role of transitions between different informality portfolios for household welfare in a panel analysis using the LSMS-ISA data from Nigeria. We investigate whether the relationships estimated from the cross-sectional data are robust to longitudinal estimation. We study six potential portfolio changes. Building upon the labour economics literature that inspects participation into activities such as unionization (e.g. Freeman, 1984; Swaffield, 2001; Addison et al., 2014; Gutierrez Rufrancos, 2019) or informality (e.g. Gong et al., 2004; Bosch and Maloney, 2010), we compare changes in the households’ income portfolio (between fully formal, mixed, and fully informal) in a traditional dynamic setting between status ‘switchers’ versus ‘stayers’ using a two-period two-way fixed effects (TWFEs) estimation strategy. However, this approach is beset by the issue of households selecting into changes. In order to mitigate these issues and that of unobserved heterogeneity, we adopt a strategy that compares portfolio switchers with those who make the same transition in the following period (Goodman-Bacon, 2021).
The findings of our analysis provide four main insights: First, both our measures of depth of informality, defined as income or labour input shares, yield similar results. Thus, one can apply our definition also in settings where only one of the two is available in the data. Secondly, the welfare penalty for an informal household head is directly comparable with that of fully informal households as defined by our measure. Thirdly, using cross-sectional data, we find that households with a mix of formal and informal income sources (or activities) show no worse welfare outcomes than fully formal households in all five countries and consistently better outcomes than fully informal ones. We find that households with some formal employment fare even better if the household diversifies its income-generating portfolio with additional income from informal activities. Thus, the literature’s focus on a dummy capturing the household head’s informality may obscure the income diversification effects found. Finally, we adopt a differences-in-differences strategy using Nigerian data to investigate household switching. We account for the selection on unobservables by comparing switchers with soon-to-be switchers and find that this explains the cross-sectional gap we observed almost entirely. This finding indicates that households optimize their portfolio seeking to smooth out welfare changes. We also find a substantial increase in consumption of 39.5% after moving from full informality to full formality, but we do not find any other switches resulting in welfare changes. Although these estimates are limited in that they pertain to a short time period, we confirm the robustness of our findings to small sample sizes by applying randomization inference (Young, 2019).
This article contributes to two distinct strands of literature. First, it speaks to the development literature that studies household welfare maximization (Dercon, 2002; Attanasio and Lechene, 2002; De Weerdt and Dercon, 2006). There is growing evidence that social insurance and assistance can reach beyond the direct beneficiary, as is the case for the social pension in South Africa (Duflo, 2000, 2003; Bertrand et al., 2003; Burns et al., 2005; Posel et al., 2006). The welfare literature seldom considers informality, and when it does, it is often characterized solely through binary classification, such as via the household head’s informality status. However, the information of the household head alone does not sufficiently inform anti-poverty policies (Brown and van de Walle, 2021). This study takes a more nuanced approach, by considering the composition of a household’s employment portfolio. Secondly, it speaks to the labour literature on employment transitions or earnings gaps between informal and formal sectors (see e.g. Maloney, 1999; Gong et al., 2004; Bosch and Maloney, 2007, 2010; Bosch et al., 2007; Falco et al., 2011; Nordman et al., 2016; Danquah et al., 2021). In the few studies on informality in Sub-Saharan Africa (SSA), identified factors influencing the variance of individual earnings between formal and informal occupations are occupational differences (high or low skilled), unobserved marketability, relative position in the earnings distribution, but also sectoral effects matter like the size of the enterprise where the workers are engaged (Falco et al., 2011; Nordman et al., 2016; Danquah et al., 2021). None of these approaches consider the whole household.
In the next section, we present the data and variables. Section 3 briefly proposes the estimation strategy of the cross-sectional regressions and their results. The model specification and results for the dynamic analysis in urban Nigeria are presented and discussed in Section 4. Section 5 concludes.
2. Data, variables, and descriptive analysis
2.1 Data
The data used for this study are the LSMS-ISA from five countries in SSA, namely Ethiopia, Malawi, Niger, Nigeria, and Tanzania.2 The surveys were chosen for three reasons. First, the comparability of survey questionnaires allows us to construct the same variables for all countries. There are still some limitations in the identification of access to social insurance, employment status, or type due to differences in the content of questionnaires across countries as well as different social insurance systems. When applicable, we comment on the comparability between countries. Secondly, all household surveys are nationally representative and comprehensively cover both income from and labour input in various livelihood activities. This enables us to construct indicators of formality at the household level that go beyond simple headcounts of formal workers. Furthermore, the surveys all include comparable extensive consumption modules which can be used for the welfare outcome. Thirdly, in contrast to, for example, Latin American countries, social insurance coverage remains sparse in SSA but is increasingly gaining attention in the policy debate. We focus on urban areas where the scarce opportunities for social insurance through wage employment are most common.3 In all countries, some urban households had to be dropped from the sample due to missing observations in the relevant variables after cleaning the data.
2.2 Measuring informality at the household level and welfare
We construct two measures of informality at the household level: the share of all informal full-time equivalents (FTEs) worked and the share of income from informal sources relative to all income-generating activities of the household. These measures allow for one household member to work in different activities, of which one or both could be informal. Thus, they account for secondary job holdings and do not simply count informal job holders among household members. Our analysis is concerned with the overall household informality level; as such, self-employment is considered informal. The wage formality definition related to social security coverage by country is further detailed in Supplementary Appendix A, Section A1. In order to estimate heterogeneity in relatively small urban samples (cross-sectional observations in the urban LSMS data range between 1,100 and 1,600), for the cross-sectional analysis, we group households into four bins by the size of their informality share, clustering households without any informal income, those with 1% to 50%, those with 51% to 99%, and those with 100% informal income. In the panel analysis, we explore dynamic transitions over three bins (fully informal, mixed, and fully formal).
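The four-bin grouping can be sketched as follows. This is an illustrative sketch only: the function name and bin labels are hypothetical, and the handling of fractional shares (for example 0.4% or 50.5%) is an assumption, since the text states only the integer bin edges.

```python
def informality_bin(share_pct: float) -> str:
    """Assign a household's informal share (in percent, 0-100) to one of four bins.

    Assumption (not stated in the text): any positive share up to and
    including 50 falls in the first mixed bin, and any share strictly
    between 50 and 100 falls in the second mixed bin.
    """
    if not 0.0 <= share_pct <= 100.0:
        raise ValueError("share must lie between 0 and 100")
    if share_pct == 0.0:
        return "0% informal (fully formal)"
    if share_pct <= 50.0:
        return "1-50% informal"
    if share_pct < 100.0:
        return "51-99% informal"
    return "100% informal (fully informal)"
```

For example, `informality_bin(30.0)` returns `"1-50% informal"`, while a household with only informal income, `informality_bin(100.0)`, lands in the fully informal bin.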
FTE shares are computed as follows. First, all hours worked annually in each activity are computed for each working-age household member and assigned to be formal or informal based on our formality definition. Then, these hours are divided by the number of hours in a full-time working year. Full-time work is assumed to be 12 months per year, 4.3 weeks per month, and 40 h per week resulting in 2,016 h per year. All FTEs are set to 0 for those not engaging in any activity, and a maximum of 16 h of work per day and 52 weeks a year is imposed. Despite these limits, it is still possible for an individual to have more than two FTEs in total due to multiple employment activities, so we re-scale these to a maximum of two per individual, distributed across the various activities in the same proportion as before the capping at two. FTEs are then summed up at the household level by formal and informal activities, and the share of informal FTEs over all FTEs of all household members is computed.
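A minimal sketch of these steps is given below, assuming a simple input layout in which each working-age member is a list of `(annual_hours, is_informal)` pairs. The function names and the data layout are hypothetical. The annual full-time figure is taken from the text as 2,016 hours and kept as a parameter, since the product 12 × 4.3 × 40 would instead give 2,064. The cap of 16 hours per day and 52 weeks per year is assumed to have been applied to the hours before this step.

```python
HOURS_PER_FTE = 2016.0      # annual full-time hours as reported in the text
MAX_FTE_PER_PERSON = 2.0    # individual cap described in the text


def person_ftes(activities, hours_per_fte=HOURS_PER_FTE):
    """Return (formal_fte, informal_fte) for one working-age member.

    activities: list of (annual_hours, is_informal) pairs; an empty list
    means the member does not work and contributes zero FTEs.
    """
    formal = sum(h for h, informal_flag in activities if not informal_flag) / hours_per_fte
    informal = sum(h for h, informal_flag in activities if informal_flag) / hours_per_fte
    total = formal + informal
    if total > MAX_FTE_PER_PERSON:
        # Re-scale to two FTEs, keeping the formal/informal proportions.
        scale = MAX_FTE_PER_PERSON / total
        formal *= scale
        informal *= scale
    return formal, informal


def household_informal_fte_share(members):
    """Share of informal FTEs in all household FTEs (None if nobody works)."""
    formal = informal = 0.0
    for activities in members:
        f, i = person_ftes(activities)
        formal += f
        informal += i
    total = formal + informal
    return None if total == 0 else informal / total
```

As a usage example under these assumptions, a household with one member working 2,016 formal hours and another working 1,008 informal hours has 1 formal and 0.5 informal FTEs, so `household_informal_fte_share([[(2016, False)], [(1008, True)]])` gives an informal share of one third.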
To get informal household income shares, we compute the income from each activity that is assigned as formal or informal and then aggregate these at the household level. The share of income from informal activities relative to total household income is computed. In Supplementary Appendix A, Section A2, we report descriptive statistics at the country level of these measures compared with a binary household head measure. We show that the average share of informal income varies between 80% in Ethiopia and 88% in Tanzania and that the magnitude is relatively consistent across definitions (Supplementary Table A2). Moreover, in Supplementary Table A3, Section A2, we explore and discuss for each country the distribution of households across the informal income and FTE shares. We find, similarly across urban areas, that there are indeed heterogeneous household groupings of job types, with the bin 1–50% of informal jobs accounting for around 10% of households across countries (8–9% applying the FTE measure) and the bin 51–99% of informal jobs being smaller at roughly 5% of households using both measures.
Our outcome of interest is household welfare, measured as daily per capita expenditure expressed in constant US$ Purchasing Power Parity from 2011. This variable is expressed on a logarithmic scale to reduce the influence of outliers and express differences in terms of percentages. Although for our main analysis we consider only consumption, we provide extensive additional results showing the robustness of our story to an alternative welfare indicator in the form of poverty in Supplementary Appendix B, and other measures of welfare such as a wealth index in Supplementary Appendix C.
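Because the outcome is in logs, a gap of β log points between two groups corresponds to an exact proportional difference of exp(β) − 1, which is close to β itself only when β is small. The helper below is a hypothetical illustration of that conversion; it is not taken from the article, and how the article converts its coefficients into percentages is not stated here.

```python
import math


def log_gap_to_percent(beta: float) -> float:
    """Convert a gap in log expenditure into an exact percentage difference."""
    return (math.exp(beta) - 1.0) * 100.0
```

For small gaps the two readings nearly coincide, while for larger gaps the exact percentage and the log-point value diverge noticeably.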
Looking at household welfare levels and other characteristics (Supplementary Table A4) sheds some more light on our newly proposed measures. In terms of welfare (per capita consumption), we observe that households with a fully formal income portfolio (first panel of Supplementary Table A4) have higher expenditure levels than fully informal households (last panel). This also confirms that formal jobs receive on average higher remuneration. However, the households with informality shares between these extremes reveal a less obvious pattern. For example, in Ethiopia, Niger, and Tanzania, average consumption is higher among households with 1% to 50% informal income than for those with no informal income. In the other two countries, this is not the case, but consumption levels are closer to those of a fully formal household than a fully informal household. The households with 51% to 99% of their income from informal sources have welfare levels between the fully formal and fully informal ones, and the differences vary by country. One argument for the similar welfare levels of the first two groups (0% informal, 1–50% informal) could be that in households with 1–50% informal income one member has a formal job and another member works informally, resulting in a higher total income and welfare than a household with 0% informal income where only one member works in a formal job. Looking at the number of jobs, household size, and dependency ratios, we observe some indications of this mechanism. Households are on average larger, have a lower dependency ratio, and report more jobs in the categories of a mix of informal and formal income compared to households with a fully formal or fully informal income portfolio. In terms of education, households with any formal income, that is, with an informal income share of less than 100%, are better educated. In contrast, land ownership seems positively correlated with the share of informal income.
In the countries of this study, land ownership goes hand in hand with some agricultural activity by the household, which is considered informal by our definition. Thus, households in the first group, the fully formal ones, barely own any land.
3. Depth of informality in cross-sectional setting
3.1 Estimation strategy
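We run a simple regression of the welfare outcome Yi on our informality measure Inf and other household, Xi, and local, Zl, characteristics:

$$Y_i = \alpha + \sum_{k=1}^{3} \beta_k \, \mathit{Inf}_{ik} + X_i'\gamma + Z_l'\delta + \varepsilon_i \qquad (1)$$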
Yi is the log of per capita expenditure. The βk are the coefficients on our continuous informality measure, which has been discretized into four bins. The lowest bin represents 0.1% to 50% of informality, followed by 51% to 99% of informality; the final bin captures all observations with 100%, meaning households whose income is earned fully from, or whose FTEs are all worked in, the informal sector. These dummies are all relative to the base category of 0 informal work or income shares, that is, the households that are fully formal. The logic behind this functional form is to avoid rigidly assuming that there is a linear dose–response function with respect to a household’s informality mix. We use only four bins due to the limited sample size for the urban areas of the cross-sectional data used. Xi is a vector of household-level controls and includes the sex and age of the household head, the share of household members with secondary education, the household size, the dependency ratio, a dummy for whether the household owns any land, and the sum of jobs from all working household members. The vector Zl is a vector of geographic controls that include dummies for the administrative areas of the highest level to capture structural differences between regions.
The household characteristics in the vector Xi are not only included because they are relevant determinants of household welfare levels widely used in the literature, but they are also confounding factors when assessing the relationship between our informality measures and welfare. As reported in detail in Supplementary Appendix A, descriptive statistics for the informality portfolios in the urban areas under analysis show that households with a mixed-income/activity portfolio in terms of informality shares are on average larger and have a lower dependency ratio than other portfolios. Thus, a concern could arise that we estimate a simple mechanical relation wherein more jobs equal more income. As the estimation accounts for potential workers in a household as relevant confounding factors in the regressions, our informality shares are not simple job counts positively correlated with household size (extensive margin) but reflect labour allocation decisions across formal and informal activities at the intensive margin. Controlling for household size, dependency ratio, and number of jobs thus enables us to provide a meaningful interpretation of the estimation results.
The results of these regressions will be contrasted with those of a regression where a simple dummy for a household head working in the informal sector is used instead of our gradual measure of informality.
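The binned specification can be illustrated with a small, self-contained sketch: log expenditure is regressed on an intercept, three bin dummies (base category: fully formal), and household controls. Everything here is hypothetical: the pure-Python OLS helper, the single control (household size), and the noise-free synthetic data, which are constructed from known coefficients only to show that the dummies recover the gaps relative to the base category. Geographic controls are omitted for brevity.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Bin dummies; the base category "0" (fully formal) is omitted.
BINS = ("1-50", "51-99", "100")


def design_row(bin_label, controls):
    """Intercept, three informality-bin dummies, then household controls."""
    return [1.0] + [1.0 if bin_label == b else 0.0 for b in BINS] + list(controls)


# Hypothetical coefficients: intercept, three bin gaps, household size.
TRUE = [5.0, 0.10, -0.05, -0.30, -0.02]
# Hypothetical households: (informality bin, household size).
sample = [("0", 2), ("0", 5), ("1-50", 3), ("1-50", 6),
          ("51-99", 4), ("51-99", 7), ("100", 2), ("100", 8)]
rows = [design_row(b, [size]) for b, size in sample]
y = [sum(c * x for c, x in zip(TRUE, r)) for r in rows]  # noise-free log expenditure
coef = ols(rows, y)  # recovers TRUE up to floating-point error
```

Replacing the three bin dummies with a single indicator for an informal household head would give the comparison regression described above, under the same assumptions.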
3.2 Results
Regression results are plotted as coefficient graphs for ease of presentation. The graphs include the coefficient of the simple informal household head dummy (labelled as Dummy) and then, from a separate regression, plot the three coefficients of each informality bin compared to the 0 bin (= fully formal). The full regression results with control variables can be found in the Supplementary Appendix E. Figures 1 and 2 plot the coefficients of the regressions of household consumption on the income share earned from informal sources (Figure 1) or the share of FTEs worked in informal activities (Figure 2). The first coefficient in each figure is from the separate regression with the dummy of the informal status of the household head.

Figure 1. Coefficients of informality measures from regression of per capita expenditure. Share of income earned from informal sources. Base category is fully formal income.
Notes: The graphs plot coefficients and confidence intervals from two different regressions for each country. The first coefficient is that of the dummy indicating an informal household head from one regression. The other three coefficients are those of the informality bins from the regression as specified in Equation (1). The base category is households with no informal income source.
Source: Authors’ compilation based on cross-sectional data.

Figure 2. Coefficients of informality measures from regression of per capita expenditure. Share of FTEs worked in informal activities. Base category is fully formal activities.
Notes: The graphs plot coefficients and confidence intervals from two different regressions for each country. The first coefficient is that of the dummy indicating an informal household head from one regression. The other three coefficients are those of the informality bins from the regression as specified in Equation (1). The base category is households with no informal income source.
Source: Authors’ compilation based on cross-sectional data.
In Malawi, Niger, Nigeria, and Tanzania, households with an informally employed household head have significantly lower levels of consumption than households with formally employed heads: per capita expenditure is around 20% lower in Niger and around 30% lower in Malawi, Nigeria, and Tanzania. In Ethiopia, this relationship falls just short of statistical significance, and it is the only country where we do not find significant consumption differences for any of the informality measures. In Malawi, Niger, and Nigeria, the size of these expenditure gaps is the same as when comparing households with 100% of household income earned from informal sources to those with fully formal income sources (fourth coefficient).
Looking at the partially formal households (second and third coefficient), in all countries, there are no significant differences between partially formal and fully formal households. In Malawi and Tanzania, when using labour shares (Figure 2), households with less than half of their income from informal sources are even richer than fully formal households by 15% and 43%, respectively.
One channel through which these informality portfolios translate into different consumption levels is through accumulation of durable goods or housing. Thus, we conduct the same analysis using a wealth index as a dependent variable and confirm the same patterns for expenditure levels. Results are presented in Supplementary Appendix C and Supplementary Figures C1 and C2.
The results presented provide interesting insights in three aspects: First, the informality status of a household head is strongly associated with our measure of fully informal households in terms of income shares or FTE shares. However, they do not always yield the same results and most importantly, the simple household head measure obscures important nuances of household income generation and welfare. These are revealed in the second aspect. Households with a mix of income sources or activities show better welfare outcomes than fully informal households and in some countries even better outcomes than fully formal households. It appears that households with a formally employed member fare even better if the household earns other income or spends time in other activities that are in the informal sector controlling for household size, dependency ratio, and the number of jobs. In some countries, an income or activity share in formal activities of less than half can already make the household better off than fully informal income generation. Lastly, comparing results using income shares with those using FTE shares we find comparable patterns. This encourages the applicability of our measure in contexts when only income or only work hour data are available. Overall, it should be highlighted that the results are impressively consistent across very different country contexts. Furthermore, we replicate the analysis using poverty status as outcome instead of consumption (see Supplementary Appendix B). This allows us to zoom in on a relevant threshold within the consumption distribution. The results confirm our main insights.
4. Depth of informality in a dynamic setting
In this section, we take a dynamic perspective on how changes in the depth of informality within a household are associated with changes in its welfare, accounting for several econometric concerns arising from the cross-sectional analysis.
The limits of estimating Equation (1) in a cross-section lie in the presence of unobserved characteristics that, on the one hand, may determine simultaneously a household’s income portfolio as well as its welfare outcomes. On the other hand, such traits could predict a household’s selection into formal or informal income-generating activities. These issues can be addressed with longitudinal data in which we observe households in several time periods, to then apply an estimator that accounts for unobserved characteristics that are non-time varying. In general, there are examples in the literature that capture the information contained in panel data to investigate individual worker transitions between the informal and formal sectors (Bosch and Maloney, 2010; Danquah et al., 2021). However, to the best of our knowledge, none of this literature has considered informality transitions more widely as a household decision.
To that aim, we use three waves of the Nigerian LSMS-ISA panel data, namely waves 1–3 corresponding to the years 2010, 2012, and 2015/16.4 While the other countries also include panel elements, none fulfilled the requirements of comparability of relevant variables for three consecutive waves. We use only FTE shares to measure the depth of informality, again due to varying questionnaire designs that did not allow us to capture income shares consistently. However, based on the cross-sectional results, we are confident that the dynamic FTE results would be very similar if we used income shares instead.
4.1 Estimating transitions
We define three categories of the depth of informality: Fully informal (I), mixed (M), and fully formal (F) and then assess transitions over time between these states. Our dynamic analysis proceeds in three steps. First, we document the differences in welfare by each type of transition that a household can make. Then we focus these transitions on the directions of interest that are in and out of full informality. Lastly, we apply the most robust definition of control groups and transition directions to control also for selection on unobservables.
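The three states and the transitions between them can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, the informal FTE share is taken to lie between 0 and 1, and "fully" is taken to mean a share of exactly 1 or exactly 0. Labels are written origin first, then destination, so a move from fully informal to fully formal is "IF".

```python
from itertools import product

STATES = ("I", "M", "F")  # fully informal, mixed, fully formal


def depth_category(informal_share: float) -> str:
    """Map an informal FTE share in [0, 1] to I, M, or F."""
    if not 0.0 <= informal_share <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    if informal_share == 1.0:
        return "I"
    if informal_share == 0.0:
        return "F"
    return "M"


def transition_label(share_before: float, share_after: float) -> str:
    """Two-letter label: state in the first wave, then state in the next wave."""
    return depth_category(share_before) + depth_category(share_after)


ALL_PATHS = ["".join(p) for p in product(STATES, repeat=2)]
SWITCHES = [p for p in ALL_PATHS if p[0] != p[1]]  # six possible switches
STAYS = [p for p in ALL_PATHS if p[0] == p[1]]     # three ways of staying put
```

With three states there are nine ordered pairs in total, of which six involve a change of state and three do not.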
For the first part, households may switch status from one of three types into any permutation. This yields six possible transitions that a household may make over two periods and three instances of staying put. We compute statistics of the means before and after transition by type of switcher. Table 1 presents the results of a simple longitudinal analysis of these data. We compare transitions across a one period ‘hop’, so we are only ever considering the short-run effects of moving from one status to another between two consecutive waves. This approach is adopted to mitigate any potential issues in time-series estimation such as inconsistent standard errors due to serial correlation as raised by Bertrand et al. (2004). The table shows five columns. It first records the mean outcome for the relevant group’s welfare indicator before and after their transition (Columns 1 and 2). Column (3) in the table presents the raw gap between the relevant group and the never-changers group relative to staying in one’s origin for both periods. For example, this would imply that those going from having a fully informal household portfolio to a mixed portfolio are compared with those households who across two periods would always remain informal. Column (4) presents the conditional gap where we control for land ownership, female headship, share of secondary school leavers, and year-fixed effects. These estimates treat the data as a pooled cross-section. Finally, Column (5) provides TWFEs results. While there is variation in the estimates in Column (5), we observe a general pattern. The TWFEs estimates are generally smaller than the cross-sectional estimates. This suggests that to the extent that anything is uncovered in the cross-section, these estimates may not ultimately hold up in longitudinal analysis as estimated effects rely on variation from groups that may not be appropriate control groups.
For the majority of comparisons, the TWFEs estimates are at least half the size of the cross-sectional estimate.5
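The counting of transitions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names (`transition_label`, `stack_hops`, `STATES`) are hypothetical and are not taken from the authors' code, and the household history is a made-up example.

```python
from itertools import product

# The three portfolio states used in the text: fully informal, mixed, fully formal.
STATES = ("I", "M", "F")


def transition_label(before: str, after: str) -> str:
    """Return a two-letter label such as 'IF' for one hop between consecutive waves."""
    if before not in STATES or after not in STATES:
        raise ValueError(f"unknown state in ({before!r}, {after!r})")
    return before + after


# All ordered pairs of states: 3 x 3 = 9 labels in total.
ALL_PAIRS = [transition_label(b, a) for b, a in product(STATES, repeat=2)]
STAYERS = [p for p in ALL_PAIRS if p[0] == p[1]]    # II, MM, FF
SWITCHERS = [p for p in ALL_PAIRS if p[0] != p[1]]  # the six possible transitions


def stack_hops(history):
    """Split a household's sequence of states across waves into one-period hops."""
    return [transition_label(b, a) for b, a in zip(history, history[1:])]


# Hypothetical household observed in three waves: informal, informal, formal.
example_hops = stack_hops(["I", "I", "F"])
```

Stacking hops this way means a household observed in three waves contributes two hops, which is why observations from the middle wave appear twice in the stacked data.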
Portfolio versus control | (1) Before | (2) After | (3) Raw Δ | (4) Cond Δ | (5) DiD |
---|---|---|---|---|---|
IIvFF | 4.684*** | 5.220*** | −0.607*** | −0.509*** | −0.021 |
 | (0.025) | (0.027) | (0.137) | (0.071) | (0.080) |
N | 1,852 | 1,852 | 3,836 | 3,836 | 3,836 |
FIvFF | 5.248*** | 5.780*** | −0.045 | 0.005 | −0.032 |
 | (0.210) | (0.247) | (0.257) | (0.147) | (0.133) |
N | 40 | 40 | 212 | 212 | 212 |
IFvII | 4.744*** | 5.578*** | −0.398** | −0.213** | 0.220** |
 | (0.136) | (0.151) | (0.190) | (0.096) | (0.108) |
N | 91 | 91 | 314 | 314 | 314 |
FFvII | 5.279*** | 5.839*** | 0.607*** | 0.509*** | 0.021 |
 | (0.139) | (0.150) | (0.137) | (0.071) | (0.080) |
N | 66 | 66 | 3,836 | 3,836 | 3,836 |
MIvMM | 4.836*** | 5.693*** | 0.180 | 0.027 | 0.129 |
 | (0.123) | (0.148) | (0.157) | (0.097) | (0.120) |
N | 61 | 61 | 276 | 276 | 276 |
IMvII | 4.683*** | 5.174*** | −0.023 | 0.243*** | 0.051 |
 | (0.099) | (0.093) | (0.089) | (0.056) | (0.075) |
N | 138 | 138 | 3,980 | 3,980 | 3,980 |
MMvII | 4.822*** | 5.348*** | 0.133 | 0.201*** | −0.021 |
 | (0.105) | (0.111) | (0.102) | (0.063) | (0.076) |
N | 77 | 77 | 3,858 | 3,858 | 3,858 |
MFvMM | 4.945*** | 5.720*** | 0.248 | 0.171* | −0.032 |
 | (0.230) | (0.183) | (0.212) | (0.103) | (0.174) |
N | 25 | 25 | 204 | 204 | 204 |
FMvFF | 5.327*** | 5.510*** | −0.141 | −0.096 | −0.284 |
 | (0.211) | (0.180) | (0.221) | (0.129) | (0.209) |
N | 20 | 20 | 172 | 172 | 172 |
JoinersI (MI&FIvFF&MM) | 4.999*** | 5.728*** | 0.060 | 0.005 | 0.067 |
 | (0.113) | (0.132) | (0.142) | (0.084) | (0.088) |
N | 101 | 101 | 488 | 488 | 488 |
LeaversI (MF&IFvII&MM) | 4.787*** | 5.608*** | 0.241** | 0.315*** | 0.160** |
 | (0.118) | (0.125) | (0.114) | (0.058) | (0.075) |
N | 116 | 116 | 4,090 | 4,090 | 4,090 |
JoinersM (FM&IMvFF&MM) | 4.764*** | 5.216*** | −0.313*** | −0.091 | 0.004 |
 | (0.092) | (0.084) | (0.116) | (0.069) | (0.090) |
N | 158 | 158 | 602 | 602 | 602 |
LeaversM (MF&MIvFF&MM) | 4.868*** | 5.701*** | −0.019 | −0.061 | 0.034 |
 | (0.109) | (0.117) | (0.132) | (0.076) | (0.099) |
N | 86 | 86 | 458 | 458 | 458 |
Notes: This table gives means and estimates of the effect of transitioning as a household to/from informality. Groups are defined by their state across the transition gap: a household that is always formal is labelled FF, always informal II, and always mixed MM, with switchers labelled by the corresponding permutations. Columns (1) and (2) provide the raw means for each portfolio group in the respective period. Columns (3) and (4) provide the gap relative to their respective ‘control groups’, estimated as a simple intercept shift using OLS. For Columns (4) and (5), the estimates are conditional on household size, share of secondary schooling, and ‘real-time’ fixed effects. Column (5) is estimated using a household fixed effects model. The data are stacked on a dimensionless ‘transition time’, that is, the gap in time between period 0 and period 1; naturally, this duplicates observations in ‘real time’ for wave 2 in 2012. Standard errors are clustered at the household level.
Source: Authors’ compilation based on Nigeria 2010–2015 data.
The second panel of Table 1 (b, composite transitions) provides estimates for composite groups that transition in the same ‘direction’, together with composite control groups that may serve as suitable counterfactuals. As before, we report pre- and post-transition means (Columns 1 and 2), unconditional gaps (Column 3), conditional gaps (Column 4), and a TWFEs regression (Column 5). We first consider those joining informality, that is, those moving into a fully informal portfolio (MI and FI movers). These are compared with households that always remained in the formal sector, either fully formal or mixed. Notably, for this comparison, those transitioning are no worse off after the transition than the stayers. Similarly, we contrast those leaving informality, both from mixed portfolio households and fully informal households, with those who are always informal and always mixed. Here we detect an increase in household consumption as a consequence of the transition: the estimate suggests a 16% increase in consumption for those households that become fully formal.
We further consider the effects of diversifying a household’s portfolio (joining M) compared with leaving a mixed portfolio. For both of these we do not observe any differences under the TWFEs approach, which should not be surprising.
In a final step, we aim to test whether our findings from the cross-sectional analysis also hold in the longitudinal context. We consider six possible changes to a household’s portfolio, deriving these hypotheses from the cross-sectional relationships estimated from Equation (1):
H1 A household becomes informal, that is, they change their portfolio from full formality to full informality. Switchers of interest: Fully formal household switching to fully informal. From the cross-sectional estimates, this is expected to be a welfare-decreasing move.
H2 A household formalizes, that is, they change their portfolio from full informality to full formality. Switchers of interest: Fully informal household switching to fully formal. From the earlier estimates, this is expected to be a welfare-increasing move.
H3 A household diversifies its income portfolio moving from a fully formal to a mixed one. Switchers of interest: Fully formal household switching to mixed portfolio. Our earlier exercise suggests that this change should be welfare preserving.
H4 A household diversifies its income portfolio moving from a fully informal to a mixed one. Switchers of interest: Fully informal household switching to mixed portfolio. Our prior estimates suggest this will be a welfare-increasing move.
H5 A diversified household collapses its portfolio to full formality. Switchers of interest: Mixed portfolio household switching to fully formal. Similar to H3, our estimates imply this change will be welfare preserving.
H6 A diversified household collapses its portfolio to full informality. Switchers of interest: Mixed portfolio household switching to fully informal. As above, this change is expected to be welfare preserving.
Recent advances in the difference-in-difference literature have shown that there are concerns about recovering unbiased estimates in the presence of differential treatment timing when estimating treatment effects using the traditional TWFEs estimator (see e.g. Sun and Abraham, 2021; Goodman-Bacon, 2021; Callaway and Sant’Anna, 2021). We sidestep the issues of differential treatment timing raised by Goodman-Bacon (2021) by focusing on solely estimating a 2 × 2 TWFE estimate, that is, a two-group comparison over two periods. We then estimate Equation (2) as follows:
$$\ln(\text{Expenditure}_{it}) = \alpha + \beta\,\text{Switcher}_{i} + \gamma\,\text{Post}_{t} + \delta\,(\text{Switcher}_{i} \times \text{Post}_{t}) + X_{it}'\theta + \varepsilon_{it} \qquad (2)$$
where $\ln(\text{Expenditure}_{it})$, the log of total expenditure of household i in time t, is predicted by a time-invariant indicator of whether a household changes its income portfolio between t − 1 and t (Switcher), a dummy indicating the second time period in which a household is observed (Post), and the interaction of both indicators. δ is the coefficient of interest, showing the average treatment effect on the treated (ATT) of the household’s change in status on its welfare; α, β, γ, and θ denote the remaining coefficients and $\varepsilon_{it}$ the error term. We also control for a vector of time-varying household characteristics, X (viz. household size and share with secondary schooling). Further controls used in the cross-sectional case, such as land ownership and female headship, were considered but ultimately deemed to be potential sources of collider bias in the dynamic setup, as their parameters would be identified only from households whose land ownership or headship changes. Such changes would likely correlate with the switch in household informality status and would therefore be expected to bias the estimates of the effect of interest.
With this setup, we automatically eliminate any time-invariant unobserved household characteristics that determine a household’s welfare outcomes. However, the issue of selection remains. Suppose we take households that are fully informal in t − 1, define as switchers those who change their income portfolio to mixed in t, and compare them with those who remained fully informal. It is likely that those who remain fully informal are fundamentally different from the switchers, for example, due to higher risk aversion. Falco (2014) showed that risk aversion strongly predicts the choice between informal and formal occupations for urban Ghanaians. We would thus not compare like with like. Therefore, we design our control group more carefully by exploiting the variation in the timing of portfolio switching. The Nigerian data cover three time periods, so one can identify up to two switching events for each household. We can additionally identify those households that never change their informality portfolio. We do not believe that these serve as a good control for our switchers, and thus we exclude them from our estimation. Instead, our treatment group comprises those households that switch between the first two periods (‘early switchers’). We compare their outcomes with those of households that make the same move but only between the last two periods (‘late switchers’). We consider the latter a valid control only in the time period prior to their move, because we wish to net out the underlying unobservable characteristics driving the switching behaviour whilst at the same time precluding the accrual of potential gains or losses from switching. As a result, when stacking to the 2 × 2 we discard the later observations for the late switchers and thus restrict our analysis to changes in the first possible transition period. We believe this is the most robust choice for exploring employment transition dynamics in an endogenous decision-making process: by comparing households that are making a transition with those that will make the same transition in the next period, we effectively net out the selection issue, as both sets of households should share the unobservable characteristics that drive the move.
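The early-versus-late switcher design can be illustrated with a minimal sketch. It is not the authors' implementation: it computes only the unconditional 2 × 2 difference-in-differences from cell means, omitting the controls X and the clustered standard errors, and the function names, field names (`status`, `ln_exp`), and toy numbers are all hypothetical.

```python
from statistics import mean


def classify_switcher(history, origin, dest):
    """Label a three-wave status history as 'early', 'late', or None."""
    w1, w2, w3 = history
    if w1 == origin and w2 == dest:
        return "early"  # switches between waves 1 and 2: treatment group
    if w1 == origin and w2 == origin and w3 == dest:
        return "late"   # switches between waves 2 and 3: control before its move
    return None         # never-changers and all other paths are excluded


def build_2x2(households, origin, dest):
    """Stack the first two waves of early and late switchers into 2x2 rows."""
    rows = []
    for hh in households:
        group = classify_switcher(hh["status"], origin, dest)
        if group is None:
            continue
        # Only waves 1 and 2 enter; the late switchers' post-move wave is discarded.
        for post in (0, 1):
            rows.append({
                "switcher": int(group == "early"),
                "post": post,
                "ln_exp": hh["ln_exp"][post],
            })
    return rows


def did_2x2(rows):
    """Unconditional difference-in-differences from the four cell means."""
    def cell(switcher, post):
        values = [r["ln_exp"] for r in rows
                  if r["switcher"] == switcher and r["post"] == post]
        if not values:
            raise ValueError("empty cell in the 2x2 design")
        return mean(values)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))


# Toy data: one early I->F switcher, one late I->F switcher, one never-changer.
toy = [
    {"status": ("I", "F", "F"), "ln_exp": (4.0, 5.0, 5.0)},
    {"status": ("I", "I", "F"), "ln_exp": (4.0, 4.5, 5.0)},
    {"status": ("I", "I", "I"), "ln_exp": (4.0, 4.2, 4.4)},
]
stacked = build_2x2(toy, "I", "F")
delta_hat = did_2x2(stacked)
```

In this toy example the never-changer is dropped, and the late switcher contributes only its two pre-move observations, mirroring the sample restriction described in the text.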
To illustrate how comparable early and late switcher households are in our sample, we present pre-switching balance statistics in Table 2. Each block of rows shows the balance tests for the sample related to one specific hypothesis test. The table demonstrates that we compare statistically similar households based on the observable characteristics which we use as controls; the only significant difference is in household size for the switchers from fully informal to fully formal (H2). We note, however, that the sample is very small when investigating switchers from formal to informal or mixed-income portfolios, especially for the early switchers. To confirm whether the estimated effects are true or a result of low power, we will conduct randomization inference of the point estimates we obtain from the TWFEs as a robustness check.
Variable | Early switchers: N | Early switchers: Mean | Late switchers: N | Late switchers: Mean | T-test: Difference |
---|---|---|---|---|---|
Fully formal switched to fully informal (H1) | | | | | |
Household size | 10 | 5.400 | 66 | 6.530 | −1.130 |
 | | (0.859) | | (0.554) | |
Dependency ratio | 10 | 0.722 | 66 | 0.673 | 0.049 |
 | | (0.229) | | (0.084) | |
Share in household with secondary school | 10 | 0.000 | 66 | 0.022 | −0.022 |
 | | (0.000) | | (0.013) | |
Fully informal switched to fully formal (H2) | | | | | |
Household size | 122 | 6.139 | 62 | 4.903 | 1.236** |
 | | (0.348) | | (0.364) | |
Dependency ratio | 122 | 0.828 | 62 | 0.731 | 0.098 |
 | | (0.069) | | (0.097) | |
Share in household with secondary school | 122 | 0.015 | 62 | 0.014 | 0.000 |
 | | (0.007) | | (0.008) | |
Fully formal switched to mix (H3) | | | | | |
Household size | 6 | 5.333 | 34 | 5.765 | −0.431 |
 | | (0.422) | | (0.532) | |
Dependency ratio | 6 | 0.792 | 34 | 0.995 | −0.203 |
 | | (0.198) | | (0.147) | |
Share in household with secondary school | 6 | 0.000 | 34 | 0.005 | −0.005 |
 | | (0.000) | | (0.004) | |
Fully informal switched to mix (H4) | | | | | |
Household size | 192 | 6.109 | 82 | 6.012 | 0.097 |
 | | (0.189) | | (0.419) | |
Dependency ratio | 192 | 0.852 | 82 | 0.752 | 0.100 |
 | | (0.055) | | (0.070) | |
Share in household with secondary school | 192 | 0.014 | 82 | 0.018 | −0.004 |
 | | (0.005) | | (0.007) | |
Mix to fully formal (H5) | | | | | |
Household size | 8 | 7.750 | 40 | 6.725 | 1.025 |
 | | (1.521) | | (0.374) | |
Dependency ratio | 8 | 1.058 | 40 | 0.767 | 0.292 |
 | | (0.287) | | (0.108) | |
Share in household with secondary school | 8 | 0.018 | 40 | 0.003 | 0.015 |
 | | (0.018) | | (0.003) | |
Mix to fully informal (H6) | | | | | |
Household size | 26 | 5.808 | 102 | 6.206 | −0.398 |
 | | (0.668) | | (0.278) | |
Dependency ratio | 26 | 0.514 | 102 | 0.737 | −0.224 |
 | | (0.131) | | (0.070) | |
Share in household with secondary school | 26 | 0.011 | 102 | 0.017 | −0.006 |
 | | (0.011) | | (0.007) | |
Notes: The values displayed for the t-tests are the differences in the means across the groups. Standard errors are below in parentheses. ***, **, and * indicate significance at the 1%, 5%, and 10% critical level. The groups are defined as in the text, such that each cell represents the mean of the variable for early switchers, that is, those who switch between 2010 and 2012, and for ‘late switchers before their move’, that is, the 2010 and 2012 realizations for those who make the switch between 2012 and 2015.
Source: Authors’ elaboration using Nigeria data for 2010–12.
Table 3 reports the estimates of Equation (2); each cell represents the estimate of the δ parameter. The purpose of this estimation strategy is to ensure that any unobserved heterogeneity which may bias the estimates is cancelled out, as we compare those who transition with those who will transition in the same direction in the following wave, observed prior to their transition. This gives the ATT, conditioning out the potentially endogenous unobservables that influence a household’s decision to change its formality status. It is worth remarking that, once this is netted out, only those households moving wholesale from fully informal to fully formal are found to be significantly better off in terms of their consumption. The estimates imply a substantial increase in consumption of 39.5% after the move into full formality. It is worth bearing in mind that the definition of informality we use is based on social insurance measures. Thus, the estimates are consistent with the view that households’ welfare may be improved by their access to health coverage and pensions. Access to these systems therefore gives households a liquidity increase which can then be used for consumption. Notably, however, these gains appear to accrue only to the extreme move from a fully informal to a fully formal portfolio, and there are no changes to households’ probability of being poor (see Supplementary Appendix B).
LHS | ln(Total Expenditure) |
---|---|
Fully formal switched to fully informal (H1) | 0.144 |
 | (0.447) |
N | 76 |
Fully informal switched to fully formal (H2) | 0.395*** |
 | (0.148) |
N | 184 |
Fully formal switched to mix (H3) | 0.187 |
 | (0.284) |
N | 40 |
Fully informal switched to mix (H4) | 0.031 |
 | (0.158) |
N | 274 |
Mix to fully formal (H5) | −0.273 |
 | (0.424) |
N | 48 |
Mix to fully informal (H6) | −0.084 |
 | (0.224) |
N | 128 |
Notes: Each cell in this table represents the estimate of the δ parameter from Equation (2). The functional form presented controls for year-fixed effects, household size, dependency ratio, and the share of household members with secondary schooling. Standard errors are clustered at the household level. ***, **, and * indicate statistical significance at 1%, 5%, and 10%, respectively.
Source: Authors’ compilation based on Nigeria 2010–15 data.
It is worth comparing the estimates from this approach with those obtained from the earlier exercise reported in Table 1. We find that the only households that see any change in their welfare are those that formalize across the two time periods. These estimates may be compared with those in Column (5) for the group IFvII in Table 1. There we found the naive difference-in-differences gain from becoming fully formal to be an increase in consumption of 22% when compared with those households who are always informal. Conversely, in our preferred approach, wherein we exploit the additional information from the third wave of the Nigerian data and select a more appropriate control group which allows us to net out the selection on unobservable characteristics, we find that the naive difference-in-differences underestimates the welfare effect by nearly half (0.220 versus 0.395), suggesting that there is substantial heterogeneity among fully informal households in terms of their welfare.
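As a reading aid for these magnitudes, the short sketch below converts the two coefficients quoted above (0.220 from Column (5) of Table 1 and 0.395 from Table 3) from log points into exact percentage changes and computes their ratio. It assumes only the standard property of a log-linear specification, namely that a coefficient b on a dummy implies a proportional change of exp(b) − 1; reading b × 100 as a percentage is an approximation that becomes looser as b grows. The function name is hypothetical.

```python
import math


def log_points_to_percent(coef: float) -> float:
    """Exact percentage change implied by a coefficient in a log-outcome model."""
    return (math.exp(coef) - 1.0) * 100.0


naive_did = 0.220   # IFvII, Column (5) of Table 1
preferred = 0.395   # H2, Table 3

naive_pct = log_points_to_percent(naive_did)      # about 24.6
preferred_pct = log_points_to_percent(preferred)  # about 48.4

# Share of the preferred estimate that the naive estimate fails to capture.
shortfall = 1.0 - naive_did / preferred           # about 0.44
```

Under this conversion the gap between the naive and the preferred estimate is, if anything, somewhat larger in percentage terms than in log points.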
However, it is a stark finding that for every other direction of transition, no effects are detected on household expenditure. It is worth bearing in mind that some of the cells of switchers are very small, which puts inferences drawn from these estimates under suspicion and calls for closer inspection in the next section.
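As a rough guide to magnitudes, a coefficient on a binary indicator in a log-expenditure regression can be converted into a percentage change with the standard log-linear transformation. The sketch below applies it to the H2 estimate in Table 3. It assumes $\hat{\delta}$ is read as a log-point difference, and it is not stated here whether the percentages quoted in the text were computed this way.

$$
100\left(e^{\hat{\delta}} - 1\right) = 100\left(e^{0.395} - 1\right) \approx 48.4\%
$$

For small coefficients the log-point value itself is a close approximation, but at 0.395 the two readings (roughly 40% versus 48%) differ noticeably.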
4.2 Robustness: randomization inference and number of jobs
A concern that may be raised with respect to our strictest estimation sample groups is that we are attempting to draw inferences from small sample sizes. Inherently, the lack of statistical significance in the estimated effects may simply be due to the lack of statistical power available in the data.
In order to mitigate this concern, we adopt randomization inference to obtain the null set bounds from the Fisher exact test (Fisher, 1935) as recently suggested by Young (2019). We implement it using ritest by Simon (2017) in Stata. The underlying intuition behind the approach is simple. We do not know the sampling distribution of our point estimate, but we can estimate the exact bounds under the null hypothesis that the true effect is exactly zero. To do so, we randomize a notional treatment constrained to the same proportion as one of our switches, and then estimate a regression with the same functional form as in specification (2). As the switch is randomly allocated, it carries no information and any point estimate obtained will be spurious. This procedure is repeated 1,000 times. We then take the 1,000 point estimates for our spurious treatments and construct an exact p-value for our estimate being in the null set. This is given as the ratio of the number of estimates whose values are at least as extreme as the one we estimated with the real transition in Table 3 to the number of randomized permutation regressions estimated. Note that these bounds may be made arbitrarily narrow by increasing the number of spurious regressions permuted. One can obtain the 95% exact internal null bounds of the treatment by estimating the 5th and 95th percentiles of the permuted null estimates.
Figure 3 reports the results of this exercise. The point estimate is unchanged and is shown in red, but we now also report the point estimates from each of the 1,000 replications of the exercise, so that one can see where the mass of null estimates lies. Our results stand: there are no significant differences in welfare across any group as a result of their transition, with the exception of the group moving from full informality to full formality (I→F).
This stark finding suggests that households are able to effectively hedge their positions by transitioning.

Figure 3. Difference-in-differences estimates for switchers, Nigeria 2010–15, with randomization inference.
Notes: The graphs plot coefficients and confidence intervals from Equation (2) as solid circles. Each point represents a different hypothesis regression outcome. The initials F, I, and M represent Fully Formal, Fully Informal, and Mixed portfolios, respectively, and the right arrow represents the direction of the switch. So, for example, the first estimate I→F is the switch from fully informal to fully formal. Each of the hollow circles represents a point estimate from the randomization inference exercise. The mass of hollow (blue) points clusters along what can be considered the null bounds of the regression, so if a solid (red) point estimate lies in this area, it can be inferred that the point estimate is a true null effect. Conversely, if the solid circle point estimate lies outside the mass of hollow circle estimates, the point estimate can be said to be different from zero.
Source: Authors’ compilation.
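The permutation logic described above can be illustrated with a minimal sketch in Python. This is not the authors' Stata implementation with ritest: it uses simulated data, a single cross-section with one hypothetical covariate (`hh_size`), and plain OLS, so it omits the panel structure, year-fixed effects, and clustering of Equation (2). All variable and function names are assumptions made for illustration.

```python
import numpy as np

def ols_coef(y, d, X):
    """Return the OLS coefficient on d from a regression of y on [1, d, X]."""
    Z = np.column_stack([np.ones(len(y)), d, X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[1]

def randomization_inference(y, d, X, reps=1000, seed=0):
    """Compare the observed coefficient with coefficients from permuted treatments."""
    rng = np.random.default_rng(seed)
    observed = ols_coef(y, d, X)
    # Permuting d keeps the share of "switchers" fixed but breaks any link to y.
    draws = np.array([ols_coef(y, rng.permutation(d), X) for _ in range(reps)])
    # Two-sided p-value: share of placebo estimates at least as extreme as observed.
    p_value = np.mean(np.abs(draws) >= abs(observed))
    return observed, draws, p_value

# Simulated (hypothetical) data: 200 households, 60 of which "switch".
rng = np.random.default_rng(42)
n = 200
hh_size = rng.integers(1, 9, size=n).astype(float)
d = np.zeros(n)
d[:60] = 1.0
y = 10.0 + 0.4 * d - 0.05 * hh_size + rng.normal(0.0, 0.5, size=n)

observed, draws, p_value = randomization_inference(y, d, hh_size)
# 5th and 95th percentiles of the placebo estimates, as described in the text.
lower, upper = np.percentile(draws, [5, 95])
print(f"observed = {observed:.3f}, p = {p_value:.3f}, "
      f"null bounds = [{lower:.3f}, {upper:.3f}]")
```

An observed estimate lying outside the band spanned by the placebo estimates corresponds to a solid circle lying outside the mass of hollow circles in Figure 3.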
Another concern, as mentioned before, is that the change in formality portfolio is directly related to a change in the number of jobs within a household and thus mechanically leads to higher incomes. While we controlled for such characteristics in the cross-sectional setting, we did not in the dynamic analysis due to concerns of collider bias. Instead, we replicate the estimates from Table 3 for sub-samples in which we hold the number of jobs constant across groups (1–2 jobs, 3–4 jobs, and 5+ jobs) and maintain our strategy of using households who will be making the same switch the following period as the controls. Results are presented in Supplementary Appendix Table D1. For the group of 1–2 jobs, we find qualitatively similar effects to the main results. The point estimates for log total expenditure still imply a substantial gain of 33% for those households moving from informality to full formality.
5. Conclusions
We propose discretized measures of informality at the household level and we assess how they relate to the welfare of urban households in five Sub-Saharan African countries. These measures, based on the employment or income portfolios, offer researchers a way to capture employment diversification at the household level.
First, using cross-sectional data, we describe how the depth of informality correlates with consumption levels, explicitly comparing it to a simplified measure of the household head’s formality status. The results point to important welfare effects of social insurance access. A small share of formal income can make a household as well off as a fully formally earning household. Using only the formality status of the household head obscures such diversification.
Second, we explore how changes in the depth of informality over time influence urban household welfare, by exploiting three waves of panel data for Nigeria. This approach allows us to control for unobserved characteristics and selection. The headline estimates are found to be approximately half of those found from the cross-sectional regressions. When breaking down these estimates and applying a TWFE estimation strategy, we find few significant effects on household welfare from changes between formality profiles. The only significant results come from non-switching households (i.e., comparing always informal to always formal portfolios). Although it refers to a short time period, the analysis suggests that unobserved characteristics may be driving the cross-sectional estimates. Furthermore, controlling for selection on unobservables, we find no evidence of any household welfare effects aside from an increase in consumption for those moving from a fully informal to a fully formal portfolio. Our results capture the ATT (and not the ATE), suggesting that for the marginal household to change its activity profile, the welfare gain must be substantial. These observations confirm that selection is driving a large part of the naive TWFE results, which had not accounted for unobservables.
Welfare is thus the same for households with mixed portfolios regardless of the depth of informality due to social insurance access. Although not possible to test with the available data, this is suggestive of the potential for social insurance to reach beyond the direct beneficiary, so that the whole household gains welfare from it. The results complement the literature inspecting the role of institutionalization of protective measures for the informal economy and the literature arguing for both vertical and horizontal extensions of social insurance systems (Behrendt and Nguyen, 2019). From a policy perspective, the approach of targeting the formalization of the marginal worker may not result in perceivable welfare effects. Policies catering to a wider coverage of informal workers should instead be considered. For instance, the extension of social insurance coverage toward informal workers should be discussed as a policy tool for urban SSA.
Lastly, our article points to some interesting avenues for future research. First, urban areas in SSA have a particular labour market setting and we acknowledge only capturing a single point in time for our comparative analysis. Future research should inspect longitudinal data for more countries in the region to further identify trends in the depth of informality and the impact of social insurance coverage on household welfare. Second, the article does not aim to approve or reject a household income pooling hypothesis, which could be a topic for further research. Moreover, our longitudinal analysis accounts for transitions from one household employment status to another, without directly modelling intra-household dynamics and assuming a neoclassical model of common preferences (Thomas, 1990). We could expect that heterogeneity in outcomes may be partially explained by intra-household bargaining decisions, so future research ought to account for the nuances in household dynamics as in Tiefenthaler (1999) for Brazilian couples. Third, though beyond the scope of our analysis, the move toward a portfolio with greater formality may be accompanied by other benefits for the household such as risk mitigation, which could provide a promising avenue for further investigation. Fourth and last, we find no welfare gain from switching in a dynamic setting for urban Nigeria. This result aligns with findings for Mexico by Azuara and Marinescu (2013) that free health insurance provision for informal workers does not incentivize switches unless the wage gain is substantial for the marginal worker. Similarly, easing self-employment registration in Brazil has no effect on formalization, but reducing the tax burden does (Rocha et al., 2018). We could infer that some of our results might be driven by such strategic behaviour around tax evasion, so as to maximize income among the informal self-employed, calling for future research on the effect of tax avoidance on household behaviour.
In line with recent studies finding discrepancies between observational and experimental data when exploring endogenous decisions such as internal temporary migration (Lagakos et al., 2020; Baseler, 2022), more experimental research on social insurance use or tax avoidance prevention would be welcome to accompany the findings of our study.
Footnotes
The literature on informality also considers firm characteristics (size, productivity or tax registration) to define formality (Maloney, 1999; Perry et al., 2007; La Porta and Shleifer, 2008, 2014; Hsieh and Olken, 2014; Meghir et al., 2015; Allen et al., 2018; Ohnsorge and Yu, 2021). However, formal firms (in terms of size and registration) often also hire informal workers without social insurance subscriptions (Levy, 2008; Ulyssea, 2018). Our definition does not consider firm characteristics.
The LSMS-ISA datasets are nationally representative, cross-sectional, and longitudinal surveys conducted by the World Bank in collaboration with the national statistical offices.
In all countries, 80% of the households in rural areas have no formal income at all, so that our analysis would be severely constrained by sample size issues and would also seem less relevant in such settings.
The latest wave of the Nigerian panel, from 2018, does not allow for consistent variable definitions and is therefore excluded.
It is worth noting that, as we report the unconditional means for all transition groups, it is possible for the reader to estimate a naive differences-in-differences for any comparison group, if preferred over the ones we have selected.
Supplementary material
Supplementary material is available on the OUP website. It comprises the data, the replication files, and the online appendix. Data and related information are publicly available through the World Bank LSMS-ISA project website. Data and Stata replication code are also available here.
Funding
The authors received support from UNU-WIDER for the publication in Open Access of this article. No further financial support was received for the research and/or authorship of this article.
Acknowledgements
The authors would like to thank Kalle Hirvonen, Kunal Sen, Ira Gang, and Jann Ley, participants at the 2021 Globalization and Development conference, the UNU-WIDER Informal Work workshop, IFPRI Ethiopia brown bag seminar and the EU-AFD Research Facility on Inequalities seminar at Addis Ababa University for useful comments. The work has also benefited from the valuable inputs of several anonymous referees, which the authors highly appreciate. This article solely expresses the views of the authors and does not reflect the official position of AFD, UNU-WIDER, or other affiliated institutions.