Original Articles

Decomposing joint distributions via reweighting functions: an application to intergenerational economic mobility


Abstract

We introduce a method that extends the traditional Oaxaca-Blinder decomposition both to the full distribution of an outcome of interest and to settings where group membership varies along a continuum. We achieve this by working directly with the joint distribution of outcome and group membership and comparing it to an independent joint distribution. As in all decompositions, we assume the difference is partially due to differences in characteristics between groups (a composition effect) and partially due to differences in returns to characteristics between groups (a structure effect). We use reweighting functions to estimate a counterfactual joint distribution representing the hypothetical in which characteristics do not vary by group while returns to characteristics do. The counterfactual allows us to decompose differences between the empirical and independent distributions into composition and structure effects. We demonstrate the method by decomposing multiple measures of immobility for white men in the U.S.

JEL Classification:

Acknowledgments

The views expressed in this paper are our own and do not reflect the views of the Office of the Comptroller of the Currency, the US Department of the Treasury, or any federal agency. The views in the paper do not establish supervisory policy, requirements, or expectations. We would like to acknowledge useful feedback and comments from seminar participants at the Office of the Comptroller of the Currency Economics Department, the editor, Esfandiar Maasoumi, as well as an associate editor and three anonymous referees. All errors are our own.

Notes

1 See Black and Devereux (2011) for a fairly extensive review of the decomposition literature in economics.

2 More specifically, our decomposition methodology is wholly concerned with joint distributions, as is reflected in the model notation. Our application, however, is concerned with understanding differences in joint distributions and relies on conditional distribution functions (e.g., quantile and mean bivariate regressions). So while the mechanics of estimating our counterfactual operate on joint distributions directly, pattern extraction from the joint distributions operates on conditional distribution functions within our application.

3 Our actual focus, like much of the mobility literature, is log earnings; we refer to ‘income’ for short.

4 For every distributional feature we investigate (various slope parameters of mean and quantile regressions), the corresponding parameter for the independent joint distribution is zero, so for our purposes ν(f(yc)f(yp))=0.
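As an illustration of why the independent benchmark is zero, consider the mean-regression slope; the same logic applies to quantile-regression slopes, since the conditional quantiles of yc do not change with yp under independence. The second line sketches how the total difference then splits into the two effects named in the abstract. The symbols β and f^C (the counterfactual joint density) are notation assumed here for illustration and may differ from the paper's own.

```latex
\begin{align*}
\beta\bigl(f(y_c)f(y_p)\bigr)
  &= \frac{\operatorname{Cov}(y_c, y_p)}{\operatorname{Var}(y_p)} = 0
  \quad \text{when } y_c \text{ and } y_p \text{ are independent}, \\
\nu\bigl(f(y_c, y_p)\bigr) - \nu\bigl(f(y_c)f(y_p)\bigr)
  &= \underbrace{\nu\bigl(f(y_c, y_p)\bigr) - \nu\bigl(f^{C}(y_c, y_p)\bigr)}_{\text{composition effect}}
   + \underbrace{\nu\bigl(f^{C}(y_c, y_p)\bigr) - 0}_{\text{structure effect}}
\end{align*}
```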

5 We use the KernSmooth package in R for f(yp) and the ‘np’ package for f(yp|x).
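The following is a minimal sketch of how the two density estimates might combine into a reweighting function and a decomposition of a mean-regression slope. It rests on several assumptions that are not taken from the paper: the weight is assumed to have the form f(yp)/f(yp|x), Gaussian densities stand in for the kernel estimators (KernSmooth and np in R), the data are synthetic, and all function names are hypothetical.

```python
import numpy as np


def gaussian_pdf(z, mean, sd):
    """Normal density, used here as a stand-in for a kernel density estimate."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def reweighting_function(yp, X):
    """Assumed weight f(yp) / f(yp | x), normalized to have mean one."""
    n = len(yp)
    # Marginal density of parental income (Gaussian assumption).
    f_marginal = gaussian_pdf(yp, yp.mean(), yp.std())
    # Conditional density of parental income given x:
    # linear regression with Gaussian residuals (assumption).
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, yp, rcond=None)
    fitted = Z @ beta
    f_conditional = gaussian_pdf(yp, fitted, (yp - fitted).std())
    w = f_marginal / f_conditional
    return w / w.mean()


def weighted_slope(yc, yp, w):
    """Slope of a weighted least-squares regression of yc on yp."""
    yp_bar = np.average(yp, weights=w)
    yc_bar = np.average(yc, weights=w)
    cov = np.average((yp - yp_bar) * (yc - yc_bar), weights=w)
    var = np.average((yp - yp_bar) ** 2, weights=w)
    return cov / var


def decompose_slope(yc, yp, X):
    """Split the observed slope into composition and structure effects."""
    w = reweighting_function(yp, X)
    total = weighted_slope(yc, yp, np.ones_like(yp))
    # The counterfactual slope is the structure effect because the
    # slope under the independent benchmark is zero.
    structure = weighted_slope(yc, yp, w)
    composition = total - structure
    return {"total": total, "composition": composition, "structure": structure}


# Synthetic data: one characteristic x that varies with parental income.
rng = np.random.default_rng(0)
n = 20000
yp = rng.normal(size=n)
x = 0.4 * yp + rng.normal(size=n)
yc = 0.2 * yp + 0.5 * x + rng.normal(size=n)

result = decompose_slope(yc, yp, x.reshape(-1, 1))
print(result)
```

In this synthetic design the direct return to parental income is 0.2 and the channel through x contributes 0.5 × 0.4 = 0.2, so the sketch should attribute roughly half of the slope to each effect. With real data the Gaussian stand-ins would be replaced by nonparametric density estimates.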

6 As with all nonparametric estimators, dimensionality can be a concern for the Hall et al. (2004) method. However, some work, particularly Izbicki and Lee (2016), shows that the method performs rather well in simulation exercises based on sample sizes very similar to that of our application (n = 1,000 compared to our n = 1,357), though the computational time is shown to be rather large. Moreover, our application is based on x of dimension six, whereas their simulations go as high as twenty. Thus, while this limitation should be noted, there is some evidence on the stability of the estimator indicating that it should not be an overwhelming concern.

7 We note here that this reweighting approach is not the only way one may go about estimating such counterfactuals, much as there are several approaches to estimating counterfactuals in the traditional discrete group decomposition literature (Chernozhukov et al., 2013; DiNardo et al., 1996; Firpo et al., 2007; Machado and Mata, 2005; Rothe, 2015). In a previous version of this paper, under a different title, we proposed an alternative, more ‘brute force’ approach to the counterfactual (Richey and Rosburg, 2016). That approach paralleled Machado and Mata (2005) and took an ‘estimate-and-simulate’ approach that was very computationally intensive and required estimation of complex interactive conditional CDFs (i.e., F(yc|x,yp)). The approach provided here builds off the foundation of that paper, but provides what we believe is a simpler estimation procedure that avoids estimation of complex interactive conditional CDFs.

8 Note again, for all of our applications the parameter of interest for the independent distribution is zero.

9 See Fortin et al. (2011) for a detailed discussion of identifying assumptions in decomposition methods. Also note that ignorability is a less restrictive assumption than independence, which would require the unobservables to be independent of the covariates. Only ignorability is needed for identification of the structure-composition decomposition.

10 See Mazumder (2005) or Black and Devereux (2011) for an overview of this literature. The large range in the estimated IGE arises from a variety of data issues including life-cycle and measurement error biases (Böhlmark and Lindquist, 2006; Haider and Solon, 2006; Nybom and Stuhler, 2017).

11 See Corak and Heisz (1999) and Grawe (2004) for work along these lines.

12 The oversamples of military and poor whites were discontinued in 1984 and 1990, respectively.

13 We exclude individuals who lived with a spouse or child during these years. Measurement error in parental income is a common concern in the mobility literature. However, recent research indicates that some mobility measures are less susceptible to such errors relative to others (Nybom and Stuhler, 2017).

14 The literature on intergenerational mobility has identified the possibility of life-cycle biases in estimates depending on the age at which children are surveyed (Böhlmark and Lindquist, 2006; Haider and Solon, 2006; Nybom and Stuhler, 2017). However, this literature seems to indicate such biases are minimized or eliminated when youths reach their mid-30s.

15 Our measure of experience is very similar to, but slightly different from, the measure used by Regan and Oaxaca (2009).

16 It is not unexpected that our IGE estimate is on the lower end of the range reported in the existing literature (0.30 to 0.60). IGE estimates tend to be lower with shorter income averages (see Mazumder (2005) or Black and Devereux (2011) for discussions of measurement error as it relates to this issue), and we use a two-year average. Ideally, we would use a longer observation time frame, but doing so would sharply reduce our sample size and hinder our ability to carry out the decomposition.

17 While we have the actual parental income for each individual, the decomposition treats everyone in each quartile as simply a group member and thus it would be incorrect to reinsert this information to attempt actual regressions.
