ABSTRACT
This paper investigates the returns to women’s education by applying an optimal IV selection approach, post-Lasso IV estimation, which improves the first-stage predictive relationship between an endogenous regressor and its instruments. Using the 2010 American Community Survey, we find that an extra year of education increases married women’s own income by $4,480 and spouse income by $8,822. Our findings indicate that 53% of the education-induced increase in women’s consumption is attributable to the marriage market; we therefore conclude that the marriage market is the primary channel through which education improves women’s well-being. The results also demonstrate the advantages of the post-Lasso approach: the resulting two-stage least squares estimator maintains efficiency without increasing finite-sample bias and is less subject to inconsistency when some instruments are invalid. This contrasts with results obtained using only quarter of birth as the instrument, the choice most common in studies of returns to education.
Acknowledgments
We would like to thank Youjin Hahn, Bonsoo Koo, Francesca Molinari, Liang Choon Wang, Ou Yang, and seminar participants at the Asian Meeting of Econometric Society and Monash University for helpful comments. All remaining errors are the responsibility of the authors.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 The post-Lasso IV approach has yet to be applied in empirical research, with the exception of Gilchrist and Sands (2016), who investigate the spillover effect of first-week viewers on movie consumption using Lasso-chosen weather-shock instruments.
2 For a recent example using birth date as an instrument, see Leigh and Ryan (2008).
3 In this study, we mainly use the 2010 data rather than the 2016 data for two reasons. First, the return to education for married women is statistically insignificant in the 2016 data. Second, the effect of education on the probability of marriage is significant in 2016, so there may be systematic differences between married and single women. We discuss these issues in Section 6.1.
4 Data were extracted from the Integrated Public Use Microdata Series website: https://usa.ipums.org/usa (Ruggles et al. 2015).
5 The discount factor captures both the sharing rule within a household and economies of scale.
6 We also estimate the log income equation and find that the results are similar.
7 Although the term ‘instrument’ includes the excluded instruments and the exogenous regressors , we use the term mostly to denote the outside instrument .
8 We also apply other variable selection methods, such as forward and backward stepwise selection; the corresponding 2SLS estimators have standard errors two to three times larger than those of the post-Lasso IV estimator. The results are available upon request.
9 The norm of an by 1 vector is denoted by .
10 Section 4.2 in Belloni et al. (2012) covers the case in which all instruments are individually weak.
11 The non-monotonicity due to the second birth quarter leads to the poor performance of three-IV estimation, which is discussed in more detail in the next subsection.
12 The first-stage regression results are available upon request.
13 Sakellariou and Fang (2016) use different combinations of IVs for different samples to improve the first-stage prediction when estimating returns to schooling.
14 All Lasso instruments pass the over-identification test (i.e. the Sargan test).
15 We use the consumer price index inflation calculator provided by the U.S. Bureau of Labor Statistics. See http://www.bls.gov/data/inflation_calculator.htm.
16 We use the term ‘pseudo’ because some relevant IVs are, by coincidence, still contained in the set of randomly chosen instruments.