Research Article

Decomposition of the gender wage gap using the LASSO estimator


ABSTRACT

We use the LASSO estimator to select among a large number of explanatory variables in wage regressions for a decomposition of the gender wage gap. LASSO selection with the one-standard-error rule removes about a quarter of the regressors. We then use the LASSO-selected regressors in OLS-based decompositions of the gender wage gap. This approach yields a smaller error variance than OLS without LASSO selection. The explained gender wage gap is 1 percentage point greater than in the conventional OLS model.
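As a rough illustration of the one-standard-error rule mentioned above, the sketch below picks the largest penalty whose cross-validated error lies within one standard error of the minimum. The function name `one_se_lambda` and all numbers are hypothetical and are not taken from the article; the cross-validation errors would normally come from fitting the LASSO on each fold.

```python
def one_se_lambda(lambdas, cv_mean, cv_se):
    """Return the largest penalty whose mean CV error is within
    one standard error of the minimum mean CV error."""
    # Index of the penalty with the lowest mean cross-validation error.
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # Errors up to this threshold count as statistically similar to the minimum.
    threshold = cv_mean[best] + cv_se[best]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    # A larger penalty means a sparser model, so take the largest eligible one.
    return max(eligible)


# Hypothetical cross-validation results (illustrative values only).
lambdas = [0.001, 0.005, 0.01, 0.05, 0.1]
cv_mean = [0.250, 0.244, 0.246, 0.252, 0.290]
cv_se = [0.010, 0.009, 0.009, 0.010, 0.012]

lam_min = lambdas[cv_mean.index(min(cv_mean))]  # error-minimising penalty
lam_1se = one_se_lambda(lambdas, cv_mean, cv_se)  # one-standard-error penalty
```

Because the one-standard-error penalty is never smaller than the error-minimising one, it tends to keep fewer regressors, which fits the abstract's description of a more parsimonious set of variables being passed on to OLS.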

JEL CLASSIFICATION:

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 For example, Bach, Chernozhukov, and Spindler (2018) use 4,382 variables and Briel and Töpfer (2020) use 5,821.

2 A statistical model with a coefficient vector that contains many zeros is called sparse (Hastie, Tibshirani, and Friedman 2009).

3 For example, Bach, Chernozhukov, and Spindler (2018) analyse the gender wage gap using data from the 2016 American Community Survey and use the double LASSO method to select among up to 4,382 regressors. Briel and Töpfer (2020) find that the size of the estimated pay gap differs substantially depending on the approach. See also Angrist and Frandsen (2019).

4 Miller (1984) discusses different algorithms for the subset selection technique. The algorithms either evaluate all subsets of the set of explanatory variables or use a heuristic to decide which subsets to evaluate. They usually choose the subset that results in the lowest sum of squared residuals (Tibshirani 1996).

5 The PSID does not clearly distinguish between different sources of income for farm-workers and the self-employed.

6 We assess the quality of the fit using the cross-validation-based LASSO residual sum of squares estimator (Fan, Guo, and Hao 2012). Although this estimator tends to be biased downwards, particularly for small values of λ (Fan, Guo, and Hao 2012), Reid, Tibshirani, and Friedman (2016) show that the bias is typically not large.

7 We also decompose the gender wage gap for estimates that do not include an indicator variable for sex in this step. The results of the decompositions are similar in both cases.

8 Which categories of categorical data are dropped by the LASSO estimator depends on the choice of the reference category and on the correlation structure of the data. Yuan and Lin (2006) suggest the use of the ‘Group-LASSO’ approach where all categories of a characteristic are either dropped or kept, but not single categories. The results from this approach are tabulated in . These results suggest a slightly larger unexplained gender wage gap than the OLS results, 0.127 vs. 0.119 log points.

9 Our main interest is the comparison of the results from the OLS post-LASSO specification with results based on a standard OLS approach. Our specifications do not correct for selection, which could result in downward-biased estimates (Albrecht, Van Vuuren, and Vroman 2009).

10 The results of the Oaxaca-Blinder decomposition for 2016 are shown in Table A3 in the Appendix.

11 The results of the Smith-Welch decomposition for the change between 2006 and 2016 are shown in Table A4 in the Appendix.
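For readers unfamiliar with the Oaxaca-Blinder decomposition referred to in note 10, the following minimal sketch splits a mean log-wage gap into an explained and an unexplained part. It assumes the two-fold variant with male coefficients as the reference, which may differ from the variant used in the article; the function name and all numbers are hypothetical.

```python
def oaxaca_blinder(mean_x_m, mean_x_f, beta_m, beta_f):
    """Two-fold decomposition with male coefficients as reference (an assumption).

    explained   = (mean_x_m - mean_x_f)' beta_m   (differences in characteristics)
    unexplained = mean_x_f' (beta_m - beta_f)     (differences in returns)
    """
    explained = sum((xm - xf) * bm
                    for xm, xf, bm in zip(mean_x_m, mean_x_f, beta_m))
    unexplained = sum(xf * (bm - bf)
                      for xf, bm, bf in zip(mean_x_f, beta_m, beta_f))
    return explained, unexplained


# Hypothetical group means: [intercept, years of schooling, years of experience].
mean_x_m = [1.0, 12.0, 20.0]
mean_x_f = [1.0, 12.5, 17.0]
# Hypothetical OLS coefficients from separate log-wage regressions.
beta_m = [1.5, 0.08, 0.02]
beta_f = [1.4, 0.08, 0.015]

explained, unexplained = oaxaca_blinder(mean_x_m, mean_x_f, beta_m, beta_f)

# With an intercept in each regression, OLS fits the group means exactly,
# so the raw gap is the difference in predicted mean log wages.
raw_gap = (sum(x * b for x, b in zip(mean_x_m, beta_m))
           - sum(x * b for x, b in zip(mean_x_f, beta_f)))
```

The two parts sum to the raw gap by construction. In an OLS post-LASSO setting, the coefficient vectors would contain only the regressors that survived the selection step.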
