On the Performance of Different Regularization Methods in Bifactor-(S-1) Models with Explanatory Variables—Caveats, Recommendations, and Future Directions

Pages 560-573 | Received 28 Jun 2022, Accepted 24 Oct 2022, Published online: 15 Dec 2022
 

Abstract

Regularization methods in linear regression models with manifest variables have been shown to be effective in selecting key predictors from a set of many variables while improving predictions for novel observations. Regularization methods are particularly attractive for the analysis of complex multidimensional data when theory development is the primary goal; for example, when researchers attempt to predict general or specific factors in bifactor models using many potentially relevant predictors. However, applications of regularization methods in such models are still scarce. In a simulation study, we examined the performance of different regularization methods in bifactor-(S-1) models, varying the number of predictors, the correlations with the outcome (effect size), the underlying structure of multicollinearity, the sample size, the type of penalty, and the use of a single-step versus a two-step approach. We explore potential caveats in the use of regularization methods in bifactor-(S-1) models, provide practical recommendations, and discuss future directions.
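As a point of reference for the penalties compared in the study, the standard ridge (L2) and lasso (L1) penalized least-squares objectives for a linear regression with p predictors can be written as follows; the notation here is ours and serves only as an illustrative sketch, not as the exact estimator used in the bifactor-(S-1) simulations:

\hat{\boldsymbol\beta}_{\text{ridge}} = \arg\min_{\boldsymbol\beta} \sum_{i=1}^{n} \bigl( y_i - \mathbf{x}_i^{\top}\boldsymbol\beta \bigr)^2 + \lambda \sum_{j=1}^{p} \beta_j^{2},
\qquad
\hat{\boldsymbol\beta}_{\text{lasso}} = \arg\min_{\boldsymbol\beta} \sum_{i=1}^{n} \bigl( y_i - \mathbf{x}_i^{\top}\boldsymbol\beta \bigr)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert,

where \lambda \geq 0 controls the strength of the penalty: larger values shrink the coefficients more strongly toward zero, and the lasso can set some coefficients exactly to zero, which is what enables predictor selection.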

Notes

1 Preliminary analyses were conducted with the RegSem package (Jacobucci et al., Citation2016); however, multiple problems occurred. First, the computational effort exceeded our expectations, especially in conditions with a high number of noise variables, rendering the simulation infeasible. Second, many converged models produced theoretically impossible values, indicating improper solutions (Rindskopf, Citation1984). Third, even after employing a rather simple and somewhat liberal indicator to flag these theoretically impossible values, close to one third of all models indicated improper solutions. This indicator marked all models in which the sum of the squared regression coefficients (ridge) or the sum of their absolute values (lasso) increased beyond the respective value obtained with ML estimation.
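In symbols (our notation, not the authors'), this indicator flags a regularized solution as improper whenever

\sum_{j} \hat{\beta}_{j,\text{reg}}^{2} > \sum_{j} \hat{\beta}_{j,\text{ML}}^{2} \quad (\text{ridge}) \qquad \text{or} \qquad \sum_{j} \lvert \hat{\beta}_{j,\text{reg}} \rvert > \sum_{j} \lvert \hat{\beta}_{j,\text{ML}} \rvert \quad (\text{lasso}),

the rationale being that a penalized estimate should not have a larger penalty term than the corresponding unpenalized ML estimate.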

2 We thank an anonymous reviewer for their remark regarding this topic.

3 Calculations were performed under SMP Debian 5.10.113-1 (2022-04-29) x86_64 GNU/Linux, running on a machine with an AMD Ryzen Threadripper 3970X 32-core processor at 2195 MHz with 1/16/128 MB of L1/L2/L3 cache, respectively, and a maximum of 126 GiB of RAM.
