Model Selection in Finite Mixture Models: A k-Fold Cross-Validation Approach: Structural Equation Modeling: A Multidisciplinary Journal: Vol 24, No 2

Abstract

Finite mixture models, whether latent class models, growth mixture models, latent profile models, or factor mixture models, have become an important statistical tool in social science research. One of the biggest and most debated challenges in mixture modeling is the evaluation of model fit and model comparison. In the application of mixture models, researchers often fit a collection of models and then decide on a single optimal model based on a variety of model fit information. We propose a k-fold cross-validation procedure for model selection whereby the model is repeatedly fit to $k - 1$ of the k partitions of the data set, the resulting model is then applied to the remaining kth partition of the sample, and the distribution of fit indexes is examined. This method is illustrated with growth mixture models fit to longitudinal data on reading ability collected as part of the Early Childhood Longitudinal Study–Kindergarten Cohort.
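The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it substitutes a univariate normal mixture fit by a basic EM algorithm for the growth mixture models used in the article, it uses $-2LL$ on the held-out partition as the only fit index, and every name in it (`fit_mixture`, `neg2ll`, `kfold_neg2ll`, the simulated `sample`) is hypothetical.

```python
import math
import random

def normal_pdf(x, mu, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_mixture(data, n_classes, n_iter=200, seed=0):
    """Fit a univariate normal mixture by EM (illustrative stand-in model)."""
    rng = random.Random(seed)
    means = rng.sample(data, n_classes)  # start from randomly chosen observations
    grand_mean = sum(data) / len(data)
    grand_var = sum((x - grand_mean) ** 2 for x in data) / len(data)
    variances = [grand_var] * n_classes
    weights = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        # E-step: posterior class probabilities for each observation
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update class proportions, means, and variances
        for c in range(n_classes):
            nc = sum(r[c] for r in resp) or 1e-300
            weights[c] = nc / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / nc
            variances[c] = max(var, 1e-6)  # guard against collapsing classes
    return weights, means, variances

def neg2ll(data, params):
    """-2 log-likelihood of data under fixed parameters (nothing is estimated)."""
    weights, means, variances = params
    total = 0.0
    for x in data:
        dens = sum(w * normal_pdf(x, m, v)
                   for w, m, v in zip(weights, means, variances))
        total += math.log(max(dens, 1e-300))
    return -2.0 * total

def kfold_neg2ll(data, n_classes, k=5, seed=0):
    """Fit on k - 1 partitions, score the held-out partition, repeat k times."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        test = [data[i] for i in fold]
        params = fit_mixture(train, n_classes, seed=seed)
        scores.append(neg2ll(test, params))
    return scores

# Simulated two-class data, used only to exercise the sketch.
rng = random.Random(1)
sample = ([rng.gauss(0, 1) for _ in range(150)]
          + [rng.gauss(6, 1) for _ in range(150)])

for n_classes in (1, 2, 3):
    scores = kfold_neg2ll(sample, n_classes, k=5)
    print(n_classes, [round(s, 1) for s in scores])
```

Each call returns k held-out $-2LL$ values per candidate model, so the candidates can be compared on the whole distribution of the fit index rather than on a single number.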

FUNDING

This work was supported by National Science Foundation Grant REAL-1252463 awarded to the University of Virginia, David Grissmer (Principal Investigator), and Christopher Hulleman (Co-Principal Investigator).

Notes

¹ Despite some debate regarding the results presented by Lo et al. (2001; see Jeffries, 2003), we report the VLMR LRT and aLMR LRT because they remain widely used in the mixture modeling literature.

² Although a test sample size of 5 (1% of N = 496) might seem too small when performing 100-fold cross-validation, no parameters are estimated when the model is applied to the test sample. The test sample only needs to consist of at least one participant (essentially inserting the participant’s data into $x_{i}$ in Equation 1 and calculating the $-2LL$ using the model-implied mean and covariance structure when the model was estimated using the training sample), which occurs when performing leave-one-out cross-validation (i.e., k-fold cross-validation with k = N).
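As a worked illustration of the calculation in Note 2, and assuming that Equation 1 (not reproduced here) takes the standard finite mixture form with multivariate normal class densities, the held-out fit index could be written as follows. The symbols $C$ (number of classes), $\hat{\pi}_{c}$ (class proportions), and $\hat{\mu}_{c}$, $\hat{\Sigma}_{c}$ (model-implied class means and covariances) are notation assumed for this sketch; all estimates come from the training sample and are held fixed.

$$
-2LL_{\text{test}} = -2 \sum_{i \in \text{test}} \log \left[ \sum_{c=1}^{C} \hat{\pi}_{c} \, \phi\left( x_{i} ; \hat{\mu}_{c}, \hat{\Sigma}_{c} \right) \right]
$$

Here $\phi$ denotes the multivariate normal density. Because the sum over $i$ involves no estimation, it is defined even when the test sample contains a single participant, which is the leave-one-out case.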
