Original Articles

Regression Mixture Models: Does Modeling the Covariance Between Independent Variables and Latent Classes Improve the Results?

Abstract

Regression mixture models are increasingly used as an exploratory approach to identify heterogeneity in the effects of a predictor on an outcome. In this simulation study, we tested the effects of violating an implicit assumption often made in these models, namely that the independent variables in the model are not directly related to the latent classes. Results indicate that the major risks of failing to model the relationship between predictor and latent class were an increased probability of selecting additional latent classes and biased class proportions. In addition, we tested whether regression mixture models can detect a piecewise relationship between a predictor and an outcome. Results suggest that these models are able to detect piecewise relations, but only when the relationship between the latent class and the predictor is included in model estimation. We illustrate the implications of making this assumption through a reanalysis of applied data examining heterogeneity in the effects of family resources on academic achievement. We compare previous results (which assumed no relation between independent variables and latent class) to results from a model in which this assumption is relaxed. Implications and analytic suggestions for conducting regression mixture analyses based on these findings are noted.
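
For readers who want the assumption in symbols, one common way to write a regression mixture with K latent classes is sketched below. The notation (β, γ, σ, and the multinomial logit form of the class model) is an assumption made here for illustration and is not taken from the article itself.

```latex
% Within-class regression of the outcome y on the predictor x
y_i \mid (C_i = k,\; x_i) \sim \mathcal{N}\!\left(\beta_{0k} + \beta_{1k} x_i,\; \sigma_k^2\right)

% Class membership model, including the "C on x" path (gamma_{1k})
P(C_i = k \mid x_i) =
  \frac{\exp(\gamma_{0k} + \gamma_{1k} x_i)}
       {\sum_{j=1}^{K} \exp(\gamma_{0j} + \gamma_{1j} x_i)},
  \qquad \gamma_{0K} = \gamma_{1K} = 0

% Implicit assumption examined in the study: no direct relation
% between the predictor and the latent classes
\gamma_{1k} = 0 \quad \text{for all } k
```

Under this notation, omitting the C on x path amounts to fixing every γ_{1k} at zero, so that class probabilities do not vary with x.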

Notes

1 The effect of model misspecification on class proportions was further explored in a separate set of simulations (results available on request) using a single large data set (n = 100,000) for each condition. With mean differences on x across classes set at one standard deviation, these simulations varied the intercept, slope, and residual variances across conditions. Results show that bias in estimates of class means is found only when the model is misspecified (C on x omitted) and the residual variances are not equal across classes. In this case, the proportion of individuals in the class with the larger residual variance is overestimated and the proportion in the class with the smaller residual variance is underestimated.
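
The data-generating setup described in Note 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' simulation code: the function name `simulate_two_class`, the two-class design, and every numeric default (class proportion, intercepts, slopes, residual standard deviations) are hypothetical. It only generates data and does not fit a mixture model.

```python
import numpy as np

def simulate_two_class(n=100_000, pi1=0.5, x_shift=1.0,
                       intercepts=(0.0, 0.0), slopes=(0.2, 0.7),
                       resid_sds=(1.0, 0.5), seed=0):
    """Draw (x, y, c) from a two-class regression mixture.

    All default values are illustrative assumptions, not values from the article.
    """
    rng = np.random.default_rng(seed)
    # Latent class indicator: 0 with probability pi1, otherwise 1.
    c = (rng.random(n) >= pi1).astype(int)
    # x has unit within-class SD; class means differ by x_shift (one SD by default),
    # so x is related to class membership.
    x = rng.normal(loc=x_shift * c, scale=1.0)
    # Class-specific intercept, slope, and residual SD for each observation.
    b0 = np.asarray(intercepts)[c]
    b1 = np.asarray(slopes)[c]
    sd = np.asarray(resid_sds)[c]
    y = b0 + b1 * x + rng.normal(size=n) * sd
    return x, y, c

x, y, c = simulate_two_class()
print("mean difference on x:", x[c == 1].mean() - x[c == 0].mean())
```

Setting the two entries of `resid_sds` to different values mimics the unequal residual variance condition that the note singles out; setting them equal gives the condition in which the note reports no such bias.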

2 Note that if the C on x path is included in the model when there is a curvilinear relationship between x and y, then the strength of the C on x relationship will result in posterior probabilities very close to 0 or 1 for each individual.
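
The behavior described in Note 2 can be illustrated with a small sketch that computes posterior class probabilities by Bayes' rule for a two-class model. Everything here is an assumption for illustration: the helper `posterior_class2`, the piecewise data-generating values, and the hand-set parameters stand in for what a fitted model might return; no model is actually estimated.

```python
import numpy as np

def posterior_class2(x, y, gamma0, gamma1, betas, sds):
    """P(C = 2 | x, y) in a two-class regression mixture with a logistic C on x path."""
    # Prior probability of class 2 given x (gamma1 = 0 means no C on x path).
    prior2 = 1.0 / (1.0 + np.exp(-(gamma0 + gamma1 * x)))

    def dens(k):
        # Normal density of y around the class-k regression line.
        b0, b1 = betas[k]
        z = (y - b0 - b1 * x) / sds[k]
        return np.exp(-0.5 * z**2) / (sds[k] * np.sqrt(2.0 * np.pi))

    num = prior2 * dens(1)
    return num / (num + (1.0 - prior2) * dens(0))

# Hypothetical piecewise relation: slope 0.2 below x = 0 and 0.8 above it.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = np.where(x < 0, 0.2 * x, 0.8 * x) + rng.normal(scale=0.5, size=5000)

betas = [(0.0, 0.2), (0.0, 0.8)]  # hand-set (intercept, slope) per class
sds = [0.5, 0.5]                  # hand-set residual SD per class

def share_extreme(p):
    # Share of posterior probabilities within 0.05 of 0 or 1.
    return np.mean((p < 0.05) | (p > 0.95))

p_flat = posterior_class2(x, y, 0.0, 0.0, betas, sds)    # C on x omitted
p_steep = posterior_class2(x, y, 0.0, 20.0, betas, sds)  # strong C on x path
print(share_extreme(p_flat), share_extreme(p_steep))
```

With a steep C on x slope, class membership is determined almost entirely by which side of the breakpoint x falls on, which is why the posterior probabilities pile up near 0 and 1.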
