Original Articles

Understanding Linkages Among Mixture Models

Pages 775-815 | Received 06 Sep 2012, Accepted 17 Jul 2013, Published online: 11 Dec 2013
 

Abstract

The methodological literature on mixture modeling has rapidly expanded in the past 15 years, and mixture models are increasingly applied in practice. Nonetheless, this literature has historically been diffuse, with different notations, motivations, and parameterizations making mixture models appear disconnected. This pedagogical review facilitates an integrative understanding of mixture models. First, 5 prototypic mixture models are presented in a unified format with incremental complexity while highlighting their mutual reliance on familiar probability laws, common assumptions, and shared aspects of interpretation. Second, 2 recent extensions—hybrid mixtures and parallel-process mixtures—are discussed. Both relax a key assumption of classic mixture models but do so in different ways. Similarities in construction and interpretation among hybrid mixtures and among parallel-process mixtures are emphasized. Third, the combination of both extensions is motivated and illustrated by means of an example on oppositional defiant and depressive symptoms. By clarifying how existing mixture models relate and can be combined, this article bridges past and current developments and provides a foundation for understanding new developments.

Notes

1. For mixture models with discrete outcomes, a probability parameterization (most commonly) or a loglinear parameterization has historically been used (e.g., Biemer, 2011; Collins & Flaherty, 2002; Heinen, 1996; McCutcheon, 2002). For mixture models with continuous outcomes, a logistic parameterization is typically used for the between-class model, as is done here. For consistency, we also employ a logistic parameterization with discrete outcomes (following, e.g., Humphreys & Janson, 2000; B. O. Muthén, 2001, 2004; Reboussin, Reboussin, Liang, & Anthony, 1998). This logistic parameterization can be used to compute probabilities, implicitly achieves the same constraints as in the popular probability parameterization, and readily expands to accommodate covariates (unlike the probability parameterization; Magidson & Vermunt, 2004).
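As a minimal sketch of the logistic parameterization described in Note 1, the Python snippet below converts class logits into class probabilities. It assumes the last class is the reference class with its logit fixed to 0, which is a common identification constraint but is not stated in this excerpt. The function name `class_probabilities` and the example logit values are hypothetical.

```python
import math


def class_probabilities(logits):
    """Convert K-1 free logits into K class probabilities.

    Assumption: the reference class is the last class, and its logit
    is fixed to 0 for identification.
    """
    full = list(logits) + [0.0]              # append the reference-class logit
    largest = max(full)                      # subtract the max for numerical stability
    expd = [math.exp(v - largest) for v in full]
    total = sum(expd)
    return [e / total for e in expd]


# Hypothetical logits for a 3-class model (illustrative values only).
probs = class_probabilities([0.5, -1.0])
print(probs)
```

Whatever real values the logits take, the resulting probabilities lie strictly between 0 and 1 and sum to 1, which is the sense in which this parameterization implicitly achieves the constraints that the probability parameterization imposes directly. Covariates could be accommodated by adding a linear predictor to each free logit.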

2. Reasons are that (a) there is no formal set of rules (akin to Wright's tracing rules in SEM) allowing mixture model equations to be directly reproduced from their diagrams and (b) diagrams do not fully represent all aspects of the mixture model. For instance, current path diagrams do not communicate how many classes there are, if a particular parameter differs across some but not all classes, if a parameter is fixed to 0 in some classes but estimated in others, or which is the reference class.

3. Here we use a shorthand: P(A) represents P(A = a), where a is a realization of random variable A.

4. This is also sometimes called the chain rule.

5. The online appendix is available at http://www.vanderbilt.edu/peabody/sterba/appxs.htm

6. For each outcome in the LPA, the figure depicts class-specific densities weighted by their class probabilities; however, weighting is used here only for ease of visualization. In LPA estimation, weighting is done for the joint outcome density, not individual outcome densities (see Equation (14)).
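The distinction in Note 6 can be sketched in Python. Equation (14) is not reproduced in this excerpt, so the code assumes the usual LPA form: normal class-specific densities that are independent within class, with the class probability weighting the product over outcomes. All function names and parameter values below are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Univariate normal density."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_outcome_densities(y_j, class_probs, means_j, sds_j):
    """Per-outcome view used for plotting: pi_k * f(y_j | class k) for each k."""
    return [p * normal_pdf(y_j, m, s)
            for p, m, s in zip(class_probs, means_j, sds_j)]


def joint_mixture_density(y, class_probs, means, sds):
    """Joint view (assumed form): sum_k pi_k * prod_j f(y_j | class k).

    means[k][j] and sds[k][j] hold the parameters of outcome j in class k.
    """
    total = 0.0
    for p, mu_k, sd_k in zip(class_probs, means, sds):
        within_class = 1.0
        for y_j, m, s in zip(y, mu_k, sd_k):
            within_class *= normal_pdf(y_j, m, s)   # local independence within class
        total += p * within_class                   # weight applied once per class
    return total


# Hypothetical 2-class, 2-outcome example (illustrative values only).
pis = [0.6, 0.4]
means = [[0.0, 1.0], [3.0, 4.0]]
sds = [[1.0, 1.0], [1.0, 1.0]]
y = [0.5, 1.5]

joint = joint_mixture_density(y, pis, means, sds)
# Multiplying the per-outcome weighted mixtures applies the weights once per outcome.
per_outcome_product = 1.0
for j in range(len(y)):
    per_outcome_product *= sum(weighted_outcome_densities(
        y[j], pis, [means[k][j] for k in range(2)], [sds[k][j] for k in range(2)]))
print(joint, per_outcome_product)
```

In this sketch the two printed quantities generally differ, because the joint density applies each class weight once to the product over outcomes, whereas the per-outcome curves apply it separately to every outcome.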

7. Even when discrete subpopulations exist, many more response patterns than classes can arise due to measurement error.

8. When evaluated for a particular response pattern, it is a probability.

9. Having K = M corresponds with configural invariance of the categorical latent variable across time.

10. The testability of measurement invariance in the conventional LTA contrasts to its untestability in the conventional GBT (unless extended to a second-order GBT, as in Grimm & Ram, 2009).

11. If T > 2, it could be possible to regress latent states at time t on prior states at both t – 1 and t – 2.

a. Further constraints are common for parsimony and/or to prevent empirical underidentification (see text). Further constraints are also used, for instance, to impose threshold invariance within state across time in LTA.

12. Analogously, in continuous latent variable models, the posterior density can be “shrunk” toward the mean of the prior density (Skrondal & Rabe-Hesketh, 2004).

13. A third approach, not discussed here, is unavailable for LCA and LTA but available for LPA and GBT. It involves estimating residual covariances within class; estimating many such covariances and/or allowing them to differ across class can incur estimation problems (Lubke & Neale, 2006).

14. For instance, in a K = M = 2 LTA, testing forward change (below-diagonal elements of the transition probability matrix = 0) requires fixing α1 to a large negative number. Testing no-change (off-diagonal elements of transition probability matrix = 0) requires also fixing β11 to a very large positive number.
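The constraints in Note 14 can be illustrated with a small Python sketch. The article's transition equations are not part of this excerpt, so the parameterization below is an assumption: state 2 is the reference state at both times, α1 is the logit of being in state 1 at time 2 for someone in the reference state at time 1, and β11 is the logit shift for someone in state 1 at time 1. The function names and the numeric values are hypothetical.

```python
import math


def logistic(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def transition_matrix(alpha1, beta11):
    """2 x 2 transition probabilities (rows: time 1 state, columns: time 2 state).

    Assumed parameterization, with state 2 as the reference state:
      P(state 1 at t2 | state 1 at t1) = logistic(alpha1 + beta11)
      P(state 1 at t2 | state 2 at t1) = logistic(alpha1)
    """
    p11 = logistic(alpha1 + beta11)
    p21 = logistic(alpha1)
    return [[p11, 1.0 - p11],
            [p21, 1.0 - p21]]


# Forward change only: a large negative alpha1 drives the below-diagonal element to ~0.
forward = transition_matrix(alpha1=-15.0, beta11=15.5)

# No change: additionally a very large positive beta11 drives the above-diagonal element to ~0.
no_change = transition_matrix(alpha1=-15.0, beta11=30.0)

print(forward)
print(no_change)
```

Under these assumptions, β11 has to be large enough to outweigh the large negative α1, which is consistent with the note's wording of a "very large" positive value.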

15. Within class, the number of growth coefficients is (1 + b) × r (where b is the polynomial curve degree and r is the number of regimes). At timepoints where a particular regime goes off-line, its growth coefficients are fixed to 0 (for details see Dolan, Schmittmann, Lubke, & Neale, 2005).
