Teacher’s Corner

A Guide to Detecting and Modeling Local Dependence in Latent Class Analysis Models

Pages 971-982 | Received 13 Oct 2021, Accepted 20 Jan 2022, Published online: 31 Mar 2022
 

Abstract

Latent class analysis (LCA) assigns individuals to mutually exclusive classes based on their response patterns to a set of indicators. A primary assumption is local independence: class indicators are uncorrelated within each class. When indicators are correlated within classes and this dependence is left unmodeled, parameter estimates can be severely biased. We provide a comprehensive resource for applied researchers to statistically detect violations of local independence and to model the identified correlated residuals. We explain the local independence assumption and illustrate how to detect and model conditional dependence using maximum likelihood (ML) and Bayesian estimation. For ML, we discuss two detection methods (bivariate residual associations and the modification index) and one modeling technique (the LCA residual associations model). We also demonstrate how to use the restrictive prior strategy to detect and model conditional dependence under Bayesian estimation. These techniques are illustrated with simulated datasets; code is provided in the online supplemental materials.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 The LCA model has been extended to include class indicators that have an ordinal, interval, or ratio scale. LCA models can have class indicators that are all on the same scale or on mixed scales (e.g., some binary items and some ratio items can be combined in the same analysis).

2 We used a simulated dataset instead of an empirical dataset because it allowed us to control the amount of conditional dependence present between class indicators. Some detection methods are less powerful than others, and these strengths and limitations are easier to highlight with simulated data. In addition, with a simulated dataset we can be confident the correlated residuals were not misspecified when later modeling the conditional dependence.
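As a minimal sketch of this rationale, one could generate class-conditional binary responses and deliberately induce a within-class dependence between one pair of items. All values here are hypothetical and are not the paper's generating model:

```python
import numpy as np

rng = np.random.default_rng(2022)
n = 1000

# Hypothetical two-class model with 6 binary indicators.
# Class-conditional endorsement probabilities (illustrative values only).
probs = np.array([
    [0.8, 0.8, 0.8, 0.8, 0.8, 0.8],   # class 1
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.2],   # class 2
])

classes = rng.choice(2, size=n, p=[0.5, 0.5])
u = (rng.random((n, 6)) < probs[classes]).astype(int)

# Induce conditional dependence between items 0 and 1 within each class:
# with some probability, copy item 0's response onto item 1.
dep = rng.random(n) < 0.4             # hypothetical strength of dependence
u[dep, 1] = u[dep, 0]

# The within-class correlation between items 0 and 1 is now non-zero,
# violating local independence for that pair.
for c in (0, 1):
    sub = u[classes == c]
    r = np.corrcoef(sub[:, 0], sub[:, 1])[0, 1]
    print(f"class {c}: corr(u0, u1) = {r:.2f}")
```

Because the dependence is injected by construction, its true location and strength are known exactly, which is what makes detection methods comparable on such data.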

3 Our item threshold specifications correspond to moderate class separation (Masyn, 2013; Nylund-Gibson & Masyn, 2016).

4 The traditional method of identifying a cut-off uses the chi-square distribution, where an approximate degrees of freedom (df) can be calculated as df = li × lj − li − lj + 1, where li is the number of observed categories for ui and lj is the number of observed categories for uj. With the approximate df, researchers can then use the chi-square distribution to identify a possible cutoff for a "high" BVR. For example, using our generated dataset with binary class indicators, we can calculate the approximate df for a pair of class indicators by inserting li = lj = 2 into the formula: df = 2 × 2 − 2 − 2 + 1 = 1. Next, we can identify a critical value from a chi-square distribution table with df = 1. The upper 5% critical value is 3.841, which suggests Pearson test statistic values greater than 3.841 indicate non-zero residual associations. One important caveat is that multiple testing must be accounted for (Asparouhov & Muthén, 2015). Therefore, researchers would be better served by a more stringent cutoff, especially when they have many class indicators, as we do.
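The critical-value calculation in this footnote can be reproduced in a few lines. The Bonferroni-style adjustment at the end is one common way to address the multiple-testing caveat, not a procedure prescribed by the paper:

```python
from scipy.stats import chi2

# Approximate df for a pair of binary indicators (l_i = l_j = 2).
l_i, l_j = 2, 2
df = l_i * l_j - l_i - l_j + 1          # = 1

# 5% critical value from the chi-square distribution with df = 1.
crit = chi2.ppf(0.95, df)
print(round(crit, 3))                   # 3.841

# A Bonferroni-style adjustment for k pairwise tests (illustrative):
k = 28                                  # e.g., 8 indicators -> 8*7/2 = 28 pairs
crit_adj = chi2.ppf(1 - 0.05 / k, df)   # more stringent than 3.841
```

The adjusted cutoff grows with the number of indicator pairs, which is consistent with the footnote's advice to use a more stringent threshold when many class indicators are tested.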

5 We use a cut-off value of 10 as the critical value purely for illustration. As noted above, there are no clear rules of thumb for implementing the chi-square distribution and selecting a critical value for interpretation purposes. Our cut-off of 10 is not meant to serve as a rule of thumb; we selected it as a reasonably high value for demonstration purposes. Researchers wanting to implement this method should carefully consider the influence that cut-off values have on substantive conclusions prior to implementation.

6 The MI can be calculated for each possible pair of class indicators using submatrices of the Hessian; see Sörbom (1989) for a technical explanation of the MI calculation and Oberski et al. (2013) for an explanation of the MI in the context of LCA with residual correlations.

7 The residual correlation ρ can be approximated from the association parameter β as ρ ≈ (√(1 + 4β²) − 1)/(2β); however, as the residual correlation increases, this approximation tends to underestimate the correlation (Asparouhov & Muthén, 2015; Becker, 1989).
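Assuming the approximation is ρ ≈ (√(1 + 4β²) − 1)/(2β) (a reconstruction of the garbled expression in the source), a small helper makes its behavior easy to inspect: for small β the value is close to β, and it is bounded by 1 in absolute value:

```python
import math

def approx_residual_correlation(beta):
    """Approximate the residual correlation rho from the log-linear
    association parameter beta:

        rho ~= (sqrt(1 + 4*beta**2) - 1) / (2*beta)

    Assumed reconstruction of the footnote's formula; as noted in the
    text, it tends to underestimate large correlations.
    """
    if beta == 0:
        return 0.0                     # limit as beta -> 0
    return (math.sqrt(1.0 + 4.0 * beta**2) - 1.0) / (2.0 * beta)

print(approx_residual_correlation(0.1))   # close to beta for small beta
print(approx_residual_correlation(10.0))  # approaches 1 for large beta
```

A quick sanity check: a Taylor expansion gives √(1 + 4β²) ≈ 1 + 2β² near zero, so ρ ≈ β for small associations, while the ratio tends to 1 as β grows, as a correlation should.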

8 The five residual associations included were consistently identified as non-zero across detection methods in the Detecting and Modeling Conditional Dependence with ML Estimation section.

9 For a gentle introduction to Bayesian estimation, see Depaoli et al. (Citation2017) and van de Schoot et al. (Citation2014).

10 The default priors are as follows. Item thresholds use normal priors, such that each individual item threshold ∼ N(0, 5). Class prevalences use a Dirichlet prior, such that the class proportions for a two-class model would be ∼ D(10, 10), which corresponds to 10 individuals assigned to each latent class by default. Finally, the residual associations are assumed to follow an inverse Wishart distribution, with residual associations ∼ IW(0,11).
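To get a feel for how diffuse these defaults are, one can draw from the first two priors directly. This is a sketch assuming N(0, 5) denotes a mean of 0 and a variance of 5 (the usual Mplus convention); the inverse Wishart draw is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical draws from the default priors described above (two-class model).
# Assumes N(0, 5) means mean 0, variance 5 (so SD = sqrt(5) ~= 2.24).
threshold_draws = rng.normal(loc=0.0, scale=np.sqrt(5.0), size=10_000)

# D(10, 10): weakly informative, centered on equal class proportions.
class_props = rng.dirichlet(alpha=[10.0, 10.0], size=10_000)

print(threshold_draws.std())       # close to sqrt(5)
print(class_props.mean(axis=0))    # close to [0.5, 0.5]
```

On the logit scale, a threshold SD of about 2.24 spreads most prior mass over endorsement probabilities from roughly 0.01 to 0.99, which is why these defaults are considered only weakly informative.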
