Original Articles

Factors Affecting the Adequacy and Preferability of Semiparametric Groups-Based Approximations of Continuous Growth Trajectories

Pages 590-634 | Published online: 15 Aug 2012
 

Abstract

Psychologists have long been interested in characterizing individual differences in change over time. It is often plausible to assume that the distribution of these individual differences is continuous in nature, yet theory is seldom so specific as to designate its parametric form (e.g., normal). Semiparametric groups-based trajectory models (SPGMs) were thus developed to provide a discrete approximation for continuously distributed growth of unknown form. Previous research has demonstrated the adequacy of the approximation provided by SPGM but only under relatively narrow, theoretically optimal conditions. Under alternative conditions, which may be more common in practice (e.g., higher dimension random effects, smaller sample sizes), this study shows that approximation adequacy can suffer. Furthermore, this study also evaluates whether SPGM's discrete approximation is preferable to a parametric trajectory model that assumes normally distributed random effects when in fact the distribution is modestly nonnormal. The answer is shown to depend on distributional characteristics of both repeated measures (binary or continuous) and random effects (bimodal or skewed). Implications for practice are discussed in light of empirical examples on externalizing behavior.

Notes

1 Out of 100 applications, 22 modeled externalizing or related antisocial, conduct, or aggressive behavior. See Online Appendix at http://www.vanderbilt.edu/peabody/sterba/appxs.htm for references for surveyed applications.

2 Nagin (2005) used two random effects that were correlated at 1.0, which is statistically equivalent to one random effect. Brame et al. (2006) used one random effect (a random intercept).

3 The three samples used in Panels A–C had continuous repeated measures and were generated under HLM simulation conditions described later, which included quadratic fixed effects.

4 A similar curse of dimensionality has been noted for related methods, such as nonparametric maximum likelihood estimation (NPMLE; Follmann & Lambert, 1989; Heckman & Singer, 1984; Laird, 1978) wherein 6–7 points of support may be needed to adequately approximate one random effect (e.g., Rabe-Hesketh, Pickles, & Skrondal, 2003) whereas 15 points of support may be needed to approximate two random effects (e.g., Schafer, 2001). But beyond two dimensions, “little is known about the performance of NPMLE for models with a large number of latent variables [random effects]” (Skrondal & Rabe-Hesketh, 2004, p. 183). However, in contrast to SPGM, the number of mass points K available for NPMLE is not as directly limited by N and model complexity because K is not chosen using model selection indices such as BIC (Lindsay, 1995; Skrondal & Rabe-Hesketh, 2004).

5 Unfortunately, overextracting classes in SPGM beyond the number selected as best fitting (by BIC) at a given N is not necessarily viable; this risks allowing a class proportion to approach zero or allowing parameters in two classes to approach the same values. Both situations can lead to singularities and estimation problems (McLachlan & Peel, 2000).

6 Van den Oord et al. (2003) estimated random effect nonnormality by finding which of a family of Johnson Curves best fit the random effect distribution.

7 The simulation was piloted with the following free parameters in the fitted SPGM: one Level 1 error variance parameter (for continuous repeated measures only) and class-varying intercept, linear, and quadratic growth coefficients for each of K classes. However, for binary repeated measures, severe convergence problems arose during piloting when estimating more than two classes (even though the best-BIC K was almost always greater than 2), and converged replications often had singularities and nonsensical, extreme growth coefficient values and/or standard errors. Diagnostic checks suggested that estimation problems often occurred when SPGM tried to reproduce the pattern of endorsement of individuals displaying mainly zeros over time. For these individuals there is little information from which to estimate the intercept (other than as a large negative value) or slopes (other than as nonpositive). In this light, we re-ran all binary cells with a designated flat class (linear and quadratic slopes fixed to zero) in which a boundary constraint of −3.5 was imposed on the intercept.

8 Overall, SPGM's ARB and MSE tended to be much higher in the binary outcomes condition than the normal outcomes condition. In case this might be due to binary outcomes requiring much higher N than continuous in order to achieve sufficient K for SPGM's approximation, we increased N to 10,000. We found the same pattern of results. We also calculated medians and trimmed means (i.e., most extreme 10% of sample estimates removed before averaging) to see if bias was meaningfully influenced by a few extreme sample estimates per cell. We again found the same pattern of results.
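
The robustness check in this note can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that ARB denotes absolute relative bias (the abbreviation is not defined on this page), that the 10% trimming is split evenly between the two tails, and it uses invented replication estimates; the function names are hypothetical.

```python
import statistics


def trimmed_mean(estimates, trim_fraction=0.10):
    """Mean after dropping trim_fraction of the values in total.

    Assumption: the trimmed share is split evenly between the lower
    and upper tails (5% each for the default 10%).
    """
    ordered = sorted(estimates)
    n_drop_each_tail = int(len(ordered) * trim_fraction / 2)
    kept = ordered[n_drop_each_tail:len(ordered) - n_drop_each_tail]
    return statistics.fmean(kept)


def absolute_relative_bias(estimates, true_value):
    """Assumed definition: |mean(estimates) - true| / |true|."""
    return abs(statistics.fmean(estimates) - true_value) / abs(true_value)


def mean_squared_error(estimates, true_value):
    """Average squared deviation of the estimates from the true value."""
    return statistics.fmean([(e - true_value) ** 2 for e in estimates])


# Hypothetical cell: 20 replications of one parameter whose true value is 1.0,
# with two extreme sample estimates.
true_value = 1.0
estimates = [1.0] * 18 + [-5.0, 9.0]

print("ARB:", absolute_relative_bias(estimates, true_value))
print("MSE:", mean_squared_error(estimates, true_value))
print("median:", statistics.median(estimates))
print("trimmed mean:", trimmed_mean(estimates))
```

In this invented cell the two extreme estimates move the plain mean but not the median or the trimmed mean, which is the kind of sensitivity the note's check is meant to detect.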

9 A subsetted sample size was chosen to mirror the simulation conditions; using the full sample can change the appearance and number of trajectory classes. Imposing a censored normal versus normal conditional response distribution yielded similar results. The empirical example SPGM and HLM with conditionally normal repeated measures used heterogeneous residual variances ($\sigma_t^2$). This example is for pedagogical purposes; related analyses are available elsewhere (e.g., for aggression; NICHD ECCRN, 2004).

*p < .05.

10 Others have sought to explain these discrepant findings more from a direct perspective (e.g., Fontaine et al., 2009; Van Dulmen et al., 2009; see also Eggleston et al., 2004; Jackson & Sher, 2008).

11 For instance, Petitclerc et al. (2009) explain that “with four groups and more, a small, high stable group was consistently found, and adding groups resulted in splitting lower level groups. Therefore, the four-group solution was retained” (p. 1479); likewise Gross et al. (2009) write, “Despite improved BIC scores, both the five and six group models resulted in subdividing already modest size groups with higher levels of maternal depressive symptoms into smaller groups that were not substantively different from one another; thus, the four group model emerged as the best fitting and most parsimonious model” (p. 147). Beyers & Seiffge-Krenke (2007) explain that “if a solution with K classes emerges in which certain classes are merely slight variations on a common theme and, hence, do not have differential substantive meaning, the more parsimonious solution with K – 1 classes is chosen” (p. 563). Brame et al. (2001) similarly state, “For the adolescent aggression data a six-group model was found to best fit the data. However, here we describe the four-group model because the results from this more parsimonious solution are qualitatively similar” (p. 506).
