Evaluating Supplemental Samples in Longitudinal Research: Replacement and Refreshment Approaches

Pages 277-299 | Published online: 02 Jul 2019
 

Abstract

Despite the wide application of longitudinal studies, they are often plagued by missing data and attrition. The majority of methodological approaches focus on participant retention or modern missing data analysis procedures. This paper, however, takes a new approach by examining how researchers may supplement the sample with additional participants. First, refreshment samples use the same selection criteria as the initial study. Second, replacement samples identify auxiliary variables that may help explain patterns of missingness and select new participants based on those characteristics. A simulation study compares these two strategies for a linear growth model with five measurement occasions. Overall, the results suggest that refreshment samples lead to less relative bias, greater relative efficiency, and more acceptable coverage rates than replacement samples or not supplementing the missing participants in any way. Refreshment samples also have high statistical power. The comparative strengths of the refreshment approach are further illustrated through a real data example. These findings have implications for assessing change over time when researching at-risk samples with high levels of permanent attrition.

Notes

1 According to our definition of the two types of supplemental samples, the Young Lives Study used a refreshment approach. That is, the field teams followed the original sampling procedures; for example, the Ethiopia country report states: “there were some refusals and changes of mind during the second visit and such households were replaced by another eligible one using the standard procedure” (Alemu et al., 2003, p. 21).

2 For analyses, the data from Time 1 and 2 were deleted for all individuals in the replacement and refreshment samples. In other words, we generated scores at all five time points for these individuals, but they are not included in the analyses until Time 3 because they are in the supplemental, not the original, sample.
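The data handling described in Note 2 can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function names, sample sizes, and all population values (means, standard deviations, time coding of 0 to 4, uncorrelated intercept and slope) are assumptions chosen only to make the example run.

```python
import numpy as np

def simulate_linear_growth(n, rng, n_times=5, mean_int=0.0, mean_slope=1.0,
                           sd_int=1.0, sd_slope=0.5, sd_error=1.0):
    """Generate scores from y_it = L_i + S_i * t + e_it for t = 0..n_times-1.

    All parameter values are hypothetical; intercepts and slopes are drawn
    independently here, which is an assumption of this sketch.
    """
    times = np.arange(n_times)
    intercepts = rng.normal(mean_int, sd_int, size=n)
    slopes = rng.normal(mean_slope, sd_slope, size=n)
    errors = rng.normal(0.0, sd_error, size=(n, n_times))
    return intercepts[:, None] + slopes[:, None] * times[None, :] + errors

def mask_pre_entry_waves(scores, entry_wave=3):
    """Set all waves before entry_wave (1-indexed) to missing."""
    masked = scores.copy()
    masked[:, :entry_wave - 1] = np.nan
    return masked

rng = np.random.default_rng(2019)

# Original sample: scores at all five occasions (attrition not modeled here).
original = simulate_linear_growth(200, rng)

# Supplemental sample: scores are generated at all five occasions, but
# Time 1 and Time 2 are deleted because these individuals enter at Time 3.
supplemental = mask_pre_entry_waves(simulate_linear_growth(50, rng))

analysis_data = np.vstack([original, supplemental])
```

Generating the full five-wave trajectory before masking keeps the supplemental individuals on the same growth process as the original sample, so that only their observation window differs.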

3 Acknowledging the role of multiplicity, we recognize that even with MCAR, if 20 potential auxiliary variables were examined, one may relate significantly to attrition using α=.05. Even so, if missingness is MCAR, any conceivable auxiliary variable would necessarily have a zero population correlation with missingness.
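The multiplicity point in Note 3 can be made concrete with a short calculation. The sketch below assumes the 20 tests are independent, which the note does not state; with correlated auxiliary variables the probability of at least one false positive would differ, although the expected count would not.

```python
alpha = 0.05          # per-test significance level from Note 3
n_candidates = 20     # number of potential auxiliary variables examined

# Expected number of spurious "significant" variables under MCAR.
expected_false_positives = alpha * n_candidates

# Probability that at least one test is significant by chance,
# assuming the tests are independent (an assumption of this sketch).
prob_at_least_one = 1 - (1 - alpha) ** n_candidates

print(f"{expected_false_positives:.1f}")  # prints 1.0
print(f"{prob_at_least_one:.3f}")         # prints 0.642
```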

4 Two additional conditions were originally included: (5) correlation between the latent intercept and slope (cor(L,S)=0, .3) and (6) variance of measurement errors (var(e)=1, 3). The combination of these levels and conditions was chosen so that the reliabilities of the manifest variables fell between 0.25 and 0.95. Varying the correlation between the latent intercept and slope or the measurement error variance did not affect the pattern of findings; researchers may consider omitting these conditions from future simulations.

5 If the auxiliary variable is incorporated in the model (e.g., score at Time 1), then FIML can be used directly. Moreover, for large samples, results from FIML and the two-stage method should yield the same parameter estimates.

6 In the original subsample, the bi-serial correlation between gender and the estimated latent slope is .17, which is at the lower end of the conditions in our simulation. No other variables in the data set were identified as auxiliary variables, i.e., as correlated with whether or not a participant returned at Time 2.

7 Figure 4 depicts the plots of individual growth trajectories of the original sample and the estimate of the average slope parameter. It should be noted that for individuals who participated in only one time point, i.e., original sample that attrited after Time 1 or supplemental sample that attrited after Time 3, individual growth trajectories cannot be computed.

8 Little and Rubin (2002) discussed design weights, which are inversely proportional to the probability of the individual being in the data set. That is, the weight for each individual is denoted as 1/π_i, where π_i is the probability that individual i is included in the sample. In a related study, we used an inverse probability weighting method; however, the bias was not corrected by applying the weights to the data (Mazen, Taylor & Tong, 2019).
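The weight definition in Note 8 can be illustrated with a small sketch. The function names, scores, and inclusion probabilities below are hypothetical and are not taken from the article or the related study; the example only shows how weights of the form 1/π_i enter a weighted estimate and says nothing about whether weighting removes bias in a growth model.

```python
import numpy as np

def inverse_probability_weights(inclusion_probs):
    """Return weights w_i = 1 / pi_i for inclusion probabilities pi_i."""
    probs = np.asarray(inclusion_probs, dtype=float)
    if np.any(probs <= 0) or np.any(probs > 1):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return 1.0 / probs

def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    values = np.asarray(values, dtype=float)
    return np.sum(weights * values) / np.sum(weights)

# Hypothetical scores and inclusion probabilities for three individuals.
scores = np.array([10.0, 12.0, 20.0])
pi = np.array([0.5, 0.5, 0.25])

w = inverse_probability_weights(pi)   # weights 2, 2, 4
ipw_mean = weighted_mean(scores, w)   # (20 + 24 + 80) / 8 = 15.5
unweighted_mean = scores.mean()       # 14.0
```

The individual with the lowest inclusion probability receives the largest weight, which pulls the weighted mean toward that individual's score.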
