Research Articles

Regression trees for longitudinal data with baseline covariates

Madan Gopal Kundu & Jaroslaw Harezlak
Pages 1-22 | Received 13 Apr 2017, Accepted 28 Oct 2018, Published online: 31 Dec 2018
 

ABSTRACT

Longitudinal changes in a population of interest are often heterogeneous and influenced by a combination of baseline factors. In such cases, classical linear mixed effects models [Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38:963–974.] for the mean structure provide a poor fit to the data. We propose a regression tree methodology for longitudinal data that identifies and characterizes homogeneous subgroups. Currently available regression tree construction methods are either limited to a repeated measures scenario or combine the heterogeneity among subgroups with the random inter-subject variability. We propose a longitudinal classification and regression tree (LongCART) algorithm under a conditional inference framework that overcomes these limitations using a two-step approach. LongCART first selects the partitioning variable via a parameter instability test and then finds the optimal split for the selected partitioning variable. Thus, at each node, the decision to split further is type I error controlled, which guards against variable selection bias, over-fitting and spurious splitting. We obtained asymptotic results for the proposed instability test and examined its finite sample behavior through simulation studies. The comparative performance of the LongCART algorithm was evaluated empirically via simulation studies. Finally, we applied LongCART to study the longitudinal changes in choline levels among HIV-positive patients.
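The two-step idea described in the abstract (first test each baseline covariate for instability, then search for a cut point only on the selected covariate) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: per-subject least-squares slopes stand in for a mixed-model fit, a CUSUM-type statistic on those slopes (compared against the supremum of a Brownian bridge) stands in for the paper's parameter instability test, a Bonferroni adjustment stands in for the paper's error control, and a within-child sum of squares stands in for its split criterion. All function names and the simulated data are hypothetical, and covariates are assumed continuous.

```python
import numpy as np

def subject_slopes(ids, time, y):
    """Per-subject OLS slope of y on time (a stand-in for a mixed-model fit)."""
    subjects = np.unique(ids)
    slopes = np.array([np.polyfit(time[ids == s], y[ids == s], 1)[0]
                       for s in subjects])
    return subjects, slopes

def sup_bridge_pvalue(x, terms=100):
    """P(sup |B0(t)| > x) for a Brownian bridge B0, via the Kolmogorov series."""
    if x < 0.2:
        return 1.0  # tail probability is numerically 1 here; series converges slowly
    k = np.arange(1, terms + 1)
    p = 2.0 * np.sum((-1.0) ** (k + 1) * np.exp(-2.0 * k**2 * x**2))
    return float(min(max(p, 0.0), 1.0))

def instability_pvalue(slopes, covariate):
    """CUSUM-type stand-in test: are slopes stable when ordered by the covariate?"""
    order = np.argsort(covariate, kind="stable")
    centered = slopes[order] - slopes.mean()
    scale = slopes.std(ddof=1) * np.sqrt(len(slopes))
    stat = np.max(np.abs(np.cumsum(centered))) / scale
    return sup_bridge_pvalue(stat)

def best_split(slopes, covariate):
    """Cut point minimizing the summed within-child squared error of the slopes."""
    values = np.unique(covariate)
    cuts = (values[:-1] + values[1:]) / 2.0

    def sse(v):
        return np.sum((v - v.mean()) ** 2)

    losses = [sse(slopes[covariate <= c]) + sse(slopes[covariate > c])
              for c in cuts]
    return float(cuts[int(np.argmin(losses))])

def select_split(slopes, covariates, alpha=0.05):
    """Step 1: choose the covariate by smallest p-value. Step 2: find its cut."""
    pvals = {name: instability_pvalue(slopes, x) for name, x in covariates.items()}
    name = min(pvals, key=pvals.get)
    adjusted = min(1.0, pvals[name] * len(pvals))  # Bonferroni across covariates
    if adjusted >= alpha:
        return None  # no significant instability: do not split this node
    return name, best_split(slopes, covariates[name]), adjusted

def run_demo(seed=0, n=200):
    """Simulated data: the slope changes sign at x1 = 0.5; x2 is pure noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(5.0)
    x1, x2 = rng.uniform(size=n), rng.uniform(size=n)
    true_slope = np.where(x1 <= 0.5, 1.0, -1.0)
    ids = np.repeat(np.arange(n), len(t))
    time = np.tile(t, n)
    y = true_slope[ids] * time + rng.normal(scale=0.2, size=n * len(t))
    _, slopes = subject_slopes(ids, time, y)
    return select_split(slopes, {"x1": x1, "x2": x2})

if __name__ == "__main__":
    print(run_demo())
```

Because the cut-point search runs only after a covariate has passed the test, a covariate with many possible cut points gains no advantage at the selection step, which is the intuition behind the guard against variable selection bias.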

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the National Institute of Mental Health [R01MH108467].

Notes on contributors

Madan Gopal Kundu

Madan Gopal Kundu is a Manager in Data and Statistical Sciences (DSS) at AbbVie in Chicago, IL, USA.

Jaroslaw Harezlak

Jaroslaw Harezlak is a Professor at the Department of Epidemiology and Biostatistics, Indiana University School of Public Health, Bloomington, IN, USA.
