Abstract
It has been repeatedly shown that in case–control association studies, analysis of a secondary trait that ignores the original sampling scheme can produce highly biased risk estimates. Although a number of approaches have been proposed to properly analyze secondary traits, most of them fail to reproduce the marginal logistic model assumed for the original case–control trait and/or do not allow for interaction between the secondary trait and the genotype marker on primary disease risk. In addition, the flexible handling of covariates remains challenging. We present a general retrospective likelihood framework to perform association testing for both binary and continuous secondary traits, which respects marginal models and incorporates the interaction term. We provide a computational algorithm, based on a reparameterized approximate profile likelihood, for obtaining the maximum likelihood (ML) estimate and its standard error for the genetic effect on secondary traits, in the presence of covariates. For completeness, we also present an alternative pseudo-likelihood method for handling covariates. We describe extensive simulations to evaluate the performance of the ML estimator in comparison with the pseudo-likelihood and other competing methods. Supplementary materials for this article are available online.
Acknowledgments
The authors appreciate the insightful and constructive comments made by the two anonymous reviewers. The authors thank Dr. Ravi Varadhan for helpful suggestions regarding numerical optimization in R. Part of Dr. Ghosh's research was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics of the National Cancer Institute. Additional support was provided by NIH grants R01GM074175, P01CA142538, and P30ES010126.