Abstract
When large amounts of survival data arrive in streams, conventional estimation methods become computationally infeasible because they require access to all observations at each accumulation point. We develop online updating methods for survival analysis under the Cox proportional hazards model; the methods also accommodate time-dependent covariates. Specifically, we propose online-updating estimators, together with their standard errors, for both the regression coefficients and the baseline hazard function. Extensive simulation studies investigate the empirical performance of the proposed estimators. A large colon cancer dataset from the Surveillance, Epidemiology, and End Results program and a large venture capital dataset with time-dependent covariates are analyzed to demonstrate the utility of the proposed methods. Supplemental files for this article are available online.
Supplementary Materials
This zip file contains our supplementary materials, which include the following:
Code:
updatesurvival: R package
README.pdf: README file with instructions on running the R package
testcode.R: a sample script
datatest1.txt: a simulated data set used in simulation study I
Additional figures: additional figures under each of the following approaches:
fixed partition and no bias correction approach (Figures S1-S3)
fixed partition and bias correction approach (Figures S4-S6)
adaptive partition and no bias correction approach (Figures S7-S9)
adaptive partition and bias correction approach (Figures S10-S12)
Acknowledgments
We would like to thank the editor, the associate editor, and the two anonymous reviewers for their very helpful comments and suggestions, which have led to a much improved version of the article.