Abstract
Chronic kidney disease (CKD) affects many lives and places a large burden on health systems around the world. To better understand and predict costs for insurance plan members with CKD in the United States, we built a new model of their individual costs. Our model is the first to explicitly model both the CKD stage transition process and the distribution of costs given those stages. Additionally, it incorporates numerous covariates and comorbidities. We applied the models to two large and rich datasets, one from commercial insurance and the other from Medicare fee-for-service, totaling about 40 million beneficiary months. We found that XGBoost models best predict both stage transitions and costs. If XGBoost models are unavailable, the best alternative for predicting a person's health care costs in the next month is a regularized multivariate logistic regression model of the stage combined with a logit-gamma model of the costs given the stage.
Acknowledgments
The authors appreciate Rob Bachler, Deana Bell, Lisa Charron, Gabriela Dieguez, Leah Engel, Mike Hamachek, Austin Levenson, and Karen Schenkenfelder for their work and feedback on the project. The authors are also grateful to the project oversight group (Ken Avner, Joan Barrett, Scott Kelly, Daniel Kurowski, Rhyxian Lin, George Omondi, Rebecca Owen, and Alex Ryu) and Achilles Natsis and Erika Schulty from the Society of Actuaries for their support of this project. Finally, the authors thank the anonymous referees and editors for their thoughtful comments and feedback.