Research Article

Enhancing the Human Health Status Prediction: The ATHLOS Project

Pages 834-856 | Received 11 Feb 2021, Accepted 20 May 2021, Published online: 17 Jun 2021

References

  • Andrews, G., F. Cheok, and S. Carr. 1989. The Australian longitudinal study of ageing. Australian Journal on Ageing 8 (2):31–35. doi:10.1111/j.1741-6612.1989.tb00756.x.
  • Andrews, G., and M. J. Clark. 1999. The International Year of Older Persons: Putting aging and research onto the political agenda. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences 54 (1): P7–P10. doi:10.1093/geronb/54B.1.P7.
  • Arokiasamy, P., D. Bloom, J. Lee, K. Feeney, and M. Ozolins. 2012. Longitudinal aging study in India: Vision, design, implementation, and preliminary findings. In Aging in Asia: Findings From New and Emerging Data Initiatives, ed. J. P. Smith and M. Majmundar. National Research Council (US) Panel on Policy Research and Data Needs to Meet the Challenge of Aging in Asia. Washington (DC): National Academies Press (US).
  • Azur, M. J., E. A. Stuart, C. Frangakis, and P. J. Leaf. 2011. Multiple imputation by chained equations: What is it and how does it work? International Journal of Methods in Psychiatric Research 20 (1):40–49. doi:10.1002/mpr.329.
  • Börsch-Supan, A., M. Brandt, C. Hunkler, T. Kneip, J. Korbmacher, F. Malter, B. Schaan, S. Stuck, and S. Zuber. 2013. Data resource profile: The Survey of Health, Ageing and Retirement in Europe (SHARE). International Journal of Epidemiology 42 (4):992–1001. doi:10.1093/ije/dyt088.
  • Buuren, S. van, and K. Groothuis-Oudshoorn. 2010. mice: Multivariate imputation by chained equations in R. Journal of Statistical Software. In press:1–68.
  • Caballero, F. F., G. Soulis, W. Engchuan, A. Sánchez-Niubó, H. Arndt, J. L. Ayuso-Mateos, J. M. Haro, S. Chatterji, and D. B. Panagiotakos. 2017. Advanced analytical methodologies for measuring healthy ageing and its determinants, using factor analysis and machine learning techniques: The ATHLOS project. Scientific Reports 7 (1):43955. doi:10.1038/srep43955.
  • Cahan, E. M., T. Hernandez-Boussard, S. Thadaney-Israni, and D. L. Rubin. 2019. Putting the data before the algorithm in big data addressing personalized healthcare. NPJ Digital Medicine 2 (1):1–6. doi:10.1038/s41746-019-0157-2.
  • Canela, M. Á., I. Alegre, and A. Ibarra. 2019. Dummy Variables. In: Quantitative Methods for Management, 57–63. Springer, Cham. doi: 10.1007/978-3-030-17554-2_6.
  • Chai, T., and R. R. Draxler. 2014. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature. Geoscientific Model Development 7 (3):1247–50. doi:10.5194/gmd-7-1247-2014.
  • Chen, R., L. Sang, M. Jiang, Z. Yang, N. Jia, F. Wanyi, J. Xie, W. Guan, W. Liang, Z. Ni, et al. 2020. Longitudinal hematologic and immunologic variations associated with the progression of COVID-19 patients in China. Journal of Allergy and Clinical Immunology 146 (1):89–100. doi:10.1016/j.jaci.2020.05.003.
  • Chen, T., and C. Guestrin. 2016. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, New York, NY, USA, 785–794.
  • Dash, S., S. K. Shakyawar, M. Sharma, and S. Kaushik. 2019. Big data in healthcare: Management, analysis and future prospects. Journal of Big Data 6 (1):1–25. doi:10.1186/s40537-019-0217-0.
  • DeSilva, A. P., M. Moreno-Betancur, A. M. De Livera, K. J. Lee, and J. A. Simpson. 2017. A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a time-varying covariate with a non-linear association with time: A simulation study. BMC Medical Research Methodology 17 (1):1–11. doi:10.1186/s12874-017-0372-y.
  • DeSilva, A. P., M. Moreno-Betancur, A. M. De Livera, K. J. Lee, and J. A. Simpson. 2019. Multiple imputation methods for handling missing values in a longitudinal categorical variable with restrictions on transitions over time: A simulation study. BMC Medical Research Methodology 19 (1):1–14. doi:10.1186/s12874-018-0650-3.
  • Dimopoulos, A. C., M. Nikolaidou, F. F. Caballero, W. Engchuan, A. Sanchez-Niubo, H. Arndt, J. L. Ayuso-Mateos, J. M. Haro, S. Chatterji, E. N. Georgousopoulou, et al. 2018. Machine learning methodologies versus cardiovascular risk scores, in predicting disease risk. BMC Medical Research Methodology 18 (1):1–11. doi:10.1186/s12874-018-0644-1.
  • Eckle, K., and J. Schmidt-Hieber. 2019. A comparison of deep networks with ReLU activation function and linear spline-type methods. Neural Networks 110:232–42. doi:10.1016/j.neunet.2018.11.005.
  • Eekhout, I., H. CW de Vet, J. W. R. Twisk, J. P. L. Brand, R. D. B. Michiel, and M. W. Heymans. 2014. Missing data in a multi-item instrument were best handled by multiple imputation at the item score level. Journal of Clinical Epidemiology 67 (3):335–42. doi:10.1016/j.jclinepi.2013.09.009.
  • Elmessiry, A., W. O. Cooper, T. F. Catron, J. Karrass, Z. Zhang, and M. P. Singh. 2017. Triaging patient complaints: Monte Carlo cross-validation of six machine learning classifiers. JMIR Medical Informatics 5 (3):e19. doi:10.2196/medinform.7140.
  • Flitcroft, L., W. S. Chen, and D. Meyer. 2020. The Demographic Representativeness and Health Outcomes of Digital Health Station Users: Longitudinal Study. Journal of Medical Internet Research 22 (6):e14977. doi:10.2196/14977.
  • Hood, L. 2017. P4 medicine and scientific wellness: Catalyzing a revolution in 21st century medicine. Molecular Frontiers Journal 1 (2):132–37. doi:10.1142/S2529732517400156.
  • Hosmer Jr, D. W., S. Lemeshow, and R. X. Sturdivant. 2013. Applied logistic regression. Vol. 398. John Wiley & Sons.
  • Huque, M. H., J. B. Carlin, J. A. Simpson, and K. J. Lee. 2018. A comparison of multiple imputation methods for missing data in longitudinal studies. BMC Medical Research Methodology 18 (1):168. doi:10.1186/s12874-018-0615-6.
  • Ichimura, H., S. Shimizutani, and H. Hashimoto. 2009. JSTAR first results 2009 report. Technical Report. Research Institute of Economy, Trade and Industry (RIETI).
  • Jian, S., N. Su-Mei, C. Xue, Z. Jie, and W. Xue-sen. 2017. Association and interaction between triglyceride–glucose index and obesity on risk of hypertension in middle-aged and elderly adults. Clinical and Experimental Hypertension 39 (8):732–39. doi:10.1080/10641963.2017.1324477.
  • Jolani, S., L. E. Frank, and S. van Buuren. 2014. Dual imputation model for incomplete longitudinal data. British Journal of Mathematical and Statistical Psychology 67 (2):197–212. doi:10.1111/bmsp.12021.
  • Jordanov, I., N. Petrov, and A. Petrozziello. 2018. Classifiers accuracy improvement based on missing data imputation. Journal of Artificial Intelligence and Soft Computing Research 8 (1):31–48. doi:10.1515/jaiscr-2018-0002.
  • Kelly, D., K. Curran, and B. Caulfield. 2017. Automatic prediction of health status using smartphone-derived behavior profiles. IEEE Journal of Biomedical and Health Informatics 21 (6):1750–60. doi:10.1109/JBHI.2017.2649602.
  • Koskinen, S. 2018. Health 2000 and 2011 Surveys—THL Biobank. National Institute for Health and Welfare. https://thl.fi/fi/web/thl-biobank/for-researchers/sample-collections/health-2000-and-2011-surveys. [Online; accessed 18-July-2008].
  • Kowal, P., S. Chatterji, N. Naidoo, R. Biritwum, W. Fan, R. L. Ridaura, T. Maximova, P. Arokiasamy, N. Phaswana-Mafuya, S. Williams, et al. 2012. Data resource profile: The World Health Organization Study on global AGEing and adult health (SAGE). International Journal of Epidemiology 41 (6):1639–49. doi:10.1093/ije/dys210.
  • Kramer, O. 2013. K-nearest neighbors. In Dimensionality reduction with unsupervised nearest neighbors. Intelligent Systems Reference Library, vol. 51. Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-38652-7_2.
  • Leonardi, M., S. Chatterji, S. Koskinen, J. L. Ayuso-Mateos, J. M. Haro, G. Frisoni, L. Frattura, A. Martinuzzi, B. Tobiasz-Adamczyk, M. Gmurek, et al. 2014. Determinants of health and disability in ageing population: The COURAGE in Europe Project (collaborative research on ageing in Europe). Clinical Psychology & Psychotherapy 21 (3):193–98. doi:10.1002/cpp.1856.
  • Licher, S., M. J. G. Leening, P. Yilmaz, F. J. Wolters, J. Heeringa, P. J. E. Bindels, Alzheimer’s Disease Neuroimaging Initiative, M. W. Vernooij, B. C. Stephan, E. W. Steyerberg, et al. 2019. Development and validation of a dementia risk prediction model in the general population: An analysis of three longitudinal studies. American Journal of Psychiatry 176 (7):543–51. doi:10.1176/appi.ajp.2018.18050566.
  • Lin, W.-C., and C.-F. Tsai. 2020. Missing value imputation: A review and analysis of the literature (2006–2017). Artificial Intelligence Review 53 (2):1487–509. doi:10.1007/s10462-019-09709-4.
  • Liu, Y.-M. 2020. Population Aging, Technological Innovation, and the Growth of Health Expenditure: Evidence From Patients With Type 2 Diabetes in Taiwan. Value in Health Regional Issues 21:120–26. doi:10.1016/j.vhri.2019.07.012.
  • Lucas, C., P. Wong, J. Klein, T. B. R. Castro, J. Silva, M. Sundaram, M. K. Ellingson, T. Mao, J. E. Oh, B. Israelow, et al. 2020. Longitudinal analyses reveal immunological misfiring in severe COVID-19. Nature 584 (7821):463–69. doi:10.1038/s41586-020-2588-y.
  • Luszcz, M. A., L. C. Giles, K. J. Anstey, C. B.-Y. Kathryn, R. A. Walker, and T. D. Windsor. 2016. Cohort Profile: The Australian Longitudinal Study of Ageing (ALSA). International Journal of Epidemiology 45 (4):1054–63. doi:10.1093/ije/dyu196.
  • Marois, G., and A. Aktas. 2020. Projecting health trajectories in Europe using microsimulation. IIASA Working Paper. Laxenburg, Austria: WP-20-004.
  • Miles, J. 2014. R squared, adjusted R squared. Wiley StatsRef: Statistics Reference Online. doi:10.1002/9781118445112.stat06627.
  • Montgomery, D. C., E. A. Peck, and G. G. Vining. 2021. Introduction to linear regression analysis. John Wiley & Sons, Inc.
  • Morgan, A. E., and M. T. Mc Auley. 2020. Cholesterol Homeostasis: An In Silico Investigation into How Aging Disrupts Its Key Hepatic Regulatory Mechanisms. Biology 9 (10):314.
  • Nooraee, N., G. Molenberghs, J. Ormel, R. Edwin, and V. D. Heuvel. 2018. Strategies for handling missing data in longitudinal studies with questionnaires. Journal of Statistical Computation and Simulation 88 (17):3415–36. doi:10.1080/00949655.2018.1520854.
  • Panaretos, D., E. Koloverou, A. C. Dimopoulos, G.-M. Kouli, M. Vamvakari, G. Tzavelas, C. Pitsavos, and D. B. Panagiotakos. 2018. A comparison of statistical and machine-learning techniques in evaluating the association between dietary patterns and 10-year cardiometabolic risk (2002–2012): The ATTICA study. British Journal of Nutrition 120 (3):326–34. doi:10.1017/S0007114518001150.
  • Park, J. H., S. Lim, J. Lim, K. Kim, M. Han, I. Y. Yoon, J. Kim, Y. Chang, C. B. Chang, H. J. Chin, et al. 2007. An overview of the Korean longitudinal study on health and aging. Psychiatry Investigation 4 (2):84.
  • Passarino, G., F. De Rango, and A. Montesanto. 2016. Human longevity: Genetics or Lifestyle? It takes two to tango. Immunity & Ageing 13 (1):12. doi:10.1186/s12979-016-0066-z.
  • Peasey, A., M. Bobak, R. Kubinova, S. Malyutina, A. Pajak, A. Tamosiunas, H. Pikhart, A. Nicholson, and M. Marmot. 2006. Determinants of cardiovascular disease and other non-communicable diseases in Central and Eastern Europe: Rationale and design of the HAPIEE study. BMC Public Health 6 (1):255. doi:10.1186/1471-2458-6-255.
  • Pedersen, A. B., E. M. Mikkelsen, D. Cronin-Fenton, N. R. Kristensen, T. M. Pham, L. Pedersen, and I. Petersen. 2017. Missing data and multiple imputation in clinical epidemiological research. Clinical Epidemiology 9:157. doi:10.2147/CLEP.S129785.
  • Pierce, M., H. Hope, T. Ford, S. Hatch, M. Hotopf, A. John, E. Kontopantelis, R. Webb, S. Wessely, S. McManus, et al. 2020. Mental health before and during the COVID-19 pandemic: A longitudinal probability sample survey of the UK population. The Lancet Psychiatry 7 (10):883–92. doi:10.1016/S2215-0366(20)30308-4.
  • Prina, A. M., D. Acosta, I. Acosta, M. Guerra, Y. Huang, A. T. Jotheeswaran, I. Z. Jimenez-Velazquez, Z. Liu, J. J. Llibre Rodriguez, A. Salas, et al. 2016. Cohort profile: The 10/66 study. International Journal of Epidemiology 46 (2):406–406i.
  • Ribeiro, C., and A. A. Freitas. 2021. A data-driven missing value imputation approach for longitudinal datasets. Artificial Intelligence Review: 1–31. doi:10.1007/s10462-021-09963-5.
  • Rights, J. D., and S. K. Sterba. 2020. New recommendations on the use of R-squared differences in multilevel model comparisons. Multivariate Behavioral Research 55 (4):568–99. doi:10.1080/00273171.2019.1660605.
  • Rodríguez-Artalejo, F., A. Graciani, P. Guallar-Castillón, M. L. Luz, M. C. Zuluaga, E. López-García, J. L. Gutiérrez-Fisac, J. M. Taboada, M. T. Aguilera, E. Regidor, et al. 2011. Rationale and methods of the study on nutrition and cardiovascular risk in Spain (ENRICA). Revista Española De Cardiología (English Edition) 64 (10):876–82. doi:10.1016/j.rec.2011.05.023.
  • Sanchez-Niubo, A., L. Egea-Cortés, B. Olaya, F. F. Caballero, L. A.-M. Jose, M. Prina, M. Bobak, H. Arndt, B. Tobiasz-Adamczyk, A. Pająk, et al. 2019. Cohort profile: The Ageing trajectories of health–longitudinal opportunities and synergies (ATHLOS) project. International Journal of Epidemiology 48 (4):1052–1053i. doi:10.1093/ije/dyz077.
  • Singh, S., and J. Prasad. 2013. Estimation of missing values in the data mining and comparison of imputation methods. Mathematical Journal of Interdisciplinary Sciences 1 (2):75–90. doi:10.15415/mjis.2013.12015.
  • Slawski, M. 2018. On principal components regression, random projections, and column subsampling. Electronic Journal of Statistics 12 (2):3673–712. doi:10.1214/18-EJS1486.
  • Sonnega, A., J. D. Faul, M. B. Ofstedal, K. M. Langa, J. W. R. Phillips, and D. R. Weir. 2014. Cohort profile: The health and retirement study (HRS). International Journal of Epidemiology 43 (2):576–85. doi:10.1093/ije/dyu067.
  • Stavseth, M. R., T. Clausen, and J. Røislien. 2019. How handling missing data may impact conclusions: A comparison of six different imputation methods for categorical questionnaire data. SAGE Open Medicine 7:2050312118822912. doi:10.1177/2050312118822912.
  • Steptoe, A., E. Breeze, J. Banks, and J. Nazroo. 2013. Cohort profile: The English longitudinal study of ageing. International Journal of Epidemiology 42 (6):1640–48. doi:10.1093/ije/dys168.
  • Tai, A. M. Y., A. Albuquerque, N. E. Carmona, M. Subramanieapillai, D. S. Cha, M. Sheko, Y. Lee, R. Mansur, and R. S. McIntyre. 2019. Machine learning and big data: Implications for disease modeling and therapeutic discovery in psychiatry. Artificial Intelligence in Medicine 99:101704. doi:10.1016/j.artmed.2019.101704.
  • Vilardell, M., M. Buxó, R. Clèries, J. M. Martínez, G. Garcia, A. Ameijide, R. Font, S. Civit, R. Marcos-Gragera, M. L. Vilardell, and M. Carulla. 2020. Missing data imputation and synthetic data simulation through Modeling Graphical Probabilistic Dependencies between Variables (ModGraProDep): An application to breast cancer survival. Artificial Intelligence in Medicine 107:101875. doi:10.1016/j.artmed.2020.101875.
  • Wang, C., R. Pan, X. Wan, Y. Tan, X. Linkang, S. M. Roger, F. N. Choo, B. Tran, R. Ho, V. K. Sharma, et al. 2020a. A longitudinal study on the mental health of general population during the COVID-19 epidemic in China. Brain, Behavior, and Immunity 87:40–48. doi:10.1016/j.bbi.2020.04.028.
  • Wang, Y., C. Dong, H. Yue, L. Chungao, Q. Ren, X. Zhang, H. Shi, and M. Zhou. 2020b. Temporal changes of CT findings in 90 patients with COVID-19 pneumonia: A longitudinal study. Radiology 296 (2):E55–E64. doi:10.1148/radiol.2020200843.
  • Whelan, B. J., and G. M. Savva. 2013. Design and methodology of the Irish Longitudinal Study on Ageing. Journal of the American Geriatrics Society 61:S265–S268. doi:10.1111/jgs.12199.
  • Wiens, J., and E. S. Shenoy. 2018. Machine learning for healthcare: On the verge of a major shift in healthcare epidemiology. Clinical Infectious Diseases 66 (1):149–53. doi:10.1093/cid/cix731.
  • Wong, R., A. Michaels-Obregon, and A. Palloni. 2017. Cohort profile: The Mexican health and aging study (MHAS). International Journal of Epidemiology 46 (2):e2–e2. doi:10.1093/ije/dyu263.
  • Yamaguchi, Y., M. Ueno, K. Maruo, and M. Gosho. 2020. Multiple imputation for longitudinal data in the presence of heteroscedasticity between treatment groups. Journal of Biopharmaceutical Statistics 30 (1):178–96. doi:10.1080/10543406.2019.1632878.
  • Zhang, Y., G. Golm, and G. Liu. 2020. A likelihood-based approach for the analysis of longitudinal clinical trials with return-to-baseline imputation. Statistics in Biosciences 12 (1):23–36. doi:10.1007/s12561-020-09269-0.
  • Zhang, Z. 2016. Introduction to machine learning: K-nearest neighbors. Annals of Translational Medicine 4:11. doi:10.21037/atm.2016.03.37.
  • Zhao, J., Q. Feng, P. Wu, R. A. Lupu, R. A. Wilke, Q. S. Wells, J. C. Denny, and W.-Q. Wei. 2019. Learning from longitudinal data in electronic health record and genetic data to improve cardiovascular event prediction. Scientific Reports 9 (1):1–10. doi:10.1038/s41598-018-37186-2.
  • Zhou, L., S. Pan, J. Wang, and A. V. Vasilakos. 2017. Machine learning on big data: Opportunities and challenges. Neurocomputing 237:350–61. doi:10.1016/j.neucom.2017.01.026.
  • Zumel, N., and J. Mount. 2016. vtreat: A data.frame Processor for Predictive Modeling. arXiv preprint arXiv:1611.09477v9.
  • Zumel, N., J. Mount, and J. Porzak. 2014. Practical data science with R. Shelter Island, NY, USA: Manning.
