Dialogues in Clinical Neuroscience Volume 18, 2016 - Issue 3

Open access

Brief Report

Transforming big data into computational models for personalized medicine and health care

La transformación de los macrodatos en modelos computacionales para la medicina personalizada y la atención en salud

Transformer les bases de données en modèles Informatiques pour la médecine personnalisée et les soins de santé

S. M. Reza Soroushmehr, Emergency Medicine Department, University of Michigan, Ann Arbor, Michigan, USA; University of Michigan Center for Integrative Research in Critical Care (MCIRCC), University of Michigan, Ann Arbor, Michigan, USA; Department of Computational Medicine and Bio-informatics, University of Michigan, Ann Arbor, Michigan, USA. Correspondence: [email protected]

Kayvan Najarian, Emergency Medicine Department, University of Michigan, Ann Arbor, Michigan, USA; University of Michigan Center for Integrative Research in Critical Care (MCIRCC), University of Michigan, Ann Arbor, Michigan, USA; Department of Computational Medicine and Bio-informatics, University of Michigan, Ann Arbor, Michigan, USA

Pages 339-343 | Published online: 01 Apr 2022

https://doi.org/10.31887/DCNS.2016.18.3/ssoroushmehr

Abstract

Health care systems generate a huge volume of different types of data. Due to the complexity and challenges inherent in studying medical information, it is not yet possible to create a comprehensive model capable of considering all the aspects of health care systems. There are different points of view regarding what the most efficient approaches toward utilization of this data would be. In this paper, we describe the potential role of big data approaches in improving health care systems and review the most common challenges facing the utilization of health care big data.

Los sistemas de atención de salud generan un enorme volumen de distintos tipos de datos. Debido a la complejidad y desafíos inherentes al estudio de la información médica, todavía no es posible crear un modelo comprensible capaz de incluir todos los aspectos de los sistemas de atención en salud. Existen diferentes puntos de vista acerca de cuáles serían las aproximaciones más eficientes para la utilización de esta información. En este artículo se describe el papel potencial de las aproximaciones de los macrodatos para mejorar los sistemas de atención de salud y se revisan los desafíos más comunes que enfrenta la utilización de los macrodatos en la atención de salud.

Les systèmes de santé génèrent un volume énorme de différents types de données. En raison de la complexité et des difficultés liées à l'étude des informations médicales, il n'est pas encore possible de créer un modèle complet prenant en compte tous les aspects des systèmes de santé. Les points de vue diffèrent sur les façons les plus efficaces d'utiliser ces données. Dans cet article, nous décrivons leur rôle potentiel dans l'amélioration des systèmes de santé et nous analysons les difficultés les plus courantes liées à l'utilisation des données de santé.

Keywords: big data; challenges; computational method; health care system; personalized medicine

Introduction

The term “big data” is increasingly used for the analysis of very large amounts of information. Big data, including medical data, is commonly characterized by volume, variety, velocity, and veracity. Volume refers to the size of the data, variety to the different types and sources of data, velocity to the speed of data generation, and veracity to the quality of data or the uncertainty introduced by factors such as noise, artifacts, and missing data. In the health care system, a variety of resources provide data that could be useful for many applications: randomized controlled clinical trials, wearable devices (eg, clothing and accessories incorporating sensors that measure activity or parameters such as blood pressure), video streams (eg, a video-based system for detecting fall events in elderly persons living alone at home), personal genomic services, imaging devices, and social media or Internet searches.^1 Such applications include drug and medical device safety surveillance, quality of care and performance measurement, making of diagnoses and prediction of prognosis, population management, decision support and precision medicine, and public health and research applications.^{2,3}

Over the last decade, medical researchers have taken the heterogeneity of data into account in their work: the genetics of subjects have been studied as a function of epistasis, and family history and personal life events have been used to predict clinical evolution. Big data technology should expand this field of multivariate research and overcome the inability of existing approaches to gather, share, and use information comprehensively within the health care system.^2 To utilize health care big data, research groups and organizations have designed and implemented many frameworks and methods. One of the most established frameworks is Hadoop, which supports the analysis of large data sets. It has been used to implement various applications, such as disease prediction in patients, diagnosis of cancer, patient emergency alerts, generation of disease decision rules, medical data quality assessment, and personalized recommendation systems.^{4-10}
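As a rough illustration of the map/reduce programming pattern that Hadoop is built around, the following single-machine Python sketch counts how often each diagnosis code appears in a set of records. It does not use Hadoop itself, and it is not taken from any of the cited applications; the record layout, the codes, and the function names (`map_record`, `shuffle`, `reduce_counts`, `count_diagnoses`) are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy records: (patient_id, diagnosis_code). Not real patient data.
RECORDS = [
    ("p1", "I21"),
    ("p2", "E11"),
    ("p3", "I21"),
    ("p4", "J45"),
    ("p5", "E11"),
    ("p6", "I21"),
]


def map_record(record):
    """Map step: emit one (key, 1) pair for the record's diagnosis code."""
    _patient_id, code = record
    return [(code, 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_diagnoses(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    print(count_diagnoses(RECORDS))
```

In a real cluster the map and reduce steps would run in parallel on different nodes over partitions of the data; here they run sequentially only to show the shape of the computation.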

In precision medicine, a patient's unique characteristics are used to tailor treatment in a manner that might be more elaborate than the standard course. For example, cardiologists currently use an algorithm that, for a given patient, predicts the occurrence of a myocardial infarction within 5 or 10 years based on body weight, arterial pressure, smoking status, blood lipid analysis results, and personal and family cardiovascular history. Precision medicine can be used in the diagnosis and prevention of diseases such as cancer, owing to advances in next-generation sequencing (NGS), liquid biopsy technology, computational biology methods, high-throughput functional screening, and analytical approaches.^11

In the abovementioned domains, big data mining techniques have led to interesting results; for example, performance with such techniques can be comparable to that of medical experts. It will be interesting to follow studies that compare the efficiency of these mining techniques with usual clinical management.

In this article, we briefly review data analysis methods for health care systems and examine challenges facing the utilization of this data.

Computational approaches toward personalized medicine

Although the concept of personalized medicine is not new, the emergence of powerful analytical tools has recently opened new avenues to predictive, preventive, participatory, and personalized medicine, known as P4 medicine.^12 The hope is to reduce cost and improve the quality of care. Personalized medicine was involved in more than 25% of the novel drugs approved by the US Food and Drug Administration (FDA) in 2015,^13 which shows that personalized medicine is becoming a substantial component of treatment products.

Research groups have investigated different aspects of personalized medicine, such as diagnosis, prognosis, and pharmacogenomics, through computational approaches or through improving and revising standards and regulations. Many of these efforts, such as the “Baseline Study” project by Google Inc., the Cancer Genome Atlas, and the 100 000 Genomes Project (100KGP), focus on high-throughput genomic analysis and aim to achieve personalized health care by developing computational methods.^{11,14,15} Genomic mutations can be exploited in the development of drugs that target a protein to treat disease.

By analyzing large amounts of data, Forkan et al showed that there is a trend or pattern in each individual patient's data.^16 A use case in this model was used to identify the true abnormal conditions of patients with variations in blood pressure and heart rate. Vidyasagar reviewed machine learning techniques for predicting a drug response and found that there are biomarkers, even some without biological significance, that could predict a drug response.^17 Krishnan and Westhead, in a study of the application of machine learning and probabilistic approaches to the prediction of functional effects of single-nucleotide polymorphisms (SNPs), found that machine learning methods could outperform probabilistic methods.^18 An integration of clinical variables such as race (white vs nonwhite), intensive care unit (ICU) type (medical vs surgical), sex, and age has been used in developing multivariate logistic regression models to estimate a personalized initial dose of heparin.^19 Using these models, investigators observed statistically significant associations between sub- and supratherapeutic activated partial thromboplastin time (aPTT), the aforementioned clinical variables, heparin dose, and sequential organ failure assessment (SOFA) scores, with areas under the curve (AUC; also called area under a receiver operating characteristic [ROC] curve, a two-dimensional depiction of classifier performance) of 0.78 and 0.79, respectively.
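To make the AUC figure more concrete, the following Python sketch computes the area under the ROC curve using its pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. This is a generic illustration, not the procedure used in the heparin study; the function name `roc_auc` and the toy labels and scores are assumptions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels (1 = positive).
    scores: iterable of classifier scores, higher meaning "more positive".
    Returns the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values of 0.78 and 0.79 indicate useful but imperfect discrimination. The pairwise loop is quadratic in the number of samples; a rank-based formulation would be preferable for large data sets.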

None of the state-of-the-art big data-driven approaches has reported an accuracy (the ratio between correctly identified or classified samples and the total number of samples) of 100%. This is probably due to challenges such as missing data, the quality of data, and variations in experimental results, which are addressed in the next section.
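The definition of accuracy given above can be written as a short Python function. This is a minimal sketch; the function name `accuracy` and the example labels are illustrative assumptions.

```python
def accuracy(true_labels, predicted_labels):
    """Ratio of correctly classified samples to the total number of samples."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("accuracy is undefined for an empty sample set")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


if __name__ == "__main__":
    # Hypothetical example: 3 of 4 samples are classified correctly.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```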

Challenges

Besides general challenges inherent to the analysis of big data (such as missing data, erroneous or imprecise data, and heterogeneous data), employing big data in health care systems imposes new challenges, including the lack of reliability and repeatability of some (but by no means all) biological data; issues of privacy, ownership (ie, determining the owner or owners of data), and confidentiality; inadequate data from randomized controlled clinical trials; and low quality of data in general.^{1,17,18} To address the technical challenges, such as missing data and imprecise data, statistical as well as machine learning methods have been investigated.^{20-26} However, there is no unique solution to these problems; as with other approaches, the efficacy of statistical and machine learning methods needs to be proven for each new medical application.
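As a simple example of a statistical approach to missing data, the following Python sketch fills each missing value with the mean of the observed values in the same column. Mean imputation is used here only because it is easy to follow; it is not one of the specific methods in the cited studies, which are considerably more sophisticated. The function name `impute_column_means`, the use of `None` to mark missing entries, and the toy measurements are assumptions.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    rows: list of equal-length lists of numbers, with None marking missing data.
    Returns a new list of rows; the input is not modified.
    """
    if not rows:
        return []

    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values to average")
        means.append(sum(observed) / len(observed))

    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


if __name__ == "__main__":
    # Hypothetical toy data: columns are systolic and diastolic blood pressure.
    measurements = [
        [120.0, 70.0],
        [None, 80.0],
        [140.0, None],
    ]
    print(impute_column_means(measurements))
```

Mean imputation preserves column means but shrinks variance and ignores relationships between variables, which is one reason the efficacy of any imputation method has to be checked for each medical application.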

Another challenge is disparity in ethnic and socioeconomic status, which results in inequalities in health care; indeed, utilization of “omic” technologies is costly and might not be affordable for resource-poor populations. Integrating molecular pathology, epidemiology, and social sciences could be a strategy to explore health disparities linked to social environments.^27 However, such future studies will influence global health only if their results are reflected in political and economic decisions.

To develop disease-specific models applicable to personalizing therapeutic interventions, we need to incorporate biomarkers (indicators of normal biological processes, pathogenic processes, or pharmacological responses to therapeutic intervention^12) from DNA sequencing and improve the quality of data. However, in some diseases, such as cancer, cell heterogeneity within a single tumor makes detection of low-level mutations difficult, and a chemotherapy selected on the basis of specific genetic characteristics of that patient's cancer might be impractical.^28 To reveal a correlation between results of DNA studies and disease type, more samples from different cells at different locations would be required, a procedure with low feasibility.^28

Another challenge is the lack of knowledge about the human system. From a big data perspective, the understanding of how each part of this system functions needs to be converted into computational models and then integrated with other models of the human body. Understanding the biological networks and molecular processes, and thus the treatment outcome, in neuropsychiatric disorders has been severely hampered by limited access to the brain. Major big data projects such as BRAIN (Brain Research through Advancing Innovative Neurotechnologies), HBP (Human Brain Project), and TVB (The Virtual Brain)^10 have been undertaken to enable investigators to fully understand the activity and connectivity of neuronal systems. However, these projects are far from complete, and various aspects of brain functionality may remain unresolved. For instance, understanding placebo effects at the psychological level, as well as in terms of neuroimaging and neurobiological or physiological changes, is an ongoing field of research.

Discussion and conclusion

With technological advances, different research groups and organizations are generating and using increasingly complex and diverse data sets in health care systems. However, as the human system is very complex, a comprehensive model is required in order to achieve P4 medicine. To develop such a model, new sensors, methods, platforms, and unique biomarkers for diagnosis and therapeutic outcome prediction are required.^29 There is still a need for devices and sensors able to provide good-quality reports of relevant information on patient health. For instance, no thoroughly validated device for measuring cardiac output is currently available.^30 To design a personalized model applicable to P4 medicine, more investment is required toward understanding the human body and relevant correlations so that it can be described with computational models. Moreover, in order to design an accurate model, more studies are warranted to investigate the influence on health of parameters such as environmental factors, family history, and lifestyle. However, this might be particularly challenging in the fields of neurology and psychiatry.

Acknowledgements

The authors would like to thank Craig Biwer and Samuel Habbo-Gavin for their valuable comments.

REFERENCES

1. Alemayehu D, Berger M. Big data: transforming drug development and health policy decision making. Health Serv Outcomes Res Method. 2016 Mar 5. Epub ahead of print. doi:10.1007/s10742-016-0144-x
2. Belle A, Thiagarajan R, Soroushmehr SM, Navidi F, Beard D, Najarian K. Big data analytics in healthcare. Biomed Res Int. 2015;2015:370194. doi:10.1155/2015/370194
3. Rumsfeld J, Joynt K, Maddox T. Big data analytics to improve cardiovascular care: promise and challenges. Nat Rev Cardiol. 2016;13(6):350-359. PMID: 27009423
4. Kuo MH, Chrimes D, Moa B, Hu W. Design and construction of a big data analytics framework for health applications. IEEE/ACM Trans Comput Biol Bioinform. 2016;13(3):549-556. PMID: 27295638
5. Istephan S, Siadat MR. Unstructured medical image query using big data - an epilepsy case study. J Biomed Inform. 2016;59:218-226. PMID: 26707450
6. Bonner S, McGough AS, Kureshi I, et al. Data quality assessment and anomaly detection via map/reduce and linked data: a case study in the medical domain. Paper presented at: 2015 IEEE International Conference on Big Data (Big Data); October 29-November 1, 2015; Santa Clara, CA, USA.
7. Lee B, Jeong E. A design of a Patient-customized healthcare system based on the Hadoop with text mining (PHSHT) for an efficient disease management and prediction. Int J Software Eng Applications. 2014;8(8):131-150.
8. Zhang S, Dong Y, Chen X, Wang S. Personalized recommendation system on Hadoop and HBase. In: Chen W, Yin G, Zhao G, eds. Big Data Technology and Applications. Singapore; 2016:34-45. Communications in Computer and Information Science; vol 590.
9. Chennamsetty H, Chalasani S, Riley D. Predictive analytics on electronic health records (EHRs) using Hadoop and Hive. Paper presented at: 2015 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT); March 5-7, 2015; Coimbatore, India.
10. Falcon MI, Jirsa V, Solodkin A. A new neuroinformatics approach to personalized medicine in neurology. Curr Opin Neurol. 2016;29(4):429-436. PMID: 27224088
11. Kensler T, Spira A, Garber J, et al. Transforming cancer prevention through precision medicine and immune-oncology. Cancer Prev Res (Phila). 2016;9(1):2-10. PMID: 26744449
12. Hood L, Friend S. Predictive, personalized, preventive, participatory (P4) cancer medicine. Nat Rev Clin Oncol. 2011;8(3):184-187. PMID: 21364692
13. Nice EC. From proteomics to personalized medicine: the road ahead. Expert Rev Proteomics. 2016;13(4):341-343. PMID: 26905403
14. Ibrahim R, Pasic M, Yousef GM. Omics for personalized medicine: defining the current we swim in. Expert Rev Mol Diagn. 2016;16(7):719-722. PMID: 26959799
15. Vicini P, Fields O, Lai E, et al. Precision medicine in the age of big data: the present and future role of large-scale unbiased sequencing in drug discovery and development. Clin Pharmacol Ther. 2015;99(2):198-207. PMID: 26536838
16. Forkan A, Khalil I, Ibaida A, Tari Z. BDCaM: big data for context-aware monitoring - a personalized knowledge discovery framework for assisted healthcare. IEEE Trans Cloud Comput. 2015;99:1.
17. Vidyasagar M. Identifying predictive features in drug response using machine learning: opportunities and challenges. Annu Rev Pharmacol Toxicol. 2015;55:15-34. PMID: 25423479
18. Krishnan V, Westhead D. A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics. 2003;19(17):2199-2209. PMID: 14630648
19. Ghassemi M, Richter S, Eche I, Chen T, Danziger J, Celi L. A data-driven approach to optimized medication dosing: a focus on heparin. Intensive Care Med. 2014;40(9):1332-1339. PMID: 25091788
20. Wang Y, Chen R, Ghosh J, et al. Rubik: knowledge guided tensor factorization and completion for health data analytics. Proc 21th ACM SIGKDD Intl Conference Knowledge Discovery Data Mining; Sydney, Australia; KDD'15. 2015:1265-1274.
21. Zhang Z, Fang H, Wang H. Multiple imputation based clustering validation (MIV) for big longitudinal trial data with missing values in eHealth. J Med Syst. 2016;40(6):146. PMID: 27126063
22. Özdemir V, Dove E, Gürsoy U, et al. Personalized medicine beyond genomics: alternative futures in big data - proteomics, environtome and the social proteome. J Neural Transm (Vienna). 2015 Dec 8. Epub ahead of print. doi:10.1007/s00702-015-1489-y
23. Priya M, Kumar PR. A novel intelligent approach for predicting atherosclerotic individuals from big data for healthcare. Int J Production Res. 2015;53(24):7517-7532.
24. Lange K, Papp JC, Sinsheimer JS, Sobel EM. Next-generation statistical genetics: modeling, penalization, and optimization in high-dimensional data. Annu Rev Stat Appl. 2014;1(1):279-300.
25. Mardani M, Mateos G, Giannakis GB. Subspace learning and imputation for streaming big data matrices and tensors. IEEE Trans Signal Process. 2015;63(10):2663-2677.
26. Jerez JM, Molina I, Garcia-Laencina PJ, et al. Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artif Intell Med. 2010;50(2):105-115. PMID: 20638252
27. Nishi A, Milner D, Giovannucci E, et al. Integration of molecular pathology, epidemiology and social science for global precision medicine. Expert Rev Mol Diagn. 2015;16(1):11-23. PMID: 26636627
28. Kruglyak KM, Lin E, Ong FS. Next-generation sequencing and applications to the diagnosis and treatment of lung cancer. Exp Med Biol. 2016;890:123-136.
29. Byrling J, Andersson B, Marko-Varga G, Andersson R. Cholangiocarcinoma - current classification and challenges towards personalised medicine. Scand J Gastroenterol. 2016;51(6):641-643. PMID: 26806118
30. Johnson A, Ghassemi M, Nemati S, Niehaus K, Clifton D, Clifford G. Machine learning and decision support in critical care. Proc IEEE. 2016;104(2):444-466.
