
An unsupervised embedding harmonization system for privacy-preserving data mining in healthcare


