A review of Automatic end-to-end De-Identification: Is High Accuracy the Only Metric?

References

  • Shweta, A. Kumar, A. Ekbal, S. Saha, and P. Bhattacharyya. 2016. A recurrent neural network architecture for de-identifying clinical records. Proceedings of the 13th Intl. Conference on Natural Language Processing, Varanasi, India, 188–97.
  • Brasher, E. A. 2018. Addressing the failure of anonymization: Guidance from the European Union’s general data protection regulation. Columbia Business Law Review 209, 2018.
  • Bui, D. D. A., M. Wyatt, and J. J. Cimino. 2017a. The UAB informatics institute and 2016 CEGS N-GRID de-identification shared task challenge. Journal of Biomedical Informatics 75:S54–S61. doi:10.1016/j.jbi.2017.05.001.
  • Bui, D. D. A., M. Wyatt, and J. J. Cimino. 2017b. The UAB informatics institute and 2016 CEGS N-GRID de-identification shared task challenge. Journal of Biomedical Informatics 75:S54–S61. doi:10.1016/j.jbi.2017.05.001.
  • Chen, T., R. M. Cullen, and M. Godwin. 2015. Hidden Markov model using Dirichlet process for de-identification. Journal of Biomedical Informatics 58:S60–S66. doi:10.1016/j.jbi.2015.09.004.
  • Dalianis, H. 2018. Clinical text mining: Secondary use of electronic patient records. Switzerland: Springer.
  • Dehghan, A., A. Kovacevic, G. Karystianis, J. A. Keane, and G. Nenadic. 2015. Combining knowledge- and data-driven methods for de-identification of clinical narratives. Journal of Biomedical Informatics 58 (Supplement):S53–S59. doi:10.1016/j.jbi.2015.06.029.
  • Dernoncourt, F., J. Y. Lee, and P. Szolovits. 2017. NeuroNER: An easy-to-use program for named-entity recognition based on neural networks. CoRR abs/1705.05487.
  • Dernoncourt, F., J. Y. Lee, O. Uzuner, and P. Szolovits. 2017. De-identification of patient notes with recurrent neural networks. Journal of the American Medical Informatics Association 24 (3):596–606. doi:10.1093/jamia/ocw156.
  • El Emam, K., S. Jabbouri, S. Sams, Y. Drouet, and M. Power. 2006. Evaluating common de-identification heuristics for personal health information. Journal of Medical Internet Research 8 (4):e28. doi:10.2196/jmir.8.4.e28.
  • Gabriel, R. A., S. Shenoy, T.-T. Kuo, J. McAuley, and C.-N. Hsu. 2018. The presence of highly similar notes within the MIMIC-III dataset. University of California, San Diego.
  • Garfinkel, S. 2015. De-identification of personally identifiable information technical report. National Institute of Standards and Technology (NIST), U.S. Department of Commerce.
  • Gkoulalas-Divanis, A., G. Loukides, and J. Sun. 2014. Publishing data from electronic health records while preserving privacy: A survey of algorithms. Journal of Biomedical Informatics 50:4–19. doi:10.1016/j.jbi.2014.06.002.
  • Goldberg, Y. 2017. Neural network methods for natural language processing. Synthesis Lectures on Human Language Technologies 10 (1):1–309. doi:10.2200/S00762ED1V01Y201703HLT037.
  • Goldberger, A. L., L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley. 2000. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 101 (23):e215–e220. doi:10.1161/01.CIR.101.23.e215.
  • Guinney, J., and J. Saez-Rodriguez. 2018. Alternative models for sharing confidential biomedical data. Nature Biotechnology 36 (5):391. doi:10.1038/nbt.4128.
  • He, B., Y. Guan, J. Cheng, K. Cen, and W. Hua. 2015. CRFs based de-identification of medical records. Journal of Biomedical Informatics 58 (Supplement):S39–S46. doi:10.1016/j.jbi.2015.08.012.
  • Health & Disability Commissioner. 2009. The code of rights, health & disability commissioner. Accessed December 09, 2017. http://www.hdc.org.nz/the-act--code/the-code-of-rights.
  • Jiang, Z., C. Zhao, B. He, Y. Guan, and J. Jiang. 2017. De-identification of medical records using conditional random fields and long short-term memory networks. Journal of Biomedical Informatics 75:S43–S53. doi:10.1016/j.jbi.2017.10.003.
  • Johnson, A. E., T. J. Pollard, L. Shen, L.-W. H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. A. Celi, and R. G. Mark. 2016. MIMIC-III, a freely accessible critical care database. Scientific Data 3:160035. doi:10.1038/sdata.2016.35.
  • Johnson, K. W., J. K. De Freitas, B. S. Glicksberg, J. R. Bobe, and J. T. Dudley. 2018. Evaluation of patient re-identification using laboratory test orders and mitigation via latent space variables. Biocomputing 2019, 415–26.
  • Kumar, V., A. Stubbs, S. Shaw, and Ö. Uzuner. 2015. Creation of a new longitudinal corpus of clinical narratives. Journal of Biomedical Informatics 58 (Supplement):S6–S10. doi:10.1016/j.jbi.2015.09.018.
  • Lee, H.-J., Y. Wu, Y. Zhang, J. Xu, H. Xu, and K. Roberts. 2017. A hybrid approach to automatic de-identification of psychiatric notes. Journal of Biomedical Informatics 75:S19–S27. doi:10.1016/j.jbi.2017.06.006.
  • Lee, J. Y., F. Dernoncourt, and P. Szolovits. 2018. Transfer learning for named-entity recognition with neural networks. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
  • Lee, J. Y., F. Dernoncourt, O. Uzuner, and P. Szolovits. 2016. Feature-augmented neural networks for patient note de-identification. Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP), Japan, 17–22.
  • Li, X.-B., and J. Qin. 2017. Anonymizing and sharing medical text records. Information Systems Research 28 (2):332–52. doi:10.1287/isre.2016.0676.
  • Liu, Z., Y. Chen, B. Tang, X. Wang, Q. Chen, H. Li, J. Wang, Q. Deng, and S. Zhu. 2015. Automatic de-identification of electronic medical records using token-level and character-level conditional random fields. Journal of Biomedical Informatics 58 (Supplement):S47–S52. doi:10.1016/j.jbi.2015.06.009.
  • Liu, Z., B. Tang, X. Wang, and Q. Chen. 2017. De-identification of clinical notes via recurrent neural network and conditional random field. Journal of Biomedical Informatics 75:S34–S42. doi:10.1016/j.jbi.2017.05.023.
  • Mayo, M., and V. Yogarajan. 2019. A nearest neighbour-based analysis to identify patients from continuous glucose monitor data. In Asian Conference on Intelligent Information and Database Systems, Indonesia, 349–60. Cham: Springer.
  • Menger, V., F. Scheepers, L. M. van Wijk, and M. Spruit. 2018. DEDUCE: A pattern matching method for automatic de-identification of Dutch medical text. Telematics and Informatics 35 (4):727–36. doi:10.1016/j.tele.2017.08.002.
  • Office of the Privacy Commissioner. 2013. Health information privacy code 1994. Accessed December 09, 2017. https://www.privacy.org.nz/the-privacy-act-and-codes/codes-of-practice/health-information-privacy-code-1994/.
  • Pantazos, K., S. Lauesen, and S. Lippert. 2017. Preserving medical correctness, readability and consistency in de-identified health records. Health Informatics Journal 23 (4):291–303. doi:10.1177/1460458216647760.
  • Phuong, N. D., and V. T. N. Chau. 2016. Automatic de-identification of medical records with a multilevel hybrid semi-supervised learning approach. 2016 IEEE RIVF International Conference on Computing & Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), Vietnam, 43–48.
  • Polonetsky, J., O. Tene, and K. Finch. 2016. Shades of gray: Seeing the full spectrum of practical data de-identification. Santa Clara Law Review 56:593.
  • Ragupathy, R., and V. Yogarajan. 2018. Applying the reason model to enhance health record research in the age of ‘big data’. The New Zealand Medical Journal 131 (1478):65–67.
  • Statistics New Zealand. 2016. Integrated data infrastructure. In Secondary integrated data infrastructure. https://www.stats.govt.nz/integrated-data/integrated-data-infrastructure/
  • Stubbs, A., M. Filannino, and Ö. Uzuner. 2017. De-identification of psychiatric intake records: Overview of 2016 CEGS N-GRID shared tasks Track 1. Journal of Biomedical Informatics 75:S4–S18. doi:10.1016/j.jbi.2017.06.011.
  • Stubbs, A., C. Kotfila, and Ö. Uzuner. 2015. Automated systems for the de-identification of longitudinal clinical narratives: Overview of 2014 i2b2/UTHealth shared task track 1. Journal of Biomedical Informatics 58 (Supplement):S11–S19. doi:10.1016/j.jbi.2015.06.007.
  • Stubbs, A., C. Kotfila, H. Xu, and Ö. Uzuner. 2015. Identifying risk factors for heart disease over time: Overview of 2014 i2b2/UTHealth shared task track 2. Journal of Biomedical Informatics 58 (Supplement):S67–S77. doi:10.1016/j.jbi.2015.07.001.
  • Stubbs, A., and Ö. Uzuner. 2015a. Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus. Journal of Biomedical Informatics 58 (Supplement):S20–S29. doi:10.1016/j.jbi.2015.07.020.
  • Stubbs, A., and Ö. Uzuner. 2015b. Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus. Journal of Biomedical Informatics 58:S20–S29. doi:10.1016/j.jbi.2015.07.020.
  • Stubbs, A., and Ö. Uzuner. 2015c. Annotating risk factors for heart disease in clinical narratives for diabetic patients. Journal of Biomedical Informatics 58 (Supplement):S78–S91. doi:10.1016/j.jbi.2015.05.009.
  • Stubbs, A., and Ö. Uzuner. 2017. De-identification of medical records through annotation. In Nancy Ide and James Pustejovsky (Eds.) Handbook of linguistic annotation, 1433–59. Netherlands: Springer.
  • Stubbs, A., Ö. Uzuner, C. Kotfila, I. Goldstein, and P. Szolovits. 2015a. Challenges in synthesizing surrogate PHI in narrative EMRs, 717–35. Cham: Springer International Publishing.
  • Stubbs, A., Ö. Uzuner, C. Kotfila, I. Goldstein, and P. Szolovits. 2015b. Challenges in synthesizing surrogate PHI in narrative EMRs. In Aris Gkoulalas-Divanis and Grigorios Loukides (Eds.) Medical data privacy handbook, 717–35. Switzerland: Springer.
  • Sweeney, L. 2002. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10 (05):557–70. doi:10.1142/S0218488502001648.
  • Uzuner, Ö., Y. Luo, and P. Szolovits. 2007. Evaluating the state-of-the-art in automatic de-identification. Journal of the American Medical Informatics Association 14 (5):550–63. doi:10.1197/jamia.M2444.
  • Uzuner, Ö., and A. Stubbs. 2015. Practical applications for natural language processing in clinical research: The 2014 i2b2/UTHealth shared tasks. Journal of Biomedical Informatics 58 (Supplement):S1. doi:10.1016/j.jbi.2015.10.007.
  • Uzuner, Ö., A. Stubbs, and M. Filannino. 2017. A natural language processing challenge for clinical records: Research domains criteria (RDoC) for psychiatry. Journal of Biomedical Informatics 75:S1–S3. doi:10.1016/j.jbi.2017.10.005.
  • Vepakomma, P., O. Gupta, T. Swedish, and R. Raskar. 2018. Split learning for health: Distributed deep learning without sharing raw patient data. arXiv preprint arXiv:1812.00564.
  • Yadav, S., A. Ekbal, S. Saha, P. S. Pathak, and P. Bhattacharyya. 2017. Patient data de-identification: A conditional random-field-based supervised approach. In Snehanshu Saha, Abhyuday Mandal, Anand Narasimhamurthy, V. Sarasvathi, Shivappa Sangam (Eds.), Handbook of research on applied cybernetics and systems science, 234–53. India: IGI Global.
  • Yang, H., and J. M. Garibaldi. 2015. Automatic detection of protected health information from clinic narratives. Journal of Biomedical Informatics 58 (Supplement):S30–S38. doi:10.1016/j.jbi.2015.06.015.
  • Yogarajan, V., M. Mayo, and B. Pfahringer. 2018a. Privacy protection for health information research in New Zealand district health boards. The New Zealand Medical Journal 131 (1485):19–26.
  • Yogarajan, V., M. Mayo, and B. Pfahringer. 2018b. A survey of automatic de-identification of longitudinal clinical narratives. arXiv preprint arXiv:1810.06765.
  • Zhao, Y.-S., K.-L. Zhang, H.-C. Ma, and K. Li. 2018. Leveraging text skeleton for de-identification of electronic medical records. BMC Medical Informatics and Decision Making 18 (1):18. doi:10.1186/s12911-018-0598-6.
