IISE Transactions on Healthcare Systems Engineering Volume 9, 2019 - Issue 2

Semi-automated text mining strategies for identifying rare causes of injuries from emergency room triage data

Gaurav Nanda School of Engineering Education, School of Industrial Engineering, Purdue University, West Lafayette, IN, USA;

Kirsten Vallmuur Institute of Health and Biomedical Innovation and School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia; Jamieson Trauma Institute, Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Queensland Health;

Mark Lehto School of Industrial Engineering, Purdue University, West Lafayette, IN, USA. Correspondence: [email protected]

Pages 157-171 | Published online: 25 Mar 2019

https://doi.org/10.1080/24725579.2019.1567628

References

Aggarwal, C. C., and Zhai, C. (2012) An introduction to text mining. Pp. 1–10 in Mining Text Data, C. C. Aggarwal and C. Zhai (Eds.). Retrieved from http://link.springer.com/chapter/10.1007/978-1-4614-3223-4_1
Batista, G. E. A. P. A., Prati, R. C., and Monard, M. C. (2004) A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 6(1), 20. doi:10.1145/1007730.1007735
Bertke, S. J., Meyers, A. R., Wurzelbacher, S. J., Bell, J., Lampl, M. L., and Robins, D. (2012) Development and evaluation of a Naïve Bayesian model for coding causation of workers’ compensation claims. Journal of Safety Research, 43(5–6), 327–332. doi:10.1016/j.jsr.2012.10.012
Bertke, S. J., Meyers, A. R., Wurzelbacher, S. J., Measure, A., Lampl, M. P., and Robins, D. (2016) Comparison of methods for auto-coding causation of injury narratives. Accident Analysis & Prevention, 88, 117–123. doi:10.1016/j.aap.2015.12.006
Chawla, N. V., Japkowicz, N., and Kotcz, A. (2004) Editorial: Special issue on learning from imbalanced data sets. ACM SIGKDD Explorations Newsletter, 6(1), 1–6. Retrieved from http://dl.acm.org/citation.cfm?id=1007733
Chen, L., Vallmuur, K., and Nayak, R. (2015) Injury narrative text classification using factorization model. BMC Medical Informatics and Decision Making, 15(Suppl 1), S5. doi:10.1186/1472-6947-15-S1-S5
Corns, H. L., Marucci, H. R., and Lehto, M. R. (2007) Development of an approach for optimizing the accuracy of classifying claims narratives using a machine learning tool (TEXTMINER[4]). Pp. 411–416 in Human Interface and the Management of Information. Methods, Techniques and Tools in Information Design. Human Interface 2007. Lecture Notes in Computer Science, M. J. Smith and G. Salvendy (Eds.). Springer. Retrieved from http://link.springer.com/chapter/10.1007/978-3-540-73345-4_47
Hosmer, D. W., Jr. (2004) Applied Logistic Regression. John Wiley & Sons. Retrieved from http://books.google.com/books?id=Po0RLQ7USIMC
Fan, R., Chang, K., Hsieh, C., Wang, X., and Lin, C. (2008) LIBLINEAR: A library for large linear classification. Retrieved from http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.9959
Frank, E., Hall, M., Holmes, G., Kirkby, R., Pfahringer, B., Witten, I. H., and Trigg, L. (2005) Weka. Pp. 1305–1314 in Data Mining and Knowledge Discovery Handbook, O. Maimon and L. Rokach (Eds.). Retrieved from http://link.springer.com/chapter/10.1007/0-387-25465-X_62
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., and Witten, I. H. (2009) The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1), 10–18. doi:10.1145/1656274.1656278
Japkowicz, N. (2000) The class imbalance problem: Significance and strategies. In Proceedings of the International Conference on Artificial Intelligence. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.35.1693&rep=rep1&type=pdf
Lauría, E. J. M., and March, A. D. (2011) Combining Bayesian text classification and shrinkage to automate healthcare coding: A data quality analysis. Journal of Data and Information Quality, 2(3), 13:1–13:22. doi:10.1145/2063504.2063506
Lehto, M., Marucci-Wellman, H., and Corns, H. (2009) Bayesian methods: A useful tool for classifying injury narratives into cause groups. Injury Prevention: Journal of the International Society for Child and Adolescent Injury Prevention, 15(4), 259–265. doi:10.1136/ip.2008.021337
Manning, C. D., Raghavan, P., and Schütze, H. (2008) Introduction to Information Retrieval. Cambridge University Press. Retrieved from http://books.google.com/books?id=t1PoSh4uwVcC
Marucci-Wellman, H. R., Corns, H. L., and Lehto, M. R. (2017) Classifying injury narratives of large administrative databases for surveillance—A practical approach combining machine learning ensembles and human review. Accident Analysis & Prevention, 98, 359–371. doi:10.1016/j.aap.2016.10.014
Marucci-Wellman, H. R., Lehto, M. R., and Corns, H. L. (2015) A practical tool for public health surveillance: Semi-automated coding of short injury narratives from large administrative databases using Naïve Bayes algorithms. Accident Analysis & Prevention, 84, 165–176. doi:10.1016/j.aap.2015.06.014
McKenzie, K., Scott, D. A., Campbell, M. A., and McClure, R. J. (2010) The use of narrative text for injury surveillance research: A systematic review. Accident Analysis & Prevention, 42(2), 354–363. doi:10.1016/j.aap.2009.09.020
Measure, A. C. (2014) Automated Coding of Worker Injury Narratives. Boston, MA: JSM 2014 - Government Statistics Section. Retrieved from http://www.bls.gov/osmr/pdf/st140040.pdf
Nanda, G., Grattan, K. M., Chu, M. T., Davis, L. K., and Lehto, M. R. (2016) Bayesian decision support for coding occupational injury data. Journal of Safety Research, 57, 71–82. doi:10.1016/j.jsr.2016.03.001
Nanda, G., Vallmuur, K., and Lehto, M. R. (2018) Improving autocoding performance of rare categories in injury classification: Is more training data or filtering the solution? Accident Analysis & Prevention, 110, 115–127. doi:10.1016/j.aap.2017.10.020
Ng, A. Y., and Jordan, M. I. (2002) On discriminative vs. generative classifiers: A comparison of Logistic Regression and Naive Bayes. Pp. 841–848 in Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic (NIPS'01), T. G. Dietterich, S. Becker, and Z. Ghahramani (Eds.). Cambridge, MA: MIT Press. Retrieved from http://papers.nips.cc/paper/2020-on-discriminative-vs-generative-classifiers-a-comparison-of-logistic-regression-and-naive-bayes.pdf
Phua, C., Alahakoon, D., and Lee, V. (2004) Minority report in fraud detection. ACM SIGKDD Explorations Newsletter, 6(1), 50. doi:10.1145/1007730.1007738
Prati, R. C., Batista, G. E. A. P. A., and Silva, D. F. (2015) Class imbalance revisited: A new experimental setup to assess the performance of treatment methods. Knowledge and Information Systems, 45(1), 247–270. doi:10.1007/s10115-014-0794-3
Provost, F. (2000) Machine learning from imbalanced data sets 101. Retrieved from http://www.aaai.org/Papers/Workshops/2000/WS-00-05/WS00-05-001.pdf
QISU Guide to Collecting An Accurate Text Description of an Injury Event. (2011). South Brisbane, QLD: Queensland Injury Surveillance Unit.
Queensland Injury Surveillance Unit. (n.d.) Retrieved from http://www.qisu.org.au/ModCoreFrontEnd/index.asp?pageid=109
Rizzo, S. G., Montesi, D., Fabbri, A., and Marchesini, G. (2015) ICD code retrieval: Novel approach for assisted disease classification. Pp. 147–161 in Data Integration in the Life Sciences. DILS 2015. Lecture Notes in Computer Science, N. Ashish and J.-L. Ambite (Eds.). Springer International Publishing. Retrieved from http://link.springer.com/chapter/10.1007/978-3-319-21843-4_12
Smith, G. S., Timmons, R. A., Lombardi, D. A., Mamidi, D. K., Matz, S., Courtney, T. K., and Perry, M. J. (2006) Work-related ladder fall fractures: Identification and diagnosis validation using narrative text. Accident Analysis & Prevention, 38(5), 973–980. doi:10.1016/j.aap.2006.04.008
Sun, A., Lim, E.-P., and Liu, Y. (2009) On strategies for imbalanced text classification using SVM: A comparative study. Decision Support Systems, 48(1), 191–201. doi:10.1016/j.dss.2009.07.011
Vallmuur, K. (2015) Machine learning approaches to analysing textual injury surveillance data: A systematic review. Accident Analysis & Prevention, 79, 41–49. doi:10.1016/j.aap.2015.03.018
Vallmuur, K., Marucci-Wellman, H. R., Taylor, J. A., Lehto, M., Corns, H. L., and Smith, G. S. (2016) Harnessing information from injury narratives in the “big data” era: Understanding and applying machine learning for injury surveillance. Injury Prevention, 22(Suppl 1), i34–i42. doi:10.1136/injuryprev-2015-041813
Van Hulse, J., Khoshgoftaar, T. M., and Napolitano, A. (2007) Experimental Perspectives on Learning from Imbalanced Data (pp. 935–942). ACM, New York, NY. doi:10.1145/1273496.1273614
Wellman, H. M., Lehto, M. R., Sorock, G. S., and Smith, G. S. (2004) Computerized coding of injury narrative data from the National Health Interview Survey. Accident Analysis & Prevention, 36(2), 165–171. doi:10.1016/S0001-4575(02)00146-X
Zhu, X. (2007) Advanced NLP: Text categorization with Logistic Regression. Retrieved from http://pages.cs.wisc.edu/~jerryzhu/cs838/LR.pdf
