References
- BERT. (2023a). BERT: How to handle long documents. Salt Data Labs. Retrieved August 14, 2023, from https://www.saltdatalabs.com/blog/bert-how-to-handle-long-documents?rq=bert
- BERT. (2023b). BERT (language model) - Wikipedia. Retrieved August 14, 2023, from https://en.wikipedia.org/wiki/BERT_(language_model)
- BERT. (2023c). BERT large model (uncased). Hugging Face. Retrieved August 14, 2023, from https://huggingface.co/bert-large-uncased
- Borealis AI. (2023). Tutorial #14: Transformers I: Introduction. Retrieved August 14, 2023, from https://www.borealisai.com/research-blogs/tutorial-14-transformers-i-introduction/#Multiple_heads
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., & Agarwal, S. (2020). Language models are few-shot learners. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems (Vol. 33, pp. 1877–1901). Curran Associates, Inc.
- Brown, S. V., & Tucker, J. W. (2011). Large-sample evidence on firms: Year-over-year MD&A modifications. Journal of Accounting Research, 49(2), 309–346. https://doi.org/10.1111/j.1475-679X.2010.00396.x
- CFA Institute. (2023). Fraud and Deception Detection: Text-Based Analysis | CFA Institute Enterprising Investor. Retrieved August 14, 2023, from https://blogs.cfainstitute.org/investor/2021/02/15/fraud-and-deception-detection-text-based-analysis/
- Craja, P., Kim, A., & Lessmann, S. (2020). Deep learning for detecting financial statement fraud. Decision Support Systems, 139, 113421. ISSN 0167-9236. https://doi.org/10.1016/j.dss.2020.113421
- Dechow, P. M., Ge, W., Larson, C. R., & Sloan, R. G. (2011). Predicting material accounting misstatements. Contemporary Accounting Research, 28(1), 17–82. https://doi.org/10.1111/j.1911-3846.2010.01041.x
- Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1996). Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement actions by the SEC. Contemporary Accounting Research, 13(1), 1–36. https://doi.org/10.1111/j.1911-3846.1996.tb00489.x
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019, June). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Vol. 1, pp. 4171–4186). Association for Computational Linguistics, Minneapolis, Minnesota.
- Durtschi, C., Hillison, W., & Pacini, C. (2004, January). The effective use of Benford’s Law to assist in detecting fraud in accounting data. Journal of Forensic Accounting, 5(1), 17–33.
- EleutherAI. (2023). GPT-Neo 125M model description. Hugging Face. Retrieved August 14, 2023, from https://huggingface.co/EleutherAI/gpt-neo-125m
- FastText. (2023). Word Vectors for 157 Languages: fastText. Retrieved May 9, 2023, from https://fasttext.cc/docs/en/crawl-vectors.html
- FinBERT. (2023). FinBERT: A pretrained BERT model for financial communications. arXiv. Retrieved August 14, 2023, from https://arxiv.org/abs/2006.08097
- GPT. (2023). GPT-3 powers the next generation of apps. Retrieved August 14, 2023, from https://openai.com/blog/gpt-3-apps
- Grave, E., Bojanowski, P., Gupta, P., Joulin, A., & Mikolov, T. (2018, May). Learning word vectors for 157 languages. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
- Graves, A., Fernández, S., & Schmidhuber, J. (2005). Bidirectional LSTM networks for improved phoneme classification and recognition. In W. Duch, J. Kacprzyk, E. Oja, & S. Zadrożny (Eds.), Artificial Neural Networks: Formal Models and Their Applications – ICANN 2005 (pp. 799–804). Springer.
- Graves, A., & Schmidhuber, J. (2005). Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18(5–6), 602–610. ISSN 0893-6080. IJCNN 2005. https://doi.org/10.1016/j.neunet.2005.06.042
- Hajek, P., & Henriques, R. (2017). Mining corporate annual reports for intelligent detection of financial statement fraud: A comparative study of machine learning methods. Knowledge-Based Systems, 128, 139–152. https://doi.org/10.1016/j.knosys.2017.05.001
- HAN implementation. (2023). Text Classification with Hierarchical Attention Network. Retrieved August 14, 2023, from https://humboldt-wi.github.io/blog/research/information_systems_1819/group5_han/
- Hartwig, M., Voss, J. A., & Wallace, D. B. (2015). Detecting lies in the financial industry: A survey of investment professionals’ beliefs. Journal of Behavioral Finance, 16(2), 173–182. https://doi.org/10.1080/15427560.2015.1034862
- Hartwig, M., Voss, J. A., Brimbal, L., & Wallace, D. B. (2017). Investment professionals’ ability to detect deception: Accuracy, bias and metacognitive realism. Journal of Behavioral Finance, 18(1), 1–13. https://doi.org/10.1080/15427560.2017.1276069
- Harvard Law. (2023). The Important Legacy of the Sarbanes Oxley Act. Retrieved May 9, 2023, from https://corpgov.law.harvard.edu/2022/08/30/the-important-legacy-of-the-sarbanes-oxley-act/
- Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2017, April). Bag of tricks for efficient text classification. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain (pp. 427–431). Association for Computational Linguistics.
- Junger, M., Wang, V., & Schlömer, M. (2020, July). Fraud against businesses both online and offline: Crime scripts, business characteristics, efforts, and benefits. Crime Science, 9(1). ISSN 2193-7680. https://doi.org/10.1186/s40163-020-00119-4
- Keras. (2023a). TensorFlow v2.12.0. Retrieved May 9, 2023, from https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text/Tokenizer
- Keras. (2023b). Text Preprocessing - Keras 2.0.5 Documentation. Retrieved May 9, 2023, from https://faroit.com/keras-docs/2.0.5/preprocessing/text/
- Keras. (2023c). Transfer Learning & Fine-Tuning. Retrieved May 9, 2023, from https://keras.io/guides/transfer_learning/
- Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., & Zettlemoyer, L. (2020, July). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 7871–7880). Association for Computational Linguistics, Online.
- Michel, P., Levy, O., & Neubig, G. (2019). Are sixteen heads really better than one? In H. Wallach, H. Larochelle, A. Beygelzimer, F. D Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 32). Curran Associates, Inc.
- Muñoz, E. (2023). Attention is all you need: Discovering the Transformer. Retrieved August 14, 2023, from https://towardsdatascience.com/attention-is-all-you-need-discovering-the-transformer-paper-73e5ff5e0634
- Purda, L., & Skillicorn, D. (2015). Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research, 32(3), 1193–1223. https://doi.org/10.1111/1911-3846.12089
- Ruan, S., Sun, X., Yao, R., Li, W., & Zhang, N. (2021, December). Deep learning based on hierarchical self-attention for finance distress prediction incorporating text. Computational Intelligence and Neuroscience, 2021, 1–11. https://doi.org/10.1155/2021/1165296
- SEC. (2023a). About EDGAR. Retrieved August 18, 2023, from https://www.sec.gov/edgar/about
- SEC. (2023b). A Plain English Handbook. Retrieved August 15, 2023, from https://www.sec.gov/pdf/handbook.pdf
- Skousen, C., & Wright, C. (2006, August). Contemporaneous risk factors and the prediction of financial statement fraud. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.938736
- Stack Overflow. (2023). How does choosing between pre and post zero padding of sequences impact results? Retrieved May 9, 2023, from https://stackoverflow.com/questions/46298793/how-does-choosing-between-pre-and-post-zero-padding-of-sequences-impact-results
- Wikipedia. (2023). EleutherAI - Wikipedia. Retrieved August 14, 2023, from https://en.wikipedia.org/wiki/EleutherAI
- Williams, A., Nangia, N., & Bowman, S. (2018, June). A broad-coverage challenge corpus for sentence understanding through inference. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 1112–1122). Association for Computational Linguistics, New Orleans, Louisiana.
- Yang, Y., Uy, M. C. S., & Huang, A. (2020). FinBERT: A pretrained language model for financial communications. arXiv. https://arxiv.org/abs/2006.08097
- Yang, L., & Zhu, M. (2023). Restatement prediction with detection lag. Retrieved August 17, 2023, from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4045172