Research Article

Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model

Article: 2166719 | Received 26 Jun 2022, Accepted 04 Jan 2023, Published online: 02 Feb 2023

References

  • Abadi, M., A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. 2016. TensorFlow: Large-scale machine learning on heterogeneous systems. arXiv preprint arXiv:1603.04467. Accessed 2015. https://www.tensorflow.org/api_docs/python/tf/compat/v1/keras/layers/CuDNNLST.
  • Agarwal, S., and A. Sureka. 2015. Using KNN and SVM based one-class classifier for detecting online radicalization on Twitter. In International Conference on Distributed Computing and Internet Technology, Cham, 431–442. Springer.
  • Ailem, M., A. Salah, and M. Nadif. 2017. Non-negative matrix factorization meets word embedding. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, 1081–84.
  • Al-Azani, S., and E.S.M. El-Alfy. 2017. Using word embedding and ensemble learning for highly imbalanced data sentiment analysis in short Arabic text. ANT/SEIT 359–66. doi:10.1016/j.procs.2017.05.365.
  • Albadi, N., M. Kurdi, and S. Mishra. 2018. Are they our brothers? Analysis and detection of religious hate speech in the Arabic Twittersphere. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Los Alamitos, CA, USA, 69–76. IEEE.
  • Badjatiya, P., S. Gupta, M. Gupta, and V. Varma. 2017. Deep learning for hate speech detection in tweets. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 759–60.
  • Bengio, Y., R. Ducharme, P. Vincent, and C. Jauvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research 3 (Feb):1137–55.
  • Chen, L., R. Song, M. Liakata, A. Vlachos, S. Seneff, and X. Zhang. 2015. Using word embedding for bio-event extraction. In Proceedings of BioNLP 15, 121–26. Beijing, China: Association for Computational Linguistics, July. https://doi.org/10.18653/v1/W15-3814.
  • Clement, J. 2019. U.S. teens hate speech social media by type 2018 | Statistic, October. https://www.statista.com/statistics/945392/teenagers-who-encounter-hate-speech-online-social-media-usa/.
  • NVIDIA. 2020. NVIDIA cuDNN | NVIDIA Developer. Accessed November 2019. https://developer.nvidia.com/cudnn.
  • Davidson, T., D. Warmsley, M. Macy, and I. Weber. 2017. Automated hate speech detection and the problem of offensive language. In Eleventh International AAAI Conference on Web and Social Media, Canada.
  • Demilie, W. B., and A. Olalekan Salau. 2022. Detection of fake news and hate speech for Ethiopian languages: A systematic review of the approaches. Journal of Big Data 9 (1):1–17.
  • Devlin, J., M.W. Chang, K. Lee, and K. Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  • Djuric, N., J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, and N. Bhamidipati. 2015. Hate speech detection with comment embeddings. In Proceedings of the 24th international conference on world wide web, Florence, Italy, 29–30.
  • Duggan, M. 2017. Online harassment 2017. Pew Research Center: Internet, Science & Tech. Accessed January 25, 2023. https://policycommons.net/artifacts/617798/online-harassment-2017/1598654/.
  • Ferrara, E., W.Q. Wang, O. Varol, A. Flammini, and A. Galstyan. 2016. Predicting online extremism, content adopters, and interaction reciprocity. In International conference on social informatics, Cham, 22–39. Springer.
  • Fortuna, P., and S. Nunes. 2018. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR) 51 (4):1–30.
  • Founta, A. M., C. Djouvas, D. Chatzakou, I. Leontiadis, J. Blackburn, G. Stringhini, A. Vakali, M. Sirivianos, and N. Kourtellis. 2018. Large scale crowdsourcing and characterization of Twitter abusive behavior. In Twelfth International AAAI Conference on Web and Social Media, Stanford, California, USA.
  • Gambäck, B., and U. Kumar Sikdar. 2017. Using convolutional neural networks to classify hate-speech. In Proceedings of the first workshop on abusive language online, Vancouver, BC, Canada, 85–90.
  • Gialampoukidis, I., G. Kalpakis, T. Tsikrika, S. Papadopoulos, S. Vrochidis, and I. Kompatsiaris. 2017. Detection of terrorism-related Twitter communities using centrality scores. In Proceedings of the 2nd International Workshop on Multimedia Forensics and Security, Bucharest, Romania, 21–25.
  • Gibert, O. D., N. Perez, A. García-Pablos, and M. Cuadros. 2018. Hate speech dataset from a white supremacy forum. arXiv preprint arXiv:1809.04444, 11–20.
  • Golbeck, J., Z. Ashktorab, R. O. Banjo, A. Berlinger, S. Bhagwan, C. Buntain, P. Cheakalos, A. A. Geller, R. Kumar Gnanasekaran, R. Rajan Gunasekaran, et al. 2017. A large labeled corpus for online harassment research. In Proceedings of the 2017 ACM on web science conference, Troy, New York, USA, 229–33.
  • Gupta, S., and Z. Waseem. 2017. A comparative study of embeddings methods for hate speech detection from tweets. Association for Computing Machinery.
  • Hartung, M., R. Klinger, F. Schmidtke, and L. Vogel. 2017. Identifying right-wing extremism in German Twitter profiles: A classification approach. In International conference on applications of natural language to information systems, Cham, 320–25. Springer.
  • Hochreiter, S., and J. Schmidhuber. 1997. Long short-term memory. Neural computation 9 (8):1735–80.
  • Hatebase Inc. 2020. HateBase. https://hatebase.org/.
  • Jaki, S., and T. De Smedt. 2019. Right-wing German hate speech on Twitter: Analysis and automatic detection. arXiv preprint arXiv:1910.07518, 31.
  • Anti-Defamation League. 2019. Online hate and harassment: The American experience, Pew Research Center: Internet, Science & Tech. Accessed December 25, 2022. https://policycommons.net/artifacts/617798/online-harassment-2017/1598662/.
  • Lilleberg, J., Y. Zhu, and Y. Zhang. 2015. Support vector machines and word2vec for text classification with semantic features. In 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Beijing, China, 136–40. IEEE. doi:10.1109/ICCI-CC.2015.7259377.
  • Liu, A. 2018. Neural network models for hate speech classification in tweets. PhD diss., Dept. Arts Sci., Harvard, Cambridge, MA, USA. Retrieved from https://dash.harvard.edu/handle/1/38811552.
  • We Are Social Ltd. 2020. Digital 2020. Accessed November 2008. https://wearesocial.com/digital-2020.
  • Magu, R., K. Joshi, and J. Luo. 2017. Detecting the hate code on social media. In Eleventh International AAAI Conference on Web and Social Media, Canada, 11:608–611. doi:10.1609/icwsm.v11i1.14921.
  • Mikolov, T., K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
  • Mozafari, M., R. Farahbakhsh, and N. Crespi. 2020. Hate speech detection and racial bias mitigation in social media based on BERT model. Plos One 15 (8):e0237861.
  • Nobata, C., J. Tetreault, A. Thomas, Y. Mehdad, and Y. Chang. 2016. Abusive language detection in online user content. In Proceedings of the 25th international conference on world wide web, Montreal, Quebec, Canada, 145–53.
  • Pelicon, A., M. Martinc, and P. Kralj Novak. 2019. Embeddia at SemEval-2019 Task 6: Detecting hate with neural network and transfer learning approaches. In Proceedings of the 13th international workshop on semantic evaluation, Minneapolis, Minnesota, USA, 604–10.
  • Pennington, J., R. Socher, and C. D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14), 1532–43.
  • Peters, M. E., M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
  • Pitsilis, G. K., H. Ramampiaro, and H. Langseth. 2018. Effective hate-speech detection in Twitter data using recurrent neural networks. Applied Intelligence 48 (12):4730–42.
  • Rajpurkar, P., J. Zhang, K. Lopyrev, and P. Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2383–2392.
  • Ribeiro, M. H., P. H. Calais, Y. A. Santos, V. A. Almeida, and W. Meira Jr. 2017. “Like sheep among wolves”: Characterizing hateful users on Twitter. arXiv preprint arXiv:1801.00317.
  • Ribeiro, M. T., S. Singh, and C. Guestrin. 2016. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, California, USA, 1135–44.
  • Sienčnik, S. K. 2015. Adapting word2vec to named entity recognition. In Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015), Vilnius, Lithuania, 239–43.
  • De Smedt, T., G. De Pauw, and P. Van Ostaeyen. 2018. Automatic detection of online jihadist hate speech. arXiv preprint arXiv:1803.04596, 31. Computational Linguistics & Psycholinguistics.
  • Socher, R., Y. Bengio, and C. D. Manning. 2012. Deep learning for NLP (without magic). In Tutorial Abstracts of ACL 2012, 5. Jeju Island, Korea: Association for Computational Linguistics.
  • Tang, D., F. Wei, N. Yang, M. Zhou, T. Liu, and B. Qin. 2014. Learning sentiment specific word embedding for Twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, Maryland, USA, 1555–65.
  • Wang, J., L.-C. Yu, K. R. Lai, and X. Zhang. 2016. Dimensional sentiment analysis using a regional CNN-LSTM model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Berlin, Germany, 225–30.
  • Wang, A., A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman. 2018. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 353–355.
  • Wang, P., Y. Qian, F. K. Soong, H. Lei, and H. Zhao. 2015. Part-of-speech tagging with bidirectional long short-term memory recurrent neural network. arXiv preprint arXiv:1510.06168.
  • Waseem, Z. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the first workshop on NLP and computational social science, Austin, 138–42.
  • Waseem, Z., and D. Hovy. 2016. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of the NAACL student research workshop, San Diego, California, 88–93.
  • Weber, A. 2009. Manual on hate speech. Strasbourg, France: Council Of Europe.
  • Wei, Y., L. Singh, and S. Martin. 2016. Identification of extremism on Twitter. In 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), Davis, California, 1251–55. IEEE.
  • Williams, A., N. Nangia, and S. R. Bowman. 2017. A broad-coverage challenge corpus for sentence understanding through inference. arXiv preprint arXiv:1704.05426, 1:1112–1122.
  • Wu, Y., J. Xu, Y. Zhang, and H. Xu. 2015. Clinical abbreviation disambiguation using neural word embeddings. In Proceedings of BioNLP 15, Beijing, China, 171–76. doi:10.18653/v1/W15-3822.
  • Zhang, Z., D. Robinson, and J. Tepper. 2018. Detecting hate speech on Twitter using a convolution-GRU based deep neural network. In European semantic web conference, 745–60. Cham: Springer.
  • Zhu, J., Z. Tian, and S. Kübler. 2019. UM-IU@LING at SemEval-2019 Task 6: Identifying offensive tweets using BERT and SVMs. In Proceedings of the 13th International Workshop on Semantic Evaluation, 788–795. Minneapolis, Minnesota, USA: Association for Computational Linguistics.