Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model

Hind Saleh: (a) Computer Science Department, University of Tabuk, Tabuk, The Kingdom of Saudi Arabia; (b) Computer Science Department, King Abdulaziz University, Jeddah, Makkah, The Kingdom of Saudi Arabia. Correspondence: [email protected]

https://orcid.org/0000-0002-8208-3510

Areej Alhothali: (b) Computer Science Department, King Abdulaziz University, Jeddah, Makkah, The Kingdom of Saudi Arabia

Kawthar Moria: (b) Computer Science Department, King Abdulaziz University, Jeddah, Makkah, The Kingdom of Saudi Arabia

Article: 2166719 | Received 26 Jun 2022, Accepted 04 Jan 2023, Published online: 02 Feb 2023

https://doi.org/10.1080/08839514.2023.2166719


References

Abadi, M., A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, et al. 2016. TensorFlow: Large-scale machine learning on heterogeneous systems. arXiv preprint arXiv:1603.04467. Accessed 2015. https://www.tensorflow.org/api_docs/python/tf/compat/v1/keras/layers/CuDNNLST.
Agarwal, S., and A. Sureka. 2015. Using knn and svm based one-class classifier for detecting online radicalization on twitter. In International Conference on Distributed Computing and Internet Technology, Cham, 431–405. Springer.
Ailem, M., A. Salah, and M. Nadif. 2017. Non-negative matrix factorization meets word embedding. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, 1081–84.
Al-Azani, S., and E.S.M. El-Alfy. 2017. Using word embedding and ensemble learning for highly imbalanced data sentiment analysis in short Arabic text. ANT/SEIT 359–66. doi:10.1016/j.procs.2017.05.365.
Albadi, N., M. Kurdi, and S. Mishra. 2018. Are they our brothers? Analysis and detection of religious hate speech in the Arabic Twittersphere. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Los Alamitos, CA, USA, 69–76. IEEE.
Badjatiya, P., S. Gupta, M. Gupta, and V. Varma. 2017. Deep learning for hate speech detection in tweets. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 759–60.
Bengio, Y., R. Ducharme, P. Vincent, and C. Jauvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research 3 (Feb):1137–55.
Chen, L., R. Song, M. Liakata, A. Vlachos, S. Seneff, and X. Zhang. 2015. Using word embedding for bio-event extraction. In Proceedings of BioNLP 15, 121–26. Beijing, China: Association for Computational Linguistics, July. https://doi.org/10.18653/v1/W15-3814.
Clement, J. 2019. U.S. teens hate speech social media by type 2018 l Statistic, October. https://www.statista.com/statistics/945392/teenagers-who-encounter-hate-speech-online-social-media-usa/.
CUDA®, N. V. I. D. I. A. 2020. NVIDIA cuDNN | NVIDIA Developer, Accessed November 2019. https://developer.nvidia.com/cudnn.
Davidson, T., D. Warmsley, M. Macy, and I. Weber. 2017. “Automated hate speech detection and the problem of offensive language.” In Eleventh international aaai conference on web and social media, Canada.
Demilie, W. B., and A. Olalekan Salau. 2022. Detection of fake news and hate speech for Ethiopian languages: A systematic review of the approaches. Journal of Big Data 9 (1):1–17.
Devlin, J., M.W. Chang, K. Lee, and K. Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 11:512–515.
Djuric, N., J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, and N. Bhamidipati. 2015. Hate speech detection with comment embeddings. In Proceedings of the 24th international conference on world wide web, Florence, Italy, 29–30.
Duggan, M. 2017. Online harassment 2017. Pew Research Center: Internet, Science & Tech. Accessed January 25, 2023. https://policycommons.net/artifacts/617798/online-harassment-2017/1598654/.
Ferrara, E., W.Q. Wang, O. Varol, A. Flammini, and A. Galstyan. 2016. Predicting online extremism, content adopters, and interaction reciprocity. In International conference on social informatics, Cham, 22–39. Springer.
Fortuna, P., and S. Nunes. 2018. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR) 51 (4):1–30.
Founta, A. M., C. Djouvas, D. Chatzakou, I. Leontiadis, J. Blackburn, G. Stringhini, A. Vakali, M. Sirivianos, and N. Kourtellis. 2018. Large scale crowdsourcing and characterization of twitter abusive behavior. In Twelfth International AAAI Conference on Web and Social Media, Stanford, California, USA.
Gambäck, B., and U. Kumar Sikdar. 2017. Using convolutional neural networks to classify hate-speech. In Proceedings of the first workshop on abusive language online, Vancouver, BC, Canada, 85–90.
Gialampoukidis, I., G. Kalpakis, T. Tsikrika, S. Papadopoulos, S. Vrochidis, and I. Kompatsiaris. 2017. Detection of terrorism-related twitter communities using centrality scores. In Proceedings of the 2nd International Workshop on Multimedia Forensics and Security, Bucharest, Romania, 21–25.
Gibert, O. D., N. Perez, A. García-Pablos, and M. Cuadros. 2018. Hate speech dataset from a white supremacy forum. arXiv preprint arXiv:1809.04444 11–20.
Golbeck, J., Z. Ashktorab, R. O. Banjo, A. Berlinger, S. Bhagwan, C. Buntain, P. Cheakalos, A. A. Geller, R. Kumar Gnanasekaran, R. Rajan Gunasekaran, et al. 2017. A large labeled corpus for online harassment research. In Proceedings of the 2017 ACM on web science conference, Troy, New York, USA, 229–33.
Gupta, S., and Z. Waseem. 2017. A comparative study of embeddings methods for hate speech detection from tweets. Association for Computing Machinery.
Hartung, M., R. Klinger, F. Schmidtke, and L. Vogel. 2017. Identifying right-wing extremism in German Twitter profiles: A classification approach. In International conference on applications of natural language to information systems, Cham, 320–25. Springer.
Hochreiter, S., and J. Schmidhuber. 1997. Long short-term memory. Neural computation 9 (8):1735–80.
Inc, HateBase. 2020. HateBase. https://hatebase.org/.
Jaki, S., and T. De Smedt. 2019. Right-wing German hate speech on Twitter: Analysis and automatic detection. arXiv preprint arXiv:1910.07518, 31.
League, A.D. 2019. Online hate and harassment: The American experience, Pew Research Center: Internet, Science & Tech. Accessed December 25, 2022. https://policycommons.net/artifacts/617798/online-harassment-2017/1598662/.
Lilleberg, J., Y. Zhu, and Y. Zhang. 2015. Support vector machines and word2vec for text classification with semantic features. In 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC), Beijing, China, 136–40. IEEE. doi:10.1109/ICCI-CC.2015.7259377.
Liu, A. 2018. Neural network models for hate speech classification in tweets. PhD diss., Dept. Arts Sci., Harvard, Cambridge, MA, USA. Retrieved from https://dash.harvard.edu/handle/1/38811552.
Ltd, We Are Social. 2020. Digital 2020. Accessed November 2008. https://wearesocial.com/digital-2020.
Magu, R., K. Joshi, and J. Luo. 2017. Detecting the hate code on social media. In Eleventh International AAAI Conference on Web and Social Media, Canada, 11:608–611. doi:10.1609/icwsm.v11i1.14921.
Mikolov, T., K. Chen, G. Corrado, and J. Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 11:608–611.
Mozafari, M., R. Farahbakhsh, and N. Crespi. 2020. Hate speech detection and racial bias mitigation in social media based on BERT model. Plos One 15 (8):e0237861.
Nobata, C., J. Tetreault, A. Thomas, Y. Mehdad, and Y. Chang. 2016. Abusive language detection in online user content. In Proceedings of the 25th international conference on world wide web, Montreal, Quebec, Canada, 145–53.
Pelicon, A., M. Martinc, and P. Kralj Novak. 2019. Embeddia at SemEval-2019 Task 6: Detecting hate with neural network and transfer learning approaches. In Proceedings of the 13th international workshop on semantic evaluation, Minneapolis, Minnesota, USA, 604–10.
Pennington, J., R. Socher, and C. D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14), 1532–43.
Peters, M. E., M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
Pitsilis, G. K., H. Ramampiaro, and H. Langseth. 2018. Effective hate-speech detection in Twitter data using recurrent neural networks. Applied Intelligence 48 (12):4730–42.
Rajpurkar, P., J. Zhang, K. Lopyrev, and P. Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 2383–2392.
Ribeiro, M. H., P. H. Calais, Y. A. Santos, V. A. Almeida, and W. Meira Jr. 2017. “Like sheep among wolves”: Characterizing hateful users on twitter. arXiv preprint arXiv:1801.00317.
Ribeiro, M. T., S. Singh, and C. Guestrin. 2016. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, California, USA, 1135–44.
Sienčnik, S. K. 2015. Adapting word2vec to named entity recognition. In Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015), Vilnius, Lithuania, 239–43.
Smedt, D., G. D. P. Tom, and P. Van Ostaeyen. 2018. Automatic detection of online jihadist hate speech. arXiv preprint arXiv:1803.04596, 31. Computational Linguistics & Psycholinguistics.
Socher, R., Y. Bengio, and C. D. Manning. 2012. Deep learning for NLP (without magic). In Tutorial Abstracts of ACL 2012, 5. Jeju Island, Korea: Association for Computational Linguistics.
Tang, D., F. Wei, N. Yang, M. Zhou, T. Liu, and B. Qin. 2014. Learning sentiment specific word embedding for twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, Maryland, USA, 1555–65.
Wang, J., K. R. L. Liang-Chih Yu, and X. Zhang. 2016. Dimensional sentiment analysis using a regional CNN-LSTM model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Berlin, Germany, 225–30.
Wang, A. S., J. Michael, F. Hill, O. Levy, and S. R. Bowman. 2018. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461 353–355.
Wang, P., Y. Qian, F. K. Soong, H. Lei, and H. Zhao. 2015. Part-of-speech tagging with bidirectional long short-term memory recurrent neural network. arXiv preprint arXiv:1510.06168.
Waseem, Z. 2016. Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter. In Proceedings of the first workshop on NLP and computational social science, Austin, 138–42.
Waseem, Z., and D. Hovy. 2016. Hateful symbols or hateful people? predictive features for hate speech detection on twitter. In Proceedings of the NAACL student research workshop, San Diego, California, 88–93.
Weber, A. 2009. Manual on hate speech. Strasbourg, France: Council Of Europe.
Wei, Y., L. Singh, and S. Martin. 2016. “Identification of extremism on Twitter.” In 2016 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), Davis, California, 1251–55. IEEE.
Williams, A., N. Nangia, and S. R. Bowman. 2017. A broad-coverage challenge corpus for sentence understanding through inference. arXiv preprint arXiv:1704.05426 1:1112–1122.
Yonghui, W., X. Jun, Y. Zhang, and X. Hua. 2015. Clinical abbreviation disambiguation using neural word embeddings. In Proceedings of BioNLP, Beijing, China, 15, 171–76. doi:10.18653/v1/W15-3822.
Zhang, Z., D. Robinson, and J. Tepper. 2018. Detecting hate speech on twitter using a convolution-gru based deep neural network. In European semantic web conference, 745–60. Cham: Springer.
Zhu, J., Z. Tian, and S. Kübler. 2019. UM-IU@LING at SemEval-2019 Task 6: Identifying offensive tweets using BERT and SVMs. Proceedings of the 13th International Workshop on Semantic Evaluation, 788–795. Minneapolis, Minnesota, USA: Association for Computational Linguistics.
