Applied Artificial Intelligence

An International Journal

Volume 34, 2020 - Issue 4

Violence Detection in Videos by Combining 3D Convolutional Neural Networks and Support Vector Machines

Simone Accattoli, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy

Paolo Sernani, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy. Correspondence: [email protected]

Nicola Falcionelli, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy

Dagmawi Neway Mekuria, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy

Aldo Franco Dragoni, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy

Pages 329-344 | Published online: 06 Feb 2020

https://doi.org/10.1080/08839514.2020.1723876

References

Ben Mabrouk, A., and E. Zagrouba. 2017. Spatio-temporal feature using optical flow based distribution for violence detection. Pattern Recognition Letters 92:62–67. doi:10.1016/j.patrec.2017.04.015.
Bengio, Y. 2009. Learning deep architectures for AI. Foundations and Trends® in Machine Learning 2 (1):1–127. doi:10.1561/2200000006.
Bermejo Nievas, E., O. Deniz Suarez, G. Bueno García, and R. Sukthankar. 2011. Violence detection in video using computer vision techniques. In Computer analysis of images and patterns, ed. P. Real, D. Diaz-Pernil, H. Molina-Abril, A. Berciano, and W. Kropatsch, 332–39. Springer Berlin Heidelberg. doi:10.1007/978-3-642-23678-5_39.
Chen, A. T. Y., M. Biglari-Abhari, K. I. K. Wang, A. Bouzerdoum, and F. H. C. Tivive. 2018. Convolutional neural network acceleration with hardware/software co-design. Applied Intelligence 48 (5):1288–301. doi:10.1007/s10489-017-1007-z.
Chen, D., H. Wactlar, M. Chen, C. Gao, A. Bharucha, and A. Hauptmann. 2008. Recognition of aggressive human behavior using binary local motion descriptors. 2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 5238–41. doi:10.1109/IEMBS.2008.4650395.
Chen, L., H. Hsu, L. Wang, and C. Su. 2011. Violence detection in movies. 2011 Eighth International Conference Computer Graphics, Imaging and Visualization, 119–24. doi:10.1109/CGIV.2011.14.
Chen, M. Y., and A. Hauptmann. 2009. MoSIFT: Recognizing human actions in surveillance videos. Tech. Rep. CMU-CS-09-161. Carnegie Mellon University.
De Souza, F. D. M., G. C. Chavez, E. A. Do Valle Jr, and A. de Araujo. 2010. Violence detection in video using spatio-temporal features. 2010 23rd SIBGRAPI Conference on Graphics, Patterns and Images, 224–30. doi:10.1109/SIBGRAPI.2010.38.
Deniz, O., I. Serrano, G. Bueno, and T. Kim. 2014. Fast violence detection in video. 2014 International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, vol. 2, 478–85.
Ding, C., S. Fan, M. Zhu, W. Feng, and B. Jia. 2014. Violence detection in video by using 3D convolutional neural networks. In Advances in visual computing, ed. G. Bebis, R. Boyle, B. Parvin, D. Koracin, R. McMahan, J. Jerald, H. Zhang, S. M. Drucker, C. Kambhamettu, M. El Choubassi, et al., 551–58. Springer International Publishing. doi:10.1007/978-3-319-14364-4_53.
Dong, Z., J. Qin, and Y. Wang. 2016. Multi-stream deep networks for person to person violence detection in videos. In Pattern recognition, ed. T. Tan, X. Li, X. Chen, J. Zhou, J. Yang, and H. Cheng, 517–31. Singapore: Springer Singapore. doi:10.1007/978-981-10-3002-4_43.
Gao, Y., H. Liu, X. Sun, C. Wang, and Y. Liu. 2016. Violence detection using oriented violent flows. Image and Vision Computing 48-49:37–41. doi:10.1016/j.imavis.2016.01.006.
Giannakopoulos, T., D. Kosmopoulos, A. Aristidou, and S. Theodoridis. 2006. Violence content classification using audio features. In Advances in artificial intelligence, ed. G. Antoniou, G. Potamias, C. Spyropoulos, and D. Plexousakis, 502–07. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/11752912_55.
Guo, K., S. Wu, and Y. Xu. 2017. Face recognition using both visible light image and near-infrared image and a deep network. CAAI Transactions on Intelligence Technology 2 (1):39–47. doi:10.1016/j.trit.2017.03.001.
Hassner, T., Y. Itcher, and O. Kliper-Gross. 2012. Violent flows: Real-time detection of violent crowd behavior. 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 1–6. doi:10.1109/CVPRW.2012.6239348.
Ji, S., W. Xu, M. Yang, and K. Yu. 2013. 3D convolutional neural networks for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (1):221–31. doi:10.1109/TPAMI.2012.59.
Jia, Y., E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. 2014. Caffe: Convolutional architecture for fast feature embedding. Proceedings of the 22nd ACM international conference on Multimedia, Orlando, Florida, USA, 675–78. ACM.
Karpathy, A., G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. 2014. Large-scale video classification with convolutional neural networks. 2014 IEEE Conference on Computer Vision and Pattern Recognition, 1725–32. doi:10.1109/CVPR.2014.223.
Lin, J., and W. Wang. 2009. Weakly-supervised violence detection in movies with audio and video based co-training. In Advances in multimedia information processing - PCM 2009, ed. P. Muneesawang, F. Wu, I. Kumazawa, A. Roeksabutr, M. Liao, and X. Tang, 930–35. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-10467-1_84.
Lowe, D. G. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60 (2):91–110. doi:10.1023/B:VISI.0000029664.99615.94.
Marques, J., G. Falcao, and L. A. Alexandre. 2018. Distributed learning of CNNs on heterogeneous CPU/GPU architectures. Applied Artificial Intelligence 32 (9–10):822–44. doi:10.1080/08839514.2018.1508814.
Meng, Z., J. Yuan, and Z. Li. 2017. Trajectory-pooled deep convolutional networks for violence detection in videos. In Computer vision systems, ed. M. Liu, H. Chen, and M. Vincze, 437–47. Springer International Publishing. doi:10.1007/978-3-319-68345-4_39.
Misra, I., C. L. Zitnick, and M. Hebert. 2016. Shuffle and learn: Unsupervised learning using temporal order verification. European Conference on Computer Vision, 527–44. Springer. doi:10.1177/1753193415618391.
Nam, J., M. Alghoniemy, and A. H. Tewfik. 1998. Audio-visual content-based violent scene characterization. Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269), vol. 1, 353–57. doi:10.1109/ICIP.1998.723496.
Neelakantan, A., L. Vilnis, Q. V. Le, I. Sutskever, L. Kaiser, K. Kurach, and J. Martens. 2015. Adding gradient noise improves learning for very deep networks. arXiv preprint arXiv:1511.06807.
Nissan, E. 2012. An overview of data mining for combating crime. Applied Artificial Intelligence 26 (8):760–86. doi:10.1080/08839514.2012.713309.
Niu, X. X., and C. Y. Suen. 2012. A novel hybrid CNN–SVM classifier for recognizing handwritten digits. Pattern Recognition 45 (4):1318–25. doi:10.1016/j.patcog.2011.09.021.
Simonyan, K., and A. Zisserman. 2014. Two-stream convolutional networks for action recognition in videos. In Advances in neural information processing systems, ed. Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, vol. 27, 568–76. Red Hook, New York, USA: Curran Associates, Inc.
Soomro, K., A. R. Zamir, and M. Shah. 2012. UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402.
Sudhakaran, S., and O. Lanz. 2017. Learning to detect violent videos using convolutional long short-term memory. 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 1–6. doi:10.1109/AVSS.2017.8078468.
Tang, Y. 2013. Deep learning using support vector machines. CoRR abs/1306.0239. http://arxiv.org/abs/1306.0239.
Tao, Q. Q., S. Zhan, X. H. Li, and T. Kurihara. 2016. Robust face detection using local CNN and SVM based on kernel combination. Neurocomputing 211:98–105. doi:10.1016/j.neucom.2015.10.139.
Taylor, G. W., R. Fergus, Y. LeCun, and C. Bregler. 2010. Convolutional learning of spatio-temporal features. In Computer vision – ECCV 2010, ed. K. Daniilidis, P. Maragos, and N. Paragios, 140–53. Berlin, Heidelberg: Springer Berlin Heidelberg.
Tran, D., L. Bourdev, R. Fergus, L. Torresani, and M. Paluri. 2015. Learning spatiotemporal features with 3D convolutional networks. 2015 IEEE International Conference on Computer Vision (ICCV), 4489–97. doi:10.1109/ICCV.2015.510.
United Nations Office on Drugs and Crime. 2019a. Intentional homicide victims. Accessed January 21, 2019. https://dataunodc.un.org/crime/intentional-homicide-victims.
United Nations Office on Drugs and Crime. 2019b. Official website. Accessed January 21, 2019. http://www.unodc.org/.
Xu, D., E. Ricci, Y. Yan, J. Song, and N. Sebe. 2015. Learning deep representations of appearance and motion for anomalous event detection. arXiv preprint arXiv:1510.01553.
Xu, L., C. Gong, J. Yang, Q. Wu, and L. Yao. 2014. Violent video detection based on MoSIFT feature and sparse coding. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3538–42. doi:10.1109/ICASSP.2014.6854259.
Xue, D. X., R. Zhang, H. Feng, and Y. L. Wang. 2016. CNN-SVM for microvascular morphological type recognition with data augmentation. Journal of Medical and Biological Engineering 36 (6):755–64. doi:10.1007/s40846-016-0182-4.
Yu, S., Y. Cheng, S. Su, G. Cai, and S. Li. 2017. Stratified pooling based deep convolutional neural networks for human action recognition. Multimedia Tools and Applications 76 (11):13367–82. doi:10.1007/s11042-016-3768-5.
Zajdel, W., J. D. Krijnders, T. Andringa, and D. M. Gavrila. 2007. CASSANDRA: Audio-video sensor fusion for aggression detection. 2007 IEEE Conference on Advanced Video and Signal Based Surveillance, 200–05. doi:10.1109/AVSS.2007.4425310.
Zhou, P., Q. Ding, H. Luo, and X. Hou. 2017. Violent interaction detection in video based on deep learning. Journal of Physics: Conference Series, 6th conference on Advances in Optoelectronics and Micro/nano-optics, Nanjing, Jiangsu, China, vol. 844, 1–9. IOP Publishing.
Zhou, P., Q. Ding, H. Luo, and X. Hou. 2018. Violence detection in surveillance video using low-level features. PloS One 13 (10):1–15. doi:10.1371/journal.pone.0203668.
