Emotion recognition of audio/speech data using deep learning approaches

