Journal of Information and Optimization Sciences Volume 41, 2020 - Issue 6: Applied Machine Learning for IoT and Smart Data Analysis (Part-I)

Emotion recognition of audio/speech data using deep learning approaches

Vedika Gupta, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Paschim Vihar, New Delhi 110063, India. Correspondence: [email protected]

Stuti Juyal, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Paschim Vihar, New Delhi 110063, India. E-mail: [email protected]

Gurvinder Pal Singh, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Paschim Vihar, New Delhi 110063, India. E-mail: [email protected]

Chirag Killa, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Paschim Vihar, New Delhi 110063, India. E-mail: [email protected]

Nishant Gupta, Department of Computer Science and Engineering, Bharati Vidyapeeth’s College of Engineering, Paschim Vihar, New Delhi 110063, India. E-mail: [email protected]

Pages 1309-1317 | Published online: 07 Sep 2020

https://doi.org/10.1080/02522667.2020.1809089

References

Vogt, T., André, E., & Wagner, J. (2008). Automatic recognition of emotions from speech: a review of the literature and recommendations for practical realization. In Affect and emotion in human-computer interaction (pp. 75-91). Springer, Berlin, Heidelberg.
Wang, K., An, N., Li, B. N., Zhang, Y., & Li, L. (2015). Speech emotion recognition using Fourier parameters. IEEE Transactions on Affective Computing, 6(1), 69-75.
Xiao, Z., Dellandrea, E., Dou, W., & Chen, L. (2005). Features extraction and selection for emotional speech classification. In IEEE Conference on Advanced Video and Signal Based Surveillance, 2005 (AVSS 2005) (pp. 411-416). IEEE.
Olatunji, S. O. (2019). Improved email spam detection model based on support vector machines. Neural Computing and Applications, 31(3), 691-699.
Koolagudi, S. G., & Rao, K. S. (2010, December). Real life emotion classification using voice and pitch based spectral features. In 2010 Annual IEEE India Conference (INDICON) (pp. 1-4). IEEE.
Lim, W., Jang, D., & Lee, T. (2016, December). Speech emotion recognition using convolutional and recurrent neural networks. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) (pp. 1-4). IEEE.
Zheng, W. Q., Yu, J. S., & Zou, Y. X. (2015, September). An experimental study of speech emotion recognition based on deep convolutional neural networks. In 2015 international conference on affective computing and intelligent interaction (ACII) (pp. 827-831). IEEE.
Zhang, B., Essl, G., & Provost, E. M. (2015, September). Recognizing emotion from singing and speaking using shared models. In 2015 International Conference on Affective Computing and Intelligent Interaction (ACII) (pp. 139-145). IEEE.
Li, L., Zhao, Y., Jiang, D., Zhang, Y., Wang, F., Gonzalez, I., … & Sahli, H. (2013, September). Hybrid Deep Neural Network–Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition. In 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (pp. 312-317). IEEE.
Pao, T. L., Chen, Y. T., Yeh, J. H., & Li, P. J. (2006, August). Mandarin emotional speech recognition based on SVM and NN. In 18th International Conference on Pattern Recognition (ICPR’06) (Vol. 1, pp. 1096-1100). IEEE.
Rao, K. S., & Yegnanarayana, B. (2006). Prosody modification using instants of significant excitation. IEEE Transactions on Audio, Speech, and Language Processing, 14(3), 972-980.
Deb, S., & Dandapat, S. (2018). Multiscale amplitude feature and significance of enhanced vocal tract information for emotion classification. IEEE Transactions on Cybernetics, 49(3), 802-815. doi: 10.1109/TCYB.2017.2787717
Khan, O., Al-Khatib, W. G., & Lahouari, C. (2007, December). Detection of questions in Arabic audio monologues using prosodic features. In Ninth IEEE International Symposium on Multimedia (ISM 2007) (pp. 29-36). IEEE.
Boersma, P. (1993). Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. In Proceedings of the Institute of Phonetic Sciences (Vol. 17, pp. 97-110). University of Amsterdam.
Dellaert, F., Polzin, T., & Waibel, A. (1996, October). Recognizing emotion in speech. In Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP’96 (Vol. 3, pp. 1970-1973). IEEE.
Gobl, C., & Chasaide, A. N. (2003). The role of voice quality in communicating emotion, mood and attitude. Speech communication, 40(1-2), 189-212. doi: 10.1016/S0167-6393(02)00082-1
Aouani, H., & Ayed, Y. B. (2018, March). Emotion recognition in speech using MFCC with SVM, DSVM and auto-encoder. In 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) (pp. 1-5). IEEE.
Zhao, J., Mao, X., & Chen, L. (2019). Speech emotion recognition using deep 1D & 2D CNN LSTM networks. Biomedical Signal Processing and Control, 47, 312-323. doi: 10.1016/j.bspc.2018.08.035
Tzirakis, P., Zhang, J., & Schuller, B. W. (2018, April). End-to-end speech emotion recognition using deep neural networks. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5089-5093). IEEE.
Al-Tameem, A. B. A., & Saudagar, A. K. J. (2020). Machine learning approach for identification of threat content in audio messages shared on social media. Journal of Discrete Mathematical Sciences and Cryptography, 23(1), 83-93. doi: 10.1080/09720529.2020.1721876
Gupta, V., Singh, V. K., Mukhija, P., & Ghose, U. (2019). Aspect-based sentiment analysis of mobile reviews. Journal of Intelligent & Fuzzy Systems, 36(5), 4721-4730. doi: 10.3233/JIFS-179021
