Speech Based Emotion Recognition Using Machine Learning

Authors

  • Dr. Loganathan R., Head of Department, Department of CSE, HKBKCE, Bangalore, Karnataka, India
  • Arjumand Yusufi, Student, Department of CSE, HKBKCE, Bangalore, Karnataka, India
  • Umar Rasool Yetoo, Student, Department of CSE, HKBKCE, Bangalore, Karnataka, India
  • Waseem Ahmed R., Student, Department of CSE, HKBKCE, Bangalore, Karnataka, India
  • Zaki Hussain, Student, Department of CSE, HKBKCE, Bangalore, Karnataka, India

DOI:

https://doi.org/10.32628/IJSRST229168

Keywords:

Speech Emotion Recognition, Human Computer Interaction, Mel Frequency Cepstral Coefficients, Support Vector Machine

Abstract

Emotion is a natural feeling, distinct from reasoning or knowledge, that arises from a person’s circumstances or surroundings. As human-machine interaction grows, speech analysis has become integral to narrowing the gap between the physical and digital worlds. An important subfield within this domain is the recognition of emotion in speech signals, a topic traditionally studied in linguistics and psychology. Speech is a medium for expressing one’s perspective or feelings to others, and a Speech Emotion Recognition (SER) system, once implemented, can identify human emotions such as anger, fear, happiness, and sadness. Emotion recognition from an audio signal requires feature extraction and classifier training. The feature vector consists of elements of the audio signal that characterize speaker-specific properties such as tone, pitch, and energy, which are crucial for training the classifier to recognize a particular emotion accurately. SER can thus make conversations between humans and computers more realistic and natural, and automatic SER is a current research topic in Human Computer Interaction (HCI) with a wide range of applications. In this work, speech features such as Mel Frequency Cepstral Coefficients (MFCC) and Mel Energy Spectrum Dynamic Coefficients (MEDC) are extracted from speech utterances. A Support Vector Machine (SVM) is used as the classifier to distinguish emotional states such as anger, happiness, sadness, neutral, and fear in the Berlin emotional database.
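
The feature extraction step named in the abstract can be illustrated with a minimal, standard-library-only sketch of MFCC computation. It is not the authors' implementation: the frame length (256 samples), hop (128), number of mel filters (20), number of coefficients (13), the 16 kHz sample rate, and the averaging of frames into a single utterance vector are all assumptions chosen for illustration, and a naive DFT stands in for an FFT to keep the example self-contained. MEDC features are not covered.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in points]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * n_bins
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
            for c in range(n_coeffs)]

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame of the signal."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) when a filter captures no energy.
        log_energies = [math.log(e + 1e-10) for e in energies]
        features.append(dct2(log_energies, n_coeffs))
    return features

def utterance_vector(signal, sample_rate):
    """Average per-frame MFCCs into one fixed-length vector (an assumed pooling choice)."""
    frames = mfcc(signal, sample_rate)
    return [sum(col) / len(frames) for col in zip(*frames)]

# Usage on a synthetic 440 Hz tone, a stand-in for a real speech recording.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(512)]
print(len(utterance_vector(tone, 16000)))  # 13 coefficients
```

In a pipeline like the one the abstract describes, fixed-length vectors of this kind, computed per utterance and paired with emotion labels, would serve as the training input to the SVM classifier.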

Published

2022-02-28

Section

Research Articles

How to Cite

[1]
Dr. Loganathan R., Arjumand Yusufi, Umar Rasool Yetoo, Waseem Ahmed R., Zaki Hussain, "Speech Based Emotion Recognition Using Machine Learning," International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 9, Issue 1, pp. 324-329, January-February 2022. Available at doi: https://doi.org/10.32628/IJSRST229168