Research Paper on Speech Emotion Recognition

Sunil Kumar Giri

Abstract


Communication is the key to clearly expressing one's thoughts and ideas, and among all its forms, speech is the most natural and powerful for human beings. The goal of speech emotion recognition (SER) is to provide efficient, natural interaction between humans and computers by enabling computers to understand the emotional states expressed by human subjects, so that personalized responses can be provided accordingly. Most studies in the literature focus on recognizing emotions from short, isolated sentences, which limits their practical application. The main objective of this work is to improve the human-machine interface; SER can also be used to monitor a person's psychophysiological state, for example in lie detectors. A SER system is developed and evaluated using different classifiers and different feature extraction methods. After devising a way to remove random silence from audio clips, we present a multi-faceted comparison of practical neural network approaches to speech emotion recognition. The aim of this study is to provide a survey of discrete speech emotion recognition.
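As a concrete illustration of the silence-removal and feature-extraction step mentioned above, the following minimal Python sketch trims silent spans from a clip and computes MFCC features for a downstream classifier. It assumes the librosa and numpy libraries are available; the decibel threshold, sampling rate, and feature sizes are illustrative choices, not the values used in this paper.

import numpy as np
import librosa

def remove_silence(y, top_db=30):
    # Keep only the segments louder than top_db below the clip's peak.
    intervals = librosa.effects.split(y, top_db=top_db)
    if len(intervals) == 0:
        return y  # nothing above the threshold; keep the clip as-is
    return np.concatenate([y[start:end] for start, end in intervals])

def extract_features(path, sr=16000, n_mfcc=40):
    # Load audio as mono, drop silence, and summarize it with MFCCs.
    y, sr = librosa.load(path, sr=sr, mono=True)
    voiced = remove_silence(y)
    mfcc = librosa.feature.mfcc(y=voiced, sr=sr, n_mfcc=n_mfcc)
    # Averaging over time yields one n_mfcc-dimensional vector per clip,
    # suitable as input to a neural network or other classifier.
    return mfcc.mean(axis=1)

# Example usage (hypothetical file name):
# vec = extract_features("sample_happy.wav")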



DOI: https://doi.org/10.37628/ijocspl.v7i1.675
