Speech Emotion Recognition Based on Joint Self-Assessment Manikins and Emotion Labels

Jing Ming Chen, Pao Chi Chang, Kai Wen Liang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

In this work, we propose a speech emotion recognition system that jointly uses regression models and classification models. The proposed system achieves an accuracy of 64.70% on the dataset containing both scripted and improvised scenes, and up to 66.34% on the dataset with only improvised scenes. Compared to the state-of-the-art method that does not use mental states, the accuracy of the proposed method is increased by 2.95% and 2.09% for the improvised and mixed scenes, respectively. The results show that the characteristics of mental states can effectively improve the performance of speech emotion recognition.
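The abstract describes a convolutional recurrent network trained jointly on a regression task (self-assessment manikin scores such as valence, arousal, and dominance) and a classification task (categorical emotion labels). The sketch below illustrates that joint-training idea in PyTorch; the layer sizes, the three SAM dimensions, and the loss weight `alpha` are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a joint SAM-regression / emotion-classification CRNN.
# Layer sizes, the number of SAM dimensions, and the loss weighting are
# illustrative assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class JointCRNN(nn.Module):
    def __init__(self, n_mels=40, n_emotions=4, n_sam_dims=3):
        super().__init__()
        # Convolutional front end over (batch, 1, time, n_mels) spectrograms
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 2)),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 2)),
        )
        # Recurrent layer summarizes the (downsampled) time axis
        self.rnn = nn.LSTM(input_size=64 * (n_mels // 4),
                           hidden_size=128, batch_first=True)
        # Two heads share the same utterance-level embedding
        self.sam_head = nn.Linear(128, n_sam_dims)   # regression: SAM scores
        self.emo_head = nn.Linear(128, n_emotions)   # classification: labels

    def forward(self, x):
        # x: (batch, 1, time, n_mels)
        h = self.conv(x)                              # (b, 64, t/4, mels/4)
        b, c, t, f = h.shape
        h = h.permute(0, 2, 1, 3).reshape(b, t, c * f)
        _, (hidden, _) = self.rnn(h)                  # final hidden state
        emb = hidden[-1]                              # (b, 128)
        return self.sam_head(emb), self.emo_head(emb)

def joint_loss(sam_pred, sam_true, emo_logits, emo_true, alpha=0.5):
    # Weighted sum of MSE on SAM dimensions and cross-entropy on labels;
    # alpha balances the two tasks and would need tuning in practice.
    reg = nn.functional.mse_loss(sam_pred, sam_true)
    cls = nn.functional.cross_entropy(emo_logits, emo_true)
    return alpha * reg + (1 - alpha) * cls
```

Training both heads against a shared encoder is what lets the continuous mental-state (SAM) supervision regularize the categorical emotion classifier, which is the mechanism the abstract credits for the accuracy gain.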

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Symposium on Multimedia, ISM 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 327-330
Number of pages: 4
ISBN (Electronic): 9781728156064
DOIs
State: Published - Dec 2019
Event: 21st IEEE International Symposium on Multimedia, ISM 2019 - San Diego, United States
Duration: 9 Dec 2019 - 11 Dec 2019

Publication series

Name: Proceedings - 2019 IEEE International Symposium on Multimedia, ISM 2019

Conference

Conference: 21st IEEE International Symposium on Multimedia, ISM 2019
Country/Territory: United States
City: San Diego
Period: 9/12/19 - 11/12/19

Keywords

  • convolutional recurrent neural network
  • deep learning
  • self-assessment manikin
  • speech emotion recognition
