Speaker-Invariant Affective Representation Learning via Adversarial Training
- Resource Type
- Conference
- Authors
- Li, Haoqi; Tu, Ming; Huang, Jing; Narayanan, Shrikanth; Georgiou, Panayiotis
- Source
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7144-7148, May 2020
- Subject
- Signal Processing and Analysis
- Training
- Emotion recognition
- Speech recognition
- Machine learning
- Speaker recognition
- Task analysis
- Speech processing
- Speech emotion recognition
- adversarial training
- speaker invariant
- affective representation
- Language
- English
- ISSN
- 2379-190X
Representation learning for speech emotion recognition is challenging due to the sparsity of labeled data and the lack of gold-standard references. In addition, there is considerable variability in the input speech signals, in human subjective perception of those signals, and in the emotion labels themselves. In this paper, we propose a machine learning framework that obtains speech emotion representations by limiting the effect of speaker variability in the speech signals. Specifically, we propose to disentangle speaker characteristics from emotion through an adversarial training network in order to better represent emotion. Our method combines the gradient reversal technique with an entropy loss function to remove such speaker information. We evaluate our approach on both the IEMOCAP and CMU-MOSEI datasets, and show that it improves speech emotion classification and generalizes better to unseen speakers.
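The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the 5-speaker toy setup, function names, and lambda scaling are assumptions for illustration only. It shows (a) an entropy loss whose minimum is a uniform speaker posterior, i.e. a representation carrying no speaker information, and (b) a gradient reversal step, which is the identity in the forward pass but negates the speaker-classifier gradient in the backward pass so the encoder learns to confuse it.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def neg_entropy(p, eps=1e-12):
    """Entropy loss on speaker posteriors: minimizing the negative
    entropy pushes predictions toward uniform, erasing speaker identity."""
    return float(np.mean(np.sum(p * np.log(p + eps), axis=-1)))

def grad_reversal(upstream_grad, lam=1.0):
    """Gradient reversal layer (backward pass only): the gradient
    flowing from the speaker classifier into the shared encoder is
    negated and scaled by lam, training the encoder adversarially."""
    return -lam * upstream_grad

# Toy check: a uniform posterior over 5 hypothetical speakers attains a
# lower entropy loss (higher entropy) than a peaked, speaker-revealing one.
uniform = np.full((1, 5), 0.2)
peaked = softmax(np.array([[5.0, 0.0, 0.0, 0.0, 0.0]]))
assert neg_entropy(uniform) < neg_entropy(peaked)

# Gradient reversal simply flips the sign of the speaker-loss gradient.
g = np.array([0.3, -0.7])
print(grad_reversal(g))  # -> [-0.3  0.7]
```

In training, the reversed gradient updates the shared encoder while the entropy term regularizes the speaker head, which is the combination the abstract describes.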