Automated emotion recognition (AER) has a growing number of applications, ranging from behavior analysis in assistive robotics and e-learning to depression and pain estimation in healthcare. Systems for multimodal AER typically outperform unimodal approaches due to the complementary and redundant semantic information across modalities such as visual, audio, language, and physiological signals. In practice, however, only a subset of these modalities is available at inference time, and using multiple modalities increases system complexity. This paper focuses on video-based AER and aims to enhance the accuracy of unimodal systems by leveraging the Learning Under Privileged Information (LUPI) paradigm with information from multiple modalities. Without loss of generality, this study considers the audio modality as privileged information (only available during training) and introduces a new multimodal-to-unimodal privileged knowledge distillation (PKD) framework. The teacher network is a multimodal AER architecture that processes audio-visual information and distills the learned knowledge to a unimodal visual student network. We validate our proposed multimodal PKD method on the challenging RECOLA and Affwild2 datasets for video-based AER, using weak and strong baseline AER architectures as well as joint cross-attention fusion methods. The proposed method improves the average concordance correlation coefficient (CCC) by 8% (absolute) on the RECOLA dataset and by 2% on the arousal dimension of the Affwild2 dataset. The code is available at multimodal-pkd.
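The training objective implied by the abstract can be illustrated with a minimal sketch, assuming PyTorch: a frozen audio-visual teacher guides a visual-only student, which is trained with a supervised CCC loss plus a feature-matching distillation term. The module names (`AudioVisualTeacher`, `VisualStudent`, `pkd_step`), the `LazyLinear` backbones, the additive fusion (a stand-in for the paper's joint cross-attention), the MSE distillation loss, and the weight `alpha` are all illustrative assumptions, not the paper's exact architecture or loss.

```python
import torch
import torch.nn as nn

def ccc_loss(pred, target, eps=1e-8):
    """1 - concordance correlation coefficient, a common AER regression loss."""
    pred_mean, target_mean = pred.mean(), target.mean()
    pred_var, target_var = pred.var(unbiased=False), target.var(unbiased=False)
    cov = ((pred - pred_mean) * (target - target_mean)).mean()
    ccc = 2 * cov / (pred_var + target_var + (pred_mean - target_mean) ** 2 + eps)
    return 1 - ccc

class AudioVisualTeacher(nn.Module):
    """Multimodal teacher: sees both modalities during training."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.visual = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.audio = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, 1)  # one affect dimension (e.g., arousal)
    def forward(self, frames, audio):
        feats = self.visual(frames) + self.audio(audio)  # stand-in for cross-attention fusion
        return self.head(feats).squeeze(-1), feats

class VisualStudent(nn.Module):
    """Unimodal student: only the visual modality is available at inference."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, 1)
    def forward(self, frames):
        feats = self.backbone(frames)
        return self.head(feats).squeeze(-1), feats

def pkd_step(student, teacher, frames, audio, labels, alpha=0.5):
    """One PKD training step: audio acts as privileged (training-only) input."""
    with torch.no_grad():                     # teacher is frozen during distillation
        _, t_feats = teacher(frames, audio)
    s_pred, s_feats = student(frames)         # student never sees audio
    loss_task = ccc_loss(s_pred, labels)      # supervised regression loss
    loss_distill = nn.functional.mse_loss(s_feats, t_feats)  # match teacher features
    return loss_task + alpha * loss_distill
```

At test time only `VisualStudent` is used, so the deployed system stays unimodal while still benefiting from the privileged audio stream seen by the teacher during training.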