The expressions of the human face are produced by the contraction of facial muscles. The most widely used and accepted standard for describing all visible changes on the face is the Facial Action Coding System (FACS). In this paper, Vision Transformer (ViT) and Perceiver attention mechanisms are individually employed to detect Action Units (AUs) from the whole face, with different patch sizes, on two spontaneous datasets (DISFA, BP4D) and one in-the-wild dataset (EmotioNet); the same attention mechanisms are then applied to patches cropped around facial landmarks to examine the resulting improvement in AU detection. The experiments show that on the whole-face inputs, ViT and Perceiver match, and most of the time outperform, state-of-the-art AU detection methods. However, the most significant performance increase is observed when only the landmark patches are used as the input sequence to both networks.
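The landmark-patch input described above can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the patch size, border padding strategy, and landmark coordinates are all illustrative assumptions.

```python
import numpy as np

def landmark_patches(image, landmarks, patch_size=16):
    """Crop a square patch of side `patch_size` centered at each landmark
    and flatten it into a token, producing a (num_landmarks, patch_size**2 * C)
    sequence suitable for a transformer-style encoder."""
    half = patch_size // 2
    # Pad the image so patches near the border stay full-sized (assumed strategy).
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="edge")
    tokens = []
    for (x, y) in landmarks:
        x, y = int(round(x)), int(round(y))
        # In padded coordinates, rows y:y+patch_size correspond to a window
        # roughly centered on the original landmark pixel.
        patch = padded[y:y + patch_size, x:x + patch_size, :]
        tokens.append(patch.reshape(-1))
    return np.stack(tokens)

# Toy example: a 112x112 RGB face crop with 5 hypothetical landmark positions.
img = np.random.rand(112, 112, 3)
pts = [(30, 40), (80, 40), (56, 60), (40, 85), (72, 85)]
seq = landmark_patches(img, pts, patch_size=16)
print(seq.shape)  # (5, 768): one flattened 16x16x3 token per landmark
```

Feeding only these landmark-centered tokens, rather than a regular grid over the whole face, concentrates the attention mechanism's capacity on the regions where AU-related muscle movements occur.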