Egocentric gaze estimation is a challenging and important task with promising applications in areas such as human-computer interaction and AR/VR. In this work, we propose a novel model based on the Video Swin Transformer architecture. By introducing a local inductive bias, our model extracts essential local features from first-person videos during the windowed self-attention computation. In addition, we approximate global context modeling within the gaze region using a shifted-window approach. We evaluate our approach on EGTEA Gaze+, a publicly available dataset of egocentric activity videos. Experimental results demonstrate that our model achieves state-of-the-art performance.
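The windowed and shifted-window attention pattern referred to above follows the general Swin Transformer scheme. As a rough illustration only (not the paper's actual model, which operates on video and includes the attention computation itself), the sketch below shows how a 2-D feature map can be partitioned into non-overlapping windows, and how a cyclic shift of half the window size re-partitions the map so that subsequent layers connect neighboring windows; the window size `w = 4` and the toy 8x8 input are arbitrary choices for this example.

```python
import numpy as np

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w windows.

    Returns an array of shape (num_windows, w, w, C); self-attention would
    then be computed independently within each window (local inductive bias).
    """
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w, w, C)

def shifted_window_partition(x, w):
    """Cyclically shift the map by w // 2 before partitioning (the Swin-style
    shift trick), so that successive layers exchange information across
    window boundaries and approximate global context."""
    shifted = np.roll(x, shift=(-(w // 2), -(w // 2)), axis=(0, 1))
    return window_partition(shifted, w)

# Toy 8x8 single-channel feature map.
x = np.arange(64, dtype=float).reshape(8, 8, 1)
regular = window_partition(x, 4)          # 4 windows of shape (4, 4, 1)
shifted = shifted_window_partition(x, 4)  # same shape, shifted by 2
print(regular.shape, shifted.shape)       # (4, 4, 4, 1) (4, 4, 4, 1)
```

In the full architecture, regular and shifted partitions alternate between consecutive transformer blocks, which is what lets purely local attention propagate information globally over depth.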