Video question answering (Video Q&A) is a challenging task, as it requires a thorough understanding of both the video and the question. A video is composed of a frame sequence that contains multi-scale temporal relationships and the corresponding contextual information. A model that competently tackles the Video Q&A task needs to be able to: 1) construct long-term and neighborhood dependencies in frame sequences, extracting global and local contextual features that reflect multi-scale temporal dependencies, so as to deduce temporal-aware refined features; and 2) identify static and dynamic features from the pertinent moments of a video while filtering away question-irrelevant dependencies in the feature sequences, so as to yield the most precise and reasonable temporal-aware overall contextual features. In response to these requirements, we propose a novel Video Q&A mechanism consisting of a Bidirectional Complementary Attention (BCA) module and an Adaptive Temporal-aware (ATA) module. The BCA module stacks a multi-head self-attention layer and a convolutional layer in different orders to form two kinds of attention units, which enables bidirectional multi-step reasoning over complete global information and accurate local information to obtain temporal-aware refined features. The ATA module filters away question-irrelevant dependencies in the feature sequence to yield the most precise and reasonable temporal-aware overall contextual features. Comprehensive comparative experiments are conducted on publicly available benchmark datasets, and an extended ablation study further demonstrates the contribution of each module to the model's Q&A capabilities.
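To make the two ideas concrete, the following is a minimal NumPy sketch, not the paper's implementation: single-head attention stands in for the multi-head self-attention layer, a plain 1-D temporal convolution stands in for the convolutional layer, and the function names (`attn_then_conv`, `conv_then_attn`, `bca`, `ata`), the additive fusion of the two units, and the question-guided softmax gate are all illustrative assumptions. It shows how stacking attention and convolution in the two orders yields complementary global-first and local-first units, and how a question vector can softly filter the resulting feature sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (T, d) frame features. Scaled dot-product self-attention:
    # every frame attends to every other frame, capturing long-term
    # (global) dependencies across the whole sequence.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)          # (T, T) pairwise frame affinities
    return softmax(scores, axis=-1) @ x    # (T, d) globally contextualized

def temporal_conv(x, k=3):
    # 1-D convolution over the time axis with "same" padding, capturing
    # neighborhood (local) dependencies between adjacent frames.
    # Weights are random here; a trained model would learn them.
    T, d = x.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    w = np.random.default_rng(0).normal(scale=1.0 / np.sqrt(k * d),
                                        size=(k, d, d))
    out = np.zeros_like(x)
    for t in range(T):
        for i in range(k):
            out[t] += xp[t + i] @ w[i]
    return out

def attn_then_conv(x):
    # Global-first unit: build global context, then refine it locally.
    return temporal_conv(self_attention(x))

def conv_then_attn(x):
    # Local-first unit: sharpen local context, then relate it globally.
    return self_attention(temporal_conv(x))

def bca(x):
    # Bidirectional complementary combination of the two units
    # (additive fusion is an assumption made for this sketch).
    return attn_then_conv(x) + conv_then_attn(x)

def ata(h, q):
    # Adaptive temporal-aware filtering: score each time step against the
    # question vector q (d,) and softly down-weight question-irrelevant
    # steps, pooling to one overall contextual feature.
    gate = softmax(h @ q / np.sqrt(h.shape[-1]), axis=0)  # (T,)
    return (gate[:, None] * h).sum(axis=0)                # (d,)
```

Under this reading, `bca` produces the temporal-aware refined feature sequence and `ata` reduces it to a single question-conditioned representation; both operations preserve the feature dimension, so they compose with standard answer-decoding heads.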