Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
- Resource Type
- Conference
- Authors
- Luo, Haozheng; Qin, Ruiyang; Xu, Chenwei; Ye, Guo; Luo, Zening
- Source
- 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), pp. 363-369, Aug. 2023
- Subject
- Communication, Networking and Broadcast Technologies; Computing and Processing; Robotics and Control Systems; Signal Processing and Analysis; Human-robot interaction; Benchmark testing; Question answering (information retrieval); Cognition; Robots
- Language
- ISSN
- 1944-9437
In this paper, we introduce a robotic agent designed to analyze its external environment and answer participants’ questions. The agent's primary purpose is to assist individuals through language-based interaction in video-based scenes. Our proposed method integrates video recognition technology and natural language processing models within the robotic agent. We investigate the key factors affecting human-robot interaction by examining issues that arise between participants and the robotic agent. Our experimental findings reveal a positive relationship between trust and interaction efficiency. Furthermore, our model achieves a 2% to 3% performance improvement over other benchmark methods.