Medical Visual Question Answering (Med-VQA) aims to answer clinical questions about medical radiological images. However, existing studies have mainly focused on feeding visual and textual features into attention-based architectures (such as the Transformer), neglecting the higher-level relational features present in radiological images. We therefore propose a Med-VQA model based on attention and visual relational reasoning. First, we introduce a Bidirectional-guided Attention Module, which enables the model not only to use the question to guide attention to important regions in the image, but also to use the image to guide attention to key words in the question. Second, we design a Multi-level Visual Relational Module that models both global and local visual features, employing a graph convolutional network to extract latent visual relational features and thereby enrich the semantic information contained in the visual representation. Experimental results on the VQA-RAD and SLAKE datasets show that our model outperforms other state-of-the-art Med-VQA models, achieving overall accuracies of 77.2% and 80.3%, respectively.
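To make the two components named above concrete, the following is a minimal PyTorch sketch, not the authors' released code: the bidirectional guided attention is rendered here as two cross-attention passes (question attending to image regions and vice versa), and the visual relational reasoning as a single graph-convolution step over region features using a similarity-based soft adjacency. All dimensions, module names, and the choice of adjacency are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BidirectionalGuidedAttention(nn.Module):
    """Question-guided image attention and image-guided question attention (sketch)."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.q2v = nn.MultiheadAttention(dim, heads, batch_first=True)  # question queries image
        self.v2q = nn.MultiheadAttention(dim, heads, batch_first=True)  # image queries question

    def forward(self, v, q):
        # v: (B, R, dim) region features; q: (B, T, dim) question token features
        img_ctx, _ = self.q2v(query=q, key=v, value=v)  # image content gathered per question token
        txt_ctx, _ = self.v2q(query=v, key=q, value=q)  # question content gathered per image region
        return img_ctx, txt_ctx


class VisualRelationGCN(nn.Module):
    """One graph-convolution step over region features: H' = ReLU(A_hat H W)."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.w = nn.Linear(dim, dim, bias=False)

    def forward(self, v):
        # Soft adjacency from pairwise feature similarity (an assumption; the paper
        # may construct the relation graph differently).
        adj = torch.softmax(v @ v.transpose(1, 2) / v.size(-1) ** 0.5, dim=-1)
        return F.relu(adj @ self.w(v))


if __name__ == "__main__":
    B, R, T, D = 2, 36, 12, 512  # batch, regions, tokens, feature dim (illustrative)
    v, q = torch.randn(B, R, D), torch.randn(B, T, D)
    img_ctx, txt_ctx = BidirectionalGuidedAttention(D)(v, q)
    v_rel = VisualRelationGCN(D)(v)  # relation-enriched region features
    print(img_ctx.shape, txt_ctx.shape, v_rel.shape)
```

In a full model, the attended features and the relation-enriched region features would be fused and passed to an answer classifier; that fusion step is omitted here.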