A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation
- Resource Type
- Conference
- Authors
- Khan, Rashid; Huang, Bingding; Hassan, Haseeb; Zaman, Asim; Ye, Zhongfu
- Source
- 2023 5th International Conference on Robotics and Computer Vision (ICRCV), pp. 92-99, Sep. 2023
- Subject
- Computing and Processing; Robotics and Control Systems; Computer vision; Computational modeling; Feature extraction; Natural language processing; Decoding; Convolutional neural networks; Task analysis; Image captioning; Attention mechanism; Inception V3; Convolutional Neural Network; GRU
- Language
- Abstract
- Image captioning is the challenging task of generating a textual description of an image, requiring both computer vision and natural language processing techniques. This paper proposes a deep neural framework for image caption generation using a GRU-based attention mechanism. Our approach employs multiple pre-trained convolutional neural networks as the encoder to extract image features and a GRU-based language model as the decoder to generate descriptive sentences. To improve performance, we integrate the Bahdanau attention model with the GRU decoder, enabling the model to learn to focus on specific parts of the image. We evaluate our approach on the MSCOCO and Flickr30k datasets and show that it achieves scores competitive with state-of-the-art methods. The proposed framework helps bridge the gap between computer vision and natural language and can be extended to specific domains.
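The core mechanism the abstract describes, additive (Bahdanau) attention over CNN feature regions conditioned on the GRU decoder's hidden state, can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the dimensions (a 64-region feature grid, e.g. an 8×8 Inception V3 map, a 512-unit hidden state) and the weight names `W1`, `W2`, `v` are illustrative assumptions.

```python
import numpy as np

def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive (Bahdanau) attention: score each image region against the
    decoder hidden state, softmax the scores into weights, and return the
    weighted sum of region features as the context vector."""
    # features: (num_regions, feat_dim); hidden: (hid_dim,)
    scores = np.tanh(features @ W1 + hidden @ W2) @ v   # (num_regions,)
    weights = np.exp(scores - scores.max())              # stable softmax
    weights /= weights.sum()
    context = weights @ features                         # (feat_dim,)
    return context, weights

# Illustrative dimensions (assumed, not from the paper)
rng = np.random.default_rng(0)
num_regions, feat_dim, hid_dim, attn_dim = 64, 2048, 512, 256
features = rng.standard_normal((num_regions, feat_dim))  # CNN encoder output
hidden = rng.standard_normal(hid_dim)                    # previous GRU state
W1 = rng.standard_normal((feat_dim, attn_dim)) * 0.01
W2 = rng.standard_normal((hid_dim, attn_dim)) * 0.01
v = rng.standard_normal(attn_dim)

context, weights = bahdanau_attention(features, hidden, W1, W2, v)
```

At each decoding step, `context` would be concatenated with the previous word embedding and fed into the GRU, so the decoder attends to different image regions as it emits each word.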