The Transformer-based architecture achieves state-of-the-art results in image captioning. Because it is non-recurrent, additional positional information must be provided. However, existing advanced methods attach positional information to the model through an additional encoding or embedding that is decoupled from the original input features. Moreover, whether absolute or relative, these encodings are fused with the input features by addition, which causes interference between the two types of features and degrades model performance. In this paper, we propose a novel architecture, called the positional feature generator (PFG), to remedy these limitations. This module models the spatial layout of image regions as a graph structure, learning absolute position explicitly and relative position implicitly. Meanwhile, we concatenate the captured positional features with the original features, treating positional information as a separate feature and thereby avoiding feature interference. Extensive experiments on MS COCO validate the effectiveness of PFG. Moreover, PFG outperforms some state-of-the-art positional representation methods, and the positional feature generator-based Transformer (PFGT) is competitive with some state-of-the-art image captioning algorithms.
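The contrast between additive fusion and the concatenation advocated above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the array names and dimensions are hypothetical, and the "positional features" are random placeholders standing in for PFG's learned output.

```python
import numpy as np

# Hypothetical sketch: additive vs. concatenative fusion of positional
# information with visual region features. Shapes and names are illustrative.

rng = np.random.default_rng(0)
num_regions, d_model, d_pos = 5, 8, 4

features = rng.normal(size=(num_regions, d_model))  # visual region features
pos_enc = rng.normal(size=(num_regions, d_model))   # positional encoding, same width

# Additive fusion (standard Transformer): positions share channels with
# content, so the two signals are mixed in every dimension.
fused_add = features + pos_enc            # shape: (num_regions, d_model)

# Concatenative fusion (the approach argued for here): positional features
# occupy their own channels, leaving the content features untouched.
pos_feat = rng.normal(size=(num_regions, d_pos))  # placeholder for PFG output
fused_cat = np.concatenate([features, pos_feat], axis=-1)
# shape: (num_regions, d_model + d_pos)
```

The widened input of the concatenative variant would typically be projected back to the model dimension by the first linear layer of the Transformer, so the positional channels remain separable at the input while adding little cost.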