In image-text matching, a key to improving performance is extracting features that carry richer semantic information. Existing works demonstrate that semantic enrichment through knowledge expansion can improve performance, but most of them expand only image features; the shortage of semantic information in the text modality and the single-view nature of such expansion often bottleneck image-text matching models. To address these two problems, we aggregate knowledge from multiple views and propose a word imagination graph (WIG), which expands textual semantic information by imagining content conditioned on the input images. Building on the WIG, we construct a novel multiview text imagination network (MTIN). The MTIN enables latent alignment of images and texts on tags, which assists matching at the semantic level. Results on the Flickr30K and MS-COCO datasets demonstrate the effectiveness of our method. The source code has been released on GitHub: https://github.com/smileslabsh/Multiview-Text-Imagination-Network.