Entity Linking Supported Multimodal Data: Fusing Text and Image Features for Higher Accuracy
- Resource Type
- Conference
- Authors
- Chen, Yuke; Wu, Peng; Zhao, Xing; Dai, Yang
- Source
- 2023 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML), pp. 755-763, Nov. 2023
- Subject
- Communication, Networking and Broadcast Technologies
- Computing and Processing
- Signal Processing and Analysis
- Measurement
- Correlation
- Logic gates
- Feature extraction
- Data models
- Robustness
- Online services
- Multimodal Entity Linking
- Hierarchical feature extraction
- Multimodal fusion
- Co-Attention
- Language
To address the low accuracy of traditional text-only entity linking methods, this paper proposes a new multimodal entity-linking model that leverages the richness and complementarity of multimodal information, effectively integrating text and image features to improve entity-linking accuracy. The proposed method uses the BERT model and a CNN-RNN model to extract hierarchical features from the text and images containing the mentions, respectively; a co-attention mechanism and gated fusion are then introduced to automatically learn the correlations between text and images, adjusting the weight and importance of the features to achieve accurate alignment and interaction between the two modalities. Finally, cosine similarity is used to measure the similarity between candidate entities and mentions. Experiments on the RMEL and WMEL multimodal entity linking datasets show that the proposed method outperforms other entity-linking models.
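The pipeline the abstract describes (cross-modal co-attention, a learned gate to fuse the two modalities, and cosine similarity for candidate ranking) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions, the simple dot-product affinity, and the sigmoid gate `W_g` are illustrative assumptions standing in for the BERT/CNN-RNN encoders and trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(text_feats, img_feats):
    """Simplified co-attention: text_feats is (T, d) token features,
    img_feats is (R, d) region features (dimensions assumed equal here)."""
    affinity = text_feats @ img_feats.T                    # (T, R) cross-modal affinity
    img_aware_text = softmax(affinity, axis=1) @ img_feats # (T, d) text attended by images
    txt_aware_img = softmax(affinity.T, axis=1) @ text_feats  # (R, d) images attended by text
    # pool to one vector per modality
    return img_aware_text.mean(axis=0), txt_aware_img.mean(axis=0)

def gated_fusion(text_vec, img_vec, W_g):
    # sigmoid gate (W_g is a hypothetical learned (d, 2d) matrix)
    # decides, per dimension, how much of each modality to keep
    g = 1.0 / (1.0 + np.exp(-(W_g @ np.concatenate([text_vec, img_vec]))))
    return g * text_vec + (1.0 - g) * img_vec

def cosine_similarity(u, v):
    # similarity score used to rank candidate entities against a mention
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))
```

In use, the fused mention representation would be scored with `cosine_similarity` against each candidate-entity embedding, and the highest-scoring candidate chosen as the link.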