Interaction-Assisted Multi-Modal Representation Learning for Recommendation
- Resource Type
- Conference
- Authors
- Wu, Hao; Wang, Jiajie; Zu, Zhonglin
- Source
- ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1-5, Jun. 2023
- Subject
- Bioengineering; Communication, Networking and Broadcast Technologies; Computing and Processing; Signal Processing and Analysis; Representation learning; Training; Learning systems; Industries; Visualization; Signal processing; Transformers
- Language
- ISSN
- 2379-190X
Personalized recommender systems have attracted significant attention from both industry and academia. Recent studies have shed light on incorporating multi-modal side information into recommender systems to further boost performance. Meanwhile, transformer-based multi-modal representation learning has yielded substantial gains on downstream visual and textual tasks. However, these self-supervised pre-training methods are not tailored for recommendation and may lead to suboptimal representations. To this end, we propose Interaction-Assisted Multi-Modal Representation Learning for Recommendation (IRL), which injects information from user interactions into item multi-modal representation learning. Specifically, we extract item graph embeddings from user-item interactions and use them to formulate a novel triplet IRL training objective that serves as a behavior-aware pre-training task for the representation learning model. Extensive experiments on several real-world datasets demonstrate the effectiveness of IRL.
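The abstract does not spell out the form of the triplet IRL objective, but the general idea of a behavior-aware triplet loss can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a standard margin-based triplet loss where the anchor is an item's multi-modal representation, the positive is that item's interaction-graph embedding, and the negative is another item's graph embedding. The function name, margin value, and Euclidean distance are all assumptions for illustration.

```python
import numpy as np

def triplet_irl_loss(anchor, positive, negative, margin=1.0):
    """Hypothetical margin-based triplet objective.

    anchor:   multi-modal representation of an item
    positive: graph embedding of the same item (from user-item interactions)
    negative: graph embedding of a different item
    """
    # Distance from the multi-modal representation to its own graph embedding
    d_pos = np.linalg.norm(anchor - positive)
    # Distance to another item's graph embedding
    d_neg = np.linalg.norm(anchor - negative)
    # Pull the anchor toward the positive and push it from the negative,
    # with zero loss once the margin is satisfied
    return max(d_pos - d_neg + margin, 0.0)
```

A well-aligned item (anchor close to its own graph embedding, far from the negative's) incurs zero loss, while a misaligned one is penalized, so minimizing this loss injects interaction information into the multi-modal encoder during pre-training.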