We study the task of short video understanding and recommendation, which predicts a user's preference from multimodal content, including visual, text, and audio features as well as the user's interaction history. In this paper, we present a multimodal representation learning method to improve the performance of recommender systems. The method first converts multimodal content into vectors in an embedding space, and then concatenates these vectors as the input of a multi-layer perceptron to make predictions. We also propose a novel Key-Value Memory that maps dense real values into vectors, capturing richer semantics in a nonlinear manner. Experimental results show that our representation significantly improves several baselines and achieves superior performance on the dataset of the ICME 2019 Short Video Understanding and Recommendation Challenge.
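One way to read the Key-Value Memory idea is as attention over a small learned memory: a scalar dense feature is scored against a set of keys, and the softmax-weighted sum of the corresponding value vectors becomes its embedding, so the mapping from real value to vector is nonlinear. The sketch below is only an illustration of that reading, not the paper's exact formulation; the negative-squared-distance scoring, the key grid, and the function name `key_value_embed` are all assumptions.

```python
import numpy as np

def key_value_embed(x, keys, values):
    """Map a scalar dense feature to a vector via a key-value memory.

    x      : scalar real-valued feature
    keys   : (K,) key scalars (assumed: a learned or fixed grid)
    values : (K, d) value vectors, one per key
    Returns a (d,) embedding: softmax attention over keys, then a
    weighted sum of value vectors (scoring rule is an assumption).
    """
    scores = -(x - keys) ** 2            # similarity: closer key -> higher score
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w /= w.sum()
    return w @ values                    # (d,) convex combination of values

keys = np.linspace(0.0, 1.0, 8)          # hypothetical key grid for a feature in [0, 1]
values = np.random.RandomState(0).randn(8, 4)
emb = key_value_embed(0.3, keys, values) # (4,) embedding of the dense value 0.3
```

Because the attention weights sum to one, each embedding lies in the convex hull of the value vectors, while the softmax makes the value-to-vector map smooth but nonlinear; embeddings produced this way can then be concatenated with the other modality vectors before the MLP.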