Traditional video-description methods usually extract temporal and spatial features independently, without modeling the correlation between them, so the resulting features are neither comprehensive nor well correlated. In addition, fixed network hyperparameters cause the model to over-rely on particular parameter weights during training, which makes it prone to overfitting and produces inaccurate descriptions. In this paper, we design a feature extraction network based on spatiotemporal attention, together with a new pruning strategy, to improve description accuracy. First, a temporal attention mechanism selects the key frames of the video; from each key frame we then extract both salient-region features and background features, and a spatial fusion function tightly couples the two before the final image features are extracted. Second, a variational dropout method lets the model adaptively adjust the dropout rate of each neuron toward an optimal value, which effectively alleviates overfitting and makes the generated descriptions more accurate. Experimental results on MSVD, a dataset widely used in this field, show that the proposed method significantly improves the accuracy of video description.
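To make the two-stage feature pipeline concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the class names (TemporalAttention, SpatialFusion), the top-k key-frame selection, and the gated fusion are all illustrative assumptions about how "temporal attention plus spatial fusion" could be realized.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Scores each frame feature and keeps the top-k as key frames.

    The top-k selection is an assumption; the paper only states that
    key frames are chosen via a temporal attention mechanism.
    """
    def __init__(self, feat_dim, num_key_frames=8):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)
        self.num_key_frames = num_key_frames

    def forward(self, frame_feats):                           # (B, T, D)
        weights = torch.softmax(self.score(frame_feats).squeeze(-1), dim=-1)  # (B, T)
        # Keep the k highest-weighted frames as key frames.
        topk = weights.topk(self.num_key_frames, dim=-1).indices              # (B, k)
        idx = topk.unsqueeze(-1).expand(-1, -1, frame_feats.size(-1))
        return frame_feats.gather(1, idx)                     # (B, k, D)

class SpatialFusion(nn.Module):
    """Fuses salient-region and background features of a key frame."""
    def __init__(self, feat_dim):
        super().__init__()
        self.gate = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, region_feat, background_feat):
        # Gated fusion: a learned gate weighs region cues against
        # background cues, coupling the two feature streams.
        g = torch.sigmoid(self.gate(torch.cat([region_feat, background_feat], dim=-1)))
        return g * region_feat + (1 - g) * background_feat
```

A gated sum is only one plausible choice of "spatial fusion function"; concatenation followed by a projection, or bilinear pooling, would fit the description equally well.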
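The adaptive dropout-rate mechanism described above is consistent with sparse variational dropout, where each weight learns its own dropout rate and high-rate weights can be pruned. Below is a minimal sketch under that assumption, following the parameterization of Molchanov et al. (2017); the class name, the alpha threshold, and the KL constants are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalDropoutLinear(nn.Module):
    """Linear layer with a learned, per-weight dropout rate.

    alpha = sigma^2 / theta^2 is learned through log_sigma2, so each
    connection settles on its own dropout rate; connections whose
    learned rate exceeds a threshold are pruned at inference time.
    """
    def __init__(self, in_features, out_features, alpha_threshold=3.0):
        super().__init__()
        self.theta = nn.Parameter(torch.empty(out_features, in_features))
        self.log_sigma2 = nn.Parameter(torch.full((out_features, in_features), -10.0))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.alpha_threshold = alpha_threshold
        nn.init.kaiming_uniform_(self.theta)

    @property
    def log_alpha(self):
        return self.log_sigma2 - 2.0 * torch.log(self.theta.abs() + 1e-8)

    def forward(self, x):
        if self.training:
            # Local reparameterization: sample the pre-activation directly.
            mean = F.linear(x, self.theta, self.bias)
            var = F.linear(x ** 2, self.log_sigma2.exp())
            return mean + var.clamp(min=1e-8).sqrt() * torch.randn_like(mean)
        # At test time, drop weights whose learned dropout rate is too high.
        mask = (self.log_alpha < self.alpha_threshold).float()
        return F.linear(x, self.theta * mask, self.bias)

    def kl(self):
        # Approximate KL term (Molchanov et al., 2017); added to the loss,
        # it pushes uninformative weights toward high dropout rates.
        la = self.log_alpha
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        return -(k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1).sum()
```

In training, the sum of kl() over all such layers would be added to the captioning loss, so that the dropout rates, and hence the effective network capacity, are tuned jointly with the description objective.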