In this paper, we explore the spatio-temporal redundancy in video object segmentation (VOS) in the semi-supervised setting, with the aim of improving computational efficiency. Recently, memory-based methods have attracted considerable attention for their excellent performance. These methods first construct an external memory that stores target-object information from historical frames and then retrieve the information useful for modeling the target object via memory reading. However, the large amount of redundant information in memory makes such methods inefficient, preventing them from achieving both high accuracy and high speed. Moreover, they sample historical frames at fixed intervals and add them to memory; this may discard important information from dynamic frames in which the object changes incrementally, or aggravate temporal redundancy with static frames in which the object does not change. To address these problems, we propose an efficient semi-supervised VOS approach via spatio-temporal compression (termed STCVOS). Specifically, we first adopt a temporally varying sensor to adaptively filter out static frames without target-object evolution and trigger memory updates only for frames containing noticeable variations. Furthermore, we propose a spatially compressed memory that absorbs features of changed pixels and removes outdated features, which considerably reduces information redundancy. More importantly, we introduce an efficient memory reader that performs memory reading with a smaller footprint and lower computational overhead. Experimental results indicate that STCVOS performs favorably against state-of-the-art methods on the DAVIS 2017 and YouTube-VOS datasets, with overall J&F scores of 82.0% and 79.7%, respectively. Meanwhile, STCVOS achieves a high inference speed of approximately 30 FPS.
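The frame-gating idea behind the temporally varying sensor can be sketched as a simple change detector that triggers a memory update only when consecutive frames differ enough. Everything below is an illustrative assumption, not the paper's actual sensor: the mean-absolute-difference test and the `threshold` hyperparameter are hypothetical stand-ins for whatever variation measure STCVOS uses.

```python
import numpy as np

def should_update_memory(prev_frame: np.ndarray,
                         curr_frame: np.ndarray,
                         threshold: float = 0.05) -> bool:
    """Decide whether a frame is dynamic enough to be added to memory.

    Hypothetical sketch: approximates a temporal variation sensor with
    the normalized mean absolute pixel difference between consecutive
    frames; `threshold` is an assumed hyperparameter.
    """
    diff = np.abs(curr_frame.astype(np.float32) - prev_frame.astype(np.float32))
    variation = float(diff.mean()) / 255.0  # normalized to [0, 1]
    # Static frame (variation below threshold): skip the update to
    # avoid adding temporally redundant features to memory.
    return variation > threshold
```

Under this sketch, an unchanged frame is filtered out (no memory update), while a frame with noticeable variation triggers one, which is the adaptive-sampling behavior the abstract contrasts with fixed-interval sampling.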