Recent advances in representation learning and video prediction have shown that accurately anticipating future states can improve manipulation and control strategies across a range of applications. However, the complex dynamics of real-world data make such representations difficult to learn. Autoregressive models, which feed each generated frame back as input for the next prediction, suffer from compounding errors, memory overload, and long training times, since the state must be reconstructed from the latent vector at every step. To address these limitations, recent studies have introduced State Space Models (SSMs) that forecast directly in the latent space, enabling prediction of distant future states. However, these methods remain limited in their ability to extract object-centric representations. More recent object-centric approaches focus on features closely tied to the input data, yet their capacity to capture higher-level representations remains constrained. In this paper, we propose integrating a perceptual network into the slot attention mechanism to extract and separate high-level representations. Leveraging a pre-trained perceptual network, we derive higher-level object-centric representations at each perceptual layer and align them with the corresponding slots. These representations, rich in object-centric information, can improve understanding of the current state and provide valuable guidance for accurate prediction of future states.
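The slot-attention step described above can be sketched as follows. This is a minimal illustration, not the paper's exact architecture: the "perceptual layers" are stand-in random feature maps of assumed widths, projected to a shared slot dimension and pooled into one token set before slot attention assigns them to slots.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotAttention(nn.Module):
    """Minimal slot attention: iterative competitive attention over input tokens."""
    def __init__(self, num_slots: int, dim: int, iters: int = 3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        # Learned Gaussian for slot initialization.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        b, _, d = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(b, self.num_slots, d)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the slot axis: slots compete to explain each token.
            attn = F.softmax(torch.einsum('bsd,bnd->bsn', q, k) * self.scale, dim=1)
            attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean over tokens
            updates = torch.einsum('bsn,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, d),
                             slots.reshape(-1, d)).reshape(b, self.num_slots, d)
        return slots

# Stand-in for multi-layer perceptual features (hypothetical shapes): two
# layers of different channel widths, each projected to the slot dimension
# and concatenated along the token axis.
slot_dim, num_slots = 32, 4
proj_a, proj_b = nn.Linear(64, slot_dim), nn.Linear(128, slot_dim)
feats_a = torch.randn(2, 16, 64)   # shallow perceptual layer: 16 tokens, 64-d
feats_b = torch.randn(2, 8, 128)   # deeper perceptual layer: 8 tokens, 128-d
tokens = torch.cat([proj_a(feats_a), proj_b(feats_b)], dim=1)  # (2, 24, 32)
slots = SlotAttention(num_slots, slot_dim)(tokens)             # (2, 4, 32)
```

In an actual model, `feats_a` and `feats_b` would come from frozen layers of the pre-trained perceptual network, so each slot can bind to features at more than one level of abstraction.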