RGB-D semantic segmentation has been widely studied and has achieved remarkable performance. However, traditional methods fall short in exploiting the complementary cues of different modalities. How to effectively fuse multi-modality features and multi-level features remains a challenging problem in RGB-D semantic segmentation. To address this issue, we propose a novel network named the cross-level-guided transformer (CLGFormer). Specifically, we devise a dynamic selection fusion (DSF) module to diminish the data discrepancy during multi-modality feature fusion. It adaptively selects multi-scale RGB features under the guidance of depth and employs channel attention to concentrate on significant channels. To bridge the semantic gap between low-level detailed features and high-level semantic features, we adopt a cross-level-guided transformer (CLGT) module based on a bi-directional cross-guidance strategy. The CLGT module explicitly models spatial long-range dependencies and channel inter-dependencies to enhance the efficiency of multi-level feature fusion. Finally, an edge loss is introduced to alleviate the problem of edge inconsistency. Extensive experiments demonstrate that our CLGFormer outperforms other state-of-the-art methods, obtaining 52.0% mIoU on NYUv2, 81.4% mIoU on Cityscapes, and 57.15% mIoU on the Semantic KITTI dataset.