In this paper, we address the problem of detecting 3D objects from multi-view images. Current query-based methods rely on global 3D position embeddings (PE) to learn the geometric correspondence between images and 3D space. We argue that directly interacting 2D image features with global 3D PE increases the difficulty of learning the view transformation due to the variation of camera extrinsics. Thus we propose a novel method based on CAmera view Position Embedding, called CAPE. We form the 3D position embeddings under the local camera-view coordinate system instead of the global coordinate system, so that the 3D position embedding is free of encoding camera extrinsic parameters. Furthermore, we extend CAPE to temporal modeling by exploiting the object queries of previous frames and encoding the ego motion to boost 3D object detection. CAPE achieves state-of-the-art performance (61.0% NDS and 52.5% mAP) among all LiDAR-free methods on the nuScenes dataset. Code and models are available (Paddle3D and PyTorch implementations).
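To make the core idea concrete, the following is a minimal sketch (not the authors' exact implementation) contrasting a camera-view position embedding, which encodes frustum points in each camera's local frame and thus never sees the extrinsics, with a global-frame embedding that must absorb them. The module names, tensor shapes, and the MLP encoder are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CameraViewPE(nn.Module):
    """Sketch: encode 3D frustum points expressed in CAMERA coordinates,
    so the embedding is independent of camera extrinsics (CAPE-style idea)."""

    def __init__(self, embed_dim=256, num_depth_bins=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * num_depth_bins, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, cam_frustum_pts):
        # cam_frustum_pts: (N_cams, H, W, D, 3), points back-projected from
        # pixel/depth-bin locations using intrinsics only.
        n, h, w, d, _ = cam_frustum_pts.shape
        return self.mlp(cam_frustum_pts.reshape(n, h, w, d * 3))


class GlobalPE(nn.Module):
    """Sketch of the baseline-style alternative: the same points are first
    mapped into the global frame, so the embedding must encode extrinsics."""

    def __init__(self, embed_dim=256, num_depth_bins=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * num_depth_bins, embed_dim),
            nn.ReLU(inplace=True),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, cam_frustum_pts, cam2global):
        # cam2global: (N_cams, 4, 4) extrinsic matrices (camera -> global).
        n, h, w, d, _ = cam_frustum_pts.shape
        ones = torch.ones_like(cam_frustum_pts[..., :1])
        pts_h = torch.cat([cam_frustum_pts, ones], dim=-1)          # homogeneous
        global_pts = torch.einsum('nij,nhwdj->nhwdi', cam2global, pts_h)[..., :3]
        return self.mlp(global_pts.reshape(n, h, w, d * 3))
```

Because the camera-view variant never consumes the extrinsic matrices, the position embedding does not have to model how extrinsics vary across cameras and frames; in the actual method that transformation is handled separately (e.g., via query embeddings and ego-motion encoding), which is the property the abstract highlights.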