Camera-based Bird's-Eye-View (BEV) 3D object detection is a challenging yet essential task in autonomous driving perception, as it effectively alleviates the ambiguity caused by object overlap and occlusion. 3D multi-object tracking is likewise one of the most important perception tasks in autonomous driving, but it suffers from the narrow field of view provided by a single camera. In this paper, we present a novel framework called BEVMOT, which unifies multi-object detection and tracking in a single framework at a considerable inference speed. The encoder of our framework generates a 360-degree panoramic bird's-eye-view representation around the ego vehicle from the multiple cameras mounted on the car, broadening the perception field. An efficient decoder, composed of multi-head attention and deformable attention and followed by a multi-object detection and tracking head, learns object center points and tracking embeddings to obtain object boxes and tracking IDs directly, without NMS post-processing. Meanwhile, the tracking branch uses the tracking embeddings to initialize new trajectories, update existing ones, and perform data association frame by frame. Extensive experiments show that our approach achieves strong performance on the nuScenes dataset, surpassing many classic methods in terms of the AMOTA and AMOTP metrics.
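The frame-by-frame association driven by tracking embeddings could, in minimal form, be sketched as a greedy similarity matcher. This is an illustrative assumption, not the paper's actual algorithm: the function name, the cosine-similarity measure, and the threshold are all hypothetical.

```python
import numpy as np

def associate(track_embs, det_embs, sim_thresh=0.5):
    """Greedy embedding-based data association (hypothetical sketch).

    track_embs: (T, D) embeddings of existing trajectories
    det_embs:   (N, D) embeddings of current-frame detections
    Returns (matches, new_dets): matched (track_idx, det_idx) pairs,
    plus indices of unmatched detections that start new trajectories.
    """
    if len(track_embs) == 0:
        # no existing trajectories: every detection starts a new one
        return [], list(range(len(det_embs)))
    # cosine similarity matrix between trajectory and detection embeddings
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    sim = t @ d.T
    matches, used_t, used_d = [], set(), set()
    # greedily take the highest-similarity pairs above the threshold
    for ti, di in sorted(np.ndindex(sim.shape), key=lambda p: -sim[p]):
        if ti in used_t or di in used_d or sim[ti, di] < sim_thresh:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    new_dets = [i for i in range(len(det_embs)) if i not in used_d]
    return matches, new_dets
```

Matched detections would update their trajectories, while unmatched ones initialize new trajectories, mirroring the tracking branch described above.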