Abstract
With the rapid advancement of autonomous driving technology, 3-D multiobject tracking (MOT) has become critical to the safety of intelligent vehicles. The tracking-by-detection (TBD) framework delivers strong 3-D tracking performance while remaining simple and efficient. However, existing light detection and ranging (LiDAR)-camera systems struggle in complex scenarios, and the high-precision motion data provided by millimeter wave radar (radar) remain underutilized. This article presents a multimodal 3-D MOT model that integrates LiDAR, camera, and radar data to overcome these limitations and improve tracking performance across diverse environments. The proposed model uses the radar's high-precision motion data to adaptively refine confidence scores and to drive an adaptive nonmaximum suppression (A-NMS) step, improving the accuracy of the preprocessing stage. It also incorporates LiDAR and camera features to strengthen data association and trajectory estimation. A radar feature extraction network based on a graph attention network (GAT) further improves tracking accuracy by refining motion metrics. Extensive experiments on the nuScenes dataset show that our model sets a new state of the art (SOTA) with 74.6% AMOTA, 288 identity switches (IDS), and 0.514 m AMOTP, while running in real time at 21.8 Hz. In generalizability tests across various real-vehicle scenarios, our model also maintains high tracking accuracy, real-time performance, and robustness.
Original language | English |
---|---|
Pages (from-to) | 16310-16320 |
Number of pages | 11 |
Journal | IEEE Sensors Journal |
Volume | 25 |
Issue number | 9 |
Publication status | Published - 2025 |
Externally published | Yes |
Keywords
- 3-D multiobject tracking (MOT)
- adaptive enhancement
- millimeter wave radar
- sensor fusion