ACNTrack: Agent cross-attention guided Multimodal Multi-Object Tracking with Neural Kalman Filter

Lian Zhang, Lingxue Wang*, Yuzhen Wu, Mingkun Chen, Dezhi Zheng, Yi Cai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Exploring and associating the complementary information in visible, thermal infrared, and low-light images is crucial for advancing Multimodal Multi-Object Tracking (MMOT). Previous studies have shown that efficient feature fusion modules can improve tracking performance in complex environments, but these methods are often limited in global feature interaction and computational efficiency. We present a novel multimodal multi-object tracker based on the tracking-by-detection paradigm, comprising a multimodal detector and a data associator. A dual cross-attention feature fusion detection framework, built on an agent attention mechanism, is introduced to make feature interaction more efficient and to capture cross-modal complementary information. To capture the detailed and complex information in each modality more accurately, we propose a Feature Pyramid Shared Convolution (FPS-Conv) operation that replaces the Spatial Pyramid Pooling Fast (SPPF) operation within the detector. Additionally, a Neural Kalman Filter (NKF), which dynamically adjusts process and observation noise according to the current motion state, is developed to improve the performance of the data associator. The fusion architecture significantly reduces computational complexity while maintaining high-quality feature interactions, and the proposed NKF handles diverse motion patterns better than traditional fixed-parameter approaches. Experimental results validate these advantages: the proposed method achieves state-of-the-art results on the KAIST, FLIR, and UniRTL test datasets and competitive performance on the VT-MOT dataset.
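The abstract does not specify the fusion module in detail, so the following is only a minimal NumPy sketch of the general agent-attention idea applied across two modalities, not the authors' implementation. It assumes single-head attention without learned projections, and it assumes agent tokens are obtained by average-pooling the query tokens; the names `agent_cross_attention`, `num_agents`, `visible`, and `thermal` are hypothetical. A small set of agent tokens first aggregates information from the other modality, then broadcasts it back to the queries, so the cost grows with N × n_agents rather than N × M.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def agent_cross_attention(q_tokens, kv_tokens, num_agents):
    """Two-step agent attention between modalities (illustrative sketch).

    q_tokens:  (N, d) tokens of the modality being enhanced.
    kv_tokens: (M, d) tokens of the complementary modality.
    Returns an (N, d) array of cross-modal features.
    """
    n, d = q_tokens.shape
    if not 1 <= num_agents <= n:
        raise ValueError("num_agents must be between 1 and the number of query tokens")

    # Assumption: agent tokens are average-pooled groups of query tokens.
    groups = np.array_split(np.arange(n), num_agents)
    agents = np.stack([q_tokens[g].mean(axis=0) for g in groups])  # (n_agents, d)

    scale = 1.0 / np.sqrt(d)
    # Step 1 (aggregation): agents attend to the other modality's tokens.
    agent_values = softmax(agents @ kv_tokens.T * scale) @ kv_tokens  # (n_agents, d)
    # Step 2 (broadcast): queries attend to the agents.
    return softmax(q_tokens @ agents.T * scale) @ agent_values  # (N, d)


# Synthetic stand-ins for flattened feature maps from two modalities.
rng = np.random.default_rng(0)
visible = rng.standard_normal((64, 16))
thermal = rng.standard_normal((64, 16))

# "Dual" direction: each modality is enhanced by the other, with a residual add.
visible_enhanced = visible + agent_cross_attention(visible, thermal, num_agents=8)
thermal_enhanced = thermal + agent_cross_attention(thermal, visible, num_agents=8)
```

With 64 tokens per modality and 8 agents, each direction computes two attention maps of size 8 × 64 and 64 × 8 instead of one 64 × 64 map, which is the source of the efficiency gain the abstract refers to.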

Original language: English
Article number: 130811
Journal: Neurocomputing
Volume: 650
Publication status: Published - 14 Oct 2025
