High-Order Temporal Context-Aware Aerial Tracking with Heterogeneous Visual Experts

Shichao Zhou, Xiangpan Fan, Zhuowei Wang, Wenzheng Wang*, Yunpu Zhang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Visual tracking from the unmanned aerial vehicle (UAV) perspective is at the core of many low-altitude remote sensing applications. Most aerial trackers follow the “tracking-by-detection” paradigm or its temporal-context-embedded variants, which rely solely on visual appearance cues for representation learning and for estimating the spatial likelihood of the target. However, the variation of the target appearance across consecutive frames is inherently unpredictable, which degrades the robustness of the temporal context-aware representation. To address this concern, we advocate adding visual motion cues, which exhibit predictable temporal continuity, to obtain a more complete temporal context-aware representation, and we introduce a dual-stream tracker built on explicit heterogeneous visual tracking experts. Our technical contributions are threefold: (1) a high-order temporal context-aware representation integrates motion and appearance cues over a temporal context queue, (2) bidirectional cross-domain refinement enhances the feature representation through cross-attention-based mutual guidance, and (3) consistent decision-making enables anti-drifting localization via dynamic gating and failure-aware recovery. Extensive experiments on four UAV benchmarks (UAV123, UAV123@10fps, UAV20L, and DTB70) show that our method outperforms existing aerial trackers in terms of success rate and precision, particularly in occlusion and fast-motion scenarios. This tracking stability highlights the method's potential for real-world UAV applications.
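The abstract does not give implementation details, so the following is only a toy sketch of how a temporal context queue, a dynamic gate between two experts, and a failure-aware fallback might fit together. Every name here (`TemporalQueue`, `gate`, `track_step`, `temperature`, `fail_thresh`) is hypothetical, and the constant-velocity motion model and softmax gate are illustrative stand-ins rather than the authors' method.

```python
from collections import deque
from math import exp


class TemporalQueue:
    """Fixed-length queue of recent target centers (hypothetical helper)."""

    def __init__(self, maxlen=5):
        self.centers = deque(maxlen=maxlen)

    def push(self, center):
        self.centers.append(center)

    def predict(self):
        """Constant-velocity extrapolation from the two most recent centers."""
        if not self.centers:
            raise ValueError("queue is empty")
        if len(self.centers) == 1:
            return self.centers[-1]
        (x0, y0), (x1, y1) = self.centers[-2], self.centers[-1]
        return (2 * x1 - x0, 2 * y1 - y0)


def gate(app_conf, mot_conf, temperature=0.1):
    """Weight of the appearance expert: a two-way softmax over confidences."""
    return 1.0 / (1.0 + exp((mot_conf - app_conf) / temperature))


def track_step(queue, app_center, app_conf, mot_conf, fail_thresh=0.3):
    """Fuse the two experts for one frame; returns (center, failed)."""
    mot_center = queue.predict()
    if max(app_conf, mot_conf) < fail_thresh:
        # Assumed recovery rule: neither expert is trusted, so report the
        # motion prediction and leave the queue untouched.
        return mot_center, True
    w = gate(app_conf, mot_conf)
    fused = tuple(w * a + (1 - w) * m for a, m in zip(app_center, mot_center))
    queue.push(fused)
    return fused, False
```

In this sketch the gate shifts weight toward whichever expert reports higher confidence, and a frame flagged as failed does not contaminate the stored context.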

Original language: English
Article number: 2237
Journal: Remote Sensing
Volume: 17
Issue number: 13
Publication status: Published - Jul 2025
Externally published: Yes

Keywords

  • decision-making
  • low-altitude remote sensing
  • motion analysis
  • optical tracking
  • temporal reasoning
  • unmanned aerial vehicle
