TY - JOUR
T1 - Scalable Cooperative Decision-Making in Multi-UAV Confrontations
T2 - An Attention-Based Multi-Agent Actor-Critic Approach
AU - Chen, Can
AU - Song, Tao
AU - Mo, Li
AU - Lv, Maolong
AU - Yu, Yinan
N1 - Publisher Copyright:
© 1965-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - With the increasing use of unmanned aerial vehicles (UAVs) in military operations, autonomous cooperative decision-making for multiple UAVs in aerial confrontations has become a critical research challenge. This paper presents an attention-based multi-agent actor-critic (AMAAC) algorithm for UAV aerial confrontation decision-making. The algorithm combines multi-head attention and self-play within the centralized training-distributed execution (CTDE) framework, extending the actor-critic approach based on the missile hit probability prediction model (MHPAC) to multi-UAV scenarios. A fighter observation encoder (OFE) and a centralized critic network based on the attention mechanism are introduced to adapt to varying numbers of UAVs (different scales) and to enhance training performance. Additionally, self-play-based extended training is used to generalize offensive and defensive strategies from small-scale aerial confrontations to larger scenarios. Experimental results demonstrate that the AMAAC algorithm achieves superior training effectiveness, and the strategies it produces perform well across various confrontation scales, even beyond the scale of the training scenario. Compared with other decision-making algorithms, such as multi-agent proximal policy optimization (MAPPO), multi-agent hierarchical policy gradient (MAHPG), and the state-event-condition-action (SECA) algorithm, the AMAAC-trained strategies yield higher win ratios and kill-death ratios in different scenarios, validating the algorithm's effectiveness and scalability.
AB - With the increasing use of unmanned aerial vehicles (UAVs) in military operations, autonomous cooperative decision-making for multiple UAVs in aerial confrontations has become a critical research challenge. This paper presents an attention-based multi-agent actor-critic (AMAAC) algorithm for UAV aerial confrontation decision-making. The algorithm combines multi-head attention and self-play within the centralized training-distributed execution (CTDE) framework, extending the actor-critic approach based on the missile hit probability prediction model (MHPAC) to multi-UAV scenarios. A fighter observation encoder (OFE) and a centralized critic network based on the attention mechanism are introduced to adapt to varying numbers of UAVs (different scales) and to enhance training performance. Additionally, self-play-based extended training is used to generalize offensive and defensive strategies from small-scale aerial confrontations to larger scenarios. Experimental results demonstrate that the AMAAC algorithm achieves superior training effectiveness, and the strategies it produces perform well across various confrontation scales, even beyond the scale of the training scenario. Compared with other decision-making algorithms, such as multi-agent proximal policy optimization (MAPPO), multi-agent hierarchical policy gradient (MAHPG), and the state-event-condition-action (SECA) algorithm, the AMAAC-trained strategies yield higher win ratios and kill-death ratios in different scenarios, validating the algorithm's effectiveness and scalability.
KW - Aerial confrontation
KW - attention mechanism
KW - reinforcement learning
KW - scalability
KW - unmanned aerial vehicles
UR - http://www.scopus.com/pages/publications/105005781612
U2 - 10.1109/TAES.2025.3571405
DO - 10.1109/TAES.2025.3571405
M3 - Article
AN - SCOPUS:105005781612
SN - 0018-9251
JO - IEEE Transactions on Aerospace and Electronic Systems
JF - IEEE Transactions on Aerospace and Electronic Systems
ER -