Dynamic and adaptive learning for autonomous decision-making in beyond visual range air combat

Wenfei Wang; Le Ru; Maolong Lv; Li Mo

doi:10.1016/j.ast.2025.110327

Wenfei Wang, Le Ru^*, Maolong Lv, Li Mo

^* Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The environment of beyond-visual-range (BVR) air combat is complex and dynamic, making traditional decision-making methods insufficient for modern combat scenarios. This paper first analyzes the confrontation process in BVR air combat and develops a corresponding decision-making model for air combat. To address the challenge of coupling maneuver and missile launch decisions, we propose a hybrid bifurcation action space design method, allowing for more precise control and improved learning. Additionally, this paper introduces Progressive Opponent Reinforcement Learning (PORL), which incorporates progressively challenging opponents to simulate real-world adversary strategies. Based on the Soft Actor-Critic (SAC) algorithm, this method strengthens the exploration and utilization of learning balance through maximum entropy, and dynamically adjusts the opponent's tactics according to the agent's performance, thus improving the agent's learning efficiency and adaptability in the rapidly changing confrontation environment. Furthermore, a dynamic opponent sampling mechanism is designed to select adversaries with varying difficulty levels based on the agent's current performance, ensuring a balanced training process. Simulation results demonstrate that the proposed decision-making framework significantly improves the autonomous decision-making capabilities and countermeasure effectiveness of agents in BVR air combat.
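The abstract describes the dynamic opponent sampling mechanism only at a high level, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes a fixed pool of opponents ordered by difficulty, a tracked win rate for the agent against each one, and a rule that favours opponents the agent beats about half the time. The names `opponent_weights` and `sample_opponent` and the parameters `target` and `sharpness` are hypothetical.

```python
import math
import random


def opponent_weights(win_rates, target=0.5, sharpness=5.0):
    """Return normalized sampling weights, one per opponent.

    Assumption (not from the paper): an opponent is most useful for
    training when the agent's win rate against it is close to `target`,
    so the weight decays exponentially with the distance from `target`.
    """
    scores = [math.exp(-sharpness * abs(w - target)) for w in win_rates]
    total = sum(scores)
    return [s / total for s in scores]


def sample_opponent(win_rates, rng=random):
    """Draw the index of the next training opponent from the weighted pool."""
    weights = opponent_weights(win_rates)
    return rng.choices(range(len(win_rates)), weights=weights, k=1)[0]


# Hypothetical win rates against an easy, a medium, and a hard opponent.
win_rates = [0.95, 0.55, 0.10]
weights = opponent_weights(win_rates)
next_opponent = sample_opponent(win_rates, rng=random.Random(0))
```

With these example win rates, the medium opponent receives the largest weight, while the easy and hard opponents remain reachable with lower probability, which is one simple way to keep training balanced as the agent's performance changes.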

Original language	English
Article number	110327
Journal	Aerospace Science and Technology
Volume	163
DOI	http://doi.org/10.1016/j.ast.2025.110327
Publication status	Published - Aug 2025
Published externally	Yes

Cite this

@article{e3e00fd347744af287fdcacd86686953,

title = "Dynamic and adaptive learning for autonomous decision-making in beyond visual range air combat",

abstract = "The environment of beyond-visual-range (BVR) air combat is complex and dynamic, making traditional decision-making methods insufficient for modern combat scenarios. This paper first analyzes the confrontation process in BVR air combat and develops a corresponding decision-making model for air combat. To address the challenge of coupling maneuver and missile launch decisions, we propose a hybrid bifurcation action space design method, allowing for more precise control and improved learning. Additionally, this paper introduces Progressive Opponent Reinforcement Learning (PORL), which incorporates progressively challenging opponents to simulate real-world adversary strategies. Based on the Soft Actor-Critic (SAC) algorithm, this method strengthens the exploration and utilization of learning balance through maximum entropy, and dynamically adjusts the opponent's tactics according to the agent's performance, thus improving the agent's learning efficiency and adaptability in the rapidly changing confrontation environment. Furthermore, a dynamic opponent sampling mechanism is designed to select adversaries with varying difficulty levels based on the agent's current performance, ensuring a balanced training process. Simulation results demonstrate that the proposed decision-making framework significantly improves the autonomous decision-making capabilities and countermeasure effectiveness of agents in BVR air combat.",

keywords = "Beyond visual range air combat, Maneuver decision-making, Opponent learning, Reinforcement learning (RL)",

author = "Wenfei Wang and Le Ru and Maolong Lv and Li Mo",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier Masson SAS",

year = "2025",

month = aug,

doi = "10.1016/j.ast.2025.110327",

language = "English",

volume = "163",

journal = "Aerospace Science and Technology",

issn = "1270-9638",

publisher = "Elsevier Masson s.r.l.",

}

TY - JOUR

T1 - Dynamic and adaptive learning for autonomous decision-making in beyond visual range air combat

AU - Wang, Wenfei

AU - Ru, Le

AU - Lv, Maolong

AU - Mo, Li

PY - 2025/8

Y1 - 2025/8

N2 - The environment of beyond-visual-range (BVR) air combat is complex and dynamic, making traditional decision-making methods insufficient for modern combat scenarios. This paper first analyzes the confrontation process in BVR air combat and develops a corresponding decision-making model for air combat. To address the challenge of coupling maneuver and missile launch decisions, we propose a hybrid bifurcation action space design method, allowing for more precise control and improved learning. Additionally, this paper introduces Progressive Opponent Reinforcement Learning (PORL), which incorporates progressively challenging opponents to simulate real-world adversary strategies. Based on the Soft Actor-Critic (SAC) algorithm, this method strengthens the exploration and utilization of learning balance through maximum entropy, and dynamically adjusts the opponent's tactics according to the agent's performance, thus improving the agent's learning efficiency and adaptability in the rapidly changing confrontation environment. Furthermore, a dynamic opponent sampling mechanism is designed to select adversaries with varying difficulty levels based on the agent's current performance, ensuring a balanced training process. Simulation results demonstrate that the proposed decision-making framework significantly improves the autonomous decision-making capabilities and countermeasure effectiveness of agents in BVR air combat.

AB - The environment of beyond-visual-range (BVR) air combat is complex and dynamic, making traditional decision-making methods insufficient for modern combat scenarios. This paper first analyzes the confrontation process in BVR air combat and develops a corresponding decision-making model for air combat. To address the challenge of coupling maneuver and missile launch decisions, we propose a hybrid bifurcation action space design method, allowing for more precise control and improved learning. Additionally, this paper introduces Progressive Opponent Reinforcement Learning (PORL), which incorporates progressively challenging opponents to simulate real-world adversary strategies. Based on the Soft Actor-Critic (SAC) algorithm, this method strengthens the exploration and utilization of learning balance through maximum entropy, and dynamically adjusts the opponent's tactics according to the agent's performance, thus improving the agent's learning efficiency and adaptability in the rapidly changing confrontation environment. Furthermore, a dynamic opponent sampling mechanism is designed to select adversaries with varying difficulty levels based on the agent's current performance, ensuring a balanced training process. Simulation results demonstrate that the proposed decision-making framework significantly improves the autonomous decision-making capabilities and countermeasure effectiveness of agents in BVR air combat.

KW - Beyond visual range air combat

KW - Maneuver decision-making

KW - Opponent learning

KW - Reinforcement learning (RL)

UR - http://www.scopus.com/pages/publications/105005206993

U2 - 10.1016/j.ast.2025.110327

DO - 10.1016/j.ast.2025.110327

M3 - Article

AN - SCOPUS:105005206993

SN - 1270-9638

VL - 163

JO - Aerospace Science and Technology

JF - Aerospace Science and Technology

M1 - 110327

ER -
