基于多智能体深度强化学习的无人平台箔条干扰末端防御动态决策方法

Chuanhao Li; Zhenjun Ming; Guoxin Wang; Yan Yan; Wei Ding; Silai Wan; Tao Ding

doi:10.12382/bgxb.2024.0251

Chuanhao Li, Zhenjun Ming^*, Guoxin Wang, Yan Yan, Wei Ding, Silai Wan, Tao Ding

^*Corresponding author for this work

School of Mechanical and Vehicle Engineering (机械与车辆学院)

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Chaff centroid jamming from unmanned platforms is an important means of terminal defense against missiles. The ability to make intelligent decisions about platform maneuvering and chaff launching is a key factor in whether strategic assets can be protected successfully. Current decision-making methods, such as computational analysis based on mechanism models and space exploration based on heuristic algorithms, suffer from a low degree of intelligence, poor adaptability, and slow decision-making. A dynamic decision-making method for chaff jamming in terminal defense, based on multi-agent deep reinforcement learning, is proposed. The problem of cooperative multi-platform chaff jamming for terminal defense is defined, and a simulation environment is constructed. A missile guidance and fuze model, an unmanned jamming platform maneuvering model, a chaff diffusion model, and a centroid jamming model are established. The centroid jamming decision problem is transformed into a Markov decision problem: a decision-making agent is constructed, the state and action spaces are defined, and a reward function is set. The decision-making agent is trained using the multi-agent proximal policy optimization (MAPPO) algorithm. The simulation results show that, compared with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, the proposed method reduces the training time by 85.5% and increases the success rate of asset protection by 3.84. Compared with the genetic algorithm (GA), it reduces the decision time by 99.96% and increases the success rate of asset protection by 1.12.
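The two ideas the abstract relies on can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their models, so the function names, the 2-D geometry, and all numeric values below are hypothetical. The first function assumes the common textbook description of centroid jamming, in which a seeker that cannot resolve the asset from the chaff cloud tracks their radar-cross-section (RCS) weighted centroid. The second shows the standard per-sample clipped surrogate term of proximal policy optimization, which MAPPO applies to each agent's policy.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def rcs_weighted_centroid(positions: Sequence[Point], rcs: Sequence[float]) -> Point:
    """RCS-weighted centroid of scatterers assumed to share one resolution cell."""
    if len(positions) != len(rcs) or not positions:
        raise ValueError("positions and rcs must be non-empty and of equal length")
    total = sum(rcs)
    if total <= 0:
        raise ValueError("total RCS must be positive")
    x = sum(p[0] * s for p, s in zip(positions, rcs)) / total
    y = sum(p[1] * s for p, s in zip(positions, rcs)) / total
    return (x, y)


def ppo_clipped_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


# Hypothetical scenario: asset at the origin (RCS 1000 m^2) and a chaff cloud
# 300 m away (RCS 3000 m^2). The apparent aim point shifts toward the chaff.
aim_point = rcs_weighted_centroid([(0.0, 0.0), (300.0, 0.0)], [1000.0, 3000.0])

# A probability ratio of 1.5 with a positive advantage is clipped at 1 + eps.
surrogate = ppo_clipped_term(ratio=1.5, advantage=2.0)
```

In this toy setting the aim point lands 225 m from the asset, which shows why the timing and placement of the chaff relative to the asset matter to the decision-making agent. The clipping keeps each policy update close to the policy that collected the data.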

Translated title of the contribution	Dynamic Decision-making Method of Unmanned Platform Chaff Jamming for Terminal Defense Based on Multi-agent Deep Reinforcement Learning
Original language	Traditional Chinese
Article number	240251
Journal	Binggong Xuebao/Acta Armamentarii
Volume	46
Issue number	3
DOI	http://doi.org/10.12382/bgxb.2024.0251
Publication status	Published - 31 Mar 2025

Keywords

centroid jamming
chaff jamming
electronic countermeasure
multi-agent reinforcement learning
terminal defense
unmanned platform

Cite this

@article{d9f31ac1881c47ecbe581606fb534d83,

title = "基于多智能体深度强化学习的无人平台箔条干扰末端防御动态决策方法",

abstract = "Chaff centroid jamming from unmanned platforms is an important means of terminal defense against missiles. The ability to make intelligent decisions about platform maneuvering and chaff launching is a key factor in whether strategic assets can be protected successfully. Current decision-making methods, such as computational analysis based on mechanism models and space exploration based on heuristic algorithms, suffer from a low degree of intelligence, poor adaptability, and slow decision-making. A dynamic decision-making method for chaff jamming in terminal defense, based on multi-agent deep reinforcement learning, is proposed. The problem of cooperative multi-platform chaff jamming for terminal defense is defined, and a simulation environment is constructed. A missile guidance and fuze model, an unmanned jamming platform maneuvering model, a chaff diffusion model, and a centroid jamming model are established. The centroid jamming decision problem is transformed into a Markov decision problem: a decision-making agent is constructed, the state and action spaces are defined, and a reward function is set. The decision-making agent is trained using the multi-agent proximal policy optimization (MAPPO) algorithm. The simulation results show that, compared with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, the proposed method reduces the training time by 85.5\% and increases the success rate of asset protection by 3.84. Compared with the genetic algorithm (GA), it reduces the decision time by 99.96\% and increases the success rate of asset protection by 1.12.",

keywords = "centroid jamming, chaff jamming, electronic countermeasure, multi-agent reinforcement learning, terminal defense, unmanned platform",

author = "Chuanhao Li and Zhenjun Ming and Guoxin Wang and Yan Yan and Wei Ding and Silai Wan and Tao Ding",

year = "2025",

month = mar,

day = "31",

doi = "10.12382/bgxb.2024.0251",

language = "繁体中文",

volume = "46",

journal = "Binggong Xuebao/Acta Armamentarii",

issn = "1000-1093",

publisher = "China Ordnance Society",

number = "3",

}

TY - JOUR

T1 - 基于多智能体深度强化学习的无人平台箔条干扰末端防御动态决策方法

AU - Li, Chuanhao

AU - Ming, Zhenjun

AU - Wang, Guoxin

AU - Yan, Yan

AU - Ding, Wei

AU - Wan, Silai

AU - Ding, Tao

PY - 2025/3/31

Y1 - 2025/3/31

N2 - Chaff centroid jamming from unmanned platforms is an important means of terminal defense against missiles. The ability to make intelligent decisions about platform maneuvering and chaff launching is a key factor in whether strategic assets can be protected successfully. Current decision-making methods, such as computational analysis based on mechanism models and space exploration based on heuristic algorithms, suffer from a low degree of intelligence, poor adaptability, and slow decision-making. A dynamic decision-making method for chaff jamming in terminal defense, based on multi-agent deep reinforcement learning, is proposed. The problem of cooperative multi-platform chaff jamming for terminal defense is defined, and a simulation environment is constructed. A missile guidance and fuze model, an unmanned jamming platform maneuvering model, a chaff diffusion model, and a centroid jamming model are established. The centroid jamming decision problem is transformed into a Markov decision problem: a decision-making agent is constructed, the state and action spaces are defined, and a reward function is set. The decision-making agent is trained using the multi-agent proximal policy optimization (MAPPO) algorithm. The simulation results show that, compared with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, the proposed method reduces the training time by 85.5% and increases the success rate of asset protection by 3.84. Compared with the genetic algorithm (GA), it reduces the decision time by 99.96% and increases the success rate of asset protection by 1.12.

AB - Chaff centroid jamming from unmanned platforms is an important means of terminal defense against missiles. The ability to make intelligent decisions about platform maneuvering and chaff launching is a key factor in whether strategic assets can be protected successfully. Current decision-making methods, such as computational analysis based on mechanism models and space exploration based on heuristic algorithms, suffer from a low degree of intelligence, poor adaptability, and slow decision-making. A dynamic decision-making method for chaff jamming in terminal defense, based on multi-agent deep reinforcement learning, is proposed. The problem of cooperative multi-platform chaff jamming for terminal defense is defined, and a simulation environment is constructed. A missile guidance and fuze model, an unmanned jamming platform maneuvering model, a chaff diffusion model, and a centroid jamming model are established. The centroid jamming decision problem is transformed into a Markov decision problem: a decision-making agent is constructed, the state and action spaces are defined, and a reward function is set. The decision-making agent is trained using the multi-agent proximal policy optimization (MAPPO) algorithm. The simulation results show that, compared with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, the proposed method reduces the training time by 85.5% and increases the success rate of asset protection by 3.84. Compared with the genetic algorithm (GA), it reduces the decision time by 99.96% and increases the success rate of asset protection by 1.12.

KW - centroid jamming

KW - chaff jamming

KW - electronic countermeasure

KW - multi-agent reinforcement learning

KW - terminal defense

KW - unmanned platform

UR - http://www.scopus.com/pages/publications/105001473372

U2 - 10.12382/bgxb.2024.0251

DO - 10.12382/bgxb.2024.0251

M3 - 文章

AN - SCOPUS:105001473372

SN - 1000-1093

VL - 46

JO - Binggong Xuebao/Acta Armamentarii

JF - Binggong Xuebao/Acta Armamentarii

IS - 3

M1 - 240251

ER -
