Adaptive Dynamic Programming for Optimal Control of Unknown LTI System via Interval Excitation

Yong Sheng Ma; Jian Sun; Yong Xu; Shi Sheng Cui; Zheng Guang Wu

doi:10.1109/TAC.2025.3542328

Corresponding author: Yong Xu

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

In this article, we investigate the optimal control problem for an unknown linear time-invariant system. To solve this problem, a novel composite policy iteration algorithm based on adaptive dynamic programming is developed to adaptively learn the optimal control policy from system data. Existing methods require an initial stabilizing control policy, the persistence of excitation (PE) condition, and data storage to ensure convergence of the algorithm. In contrast, the proposed method relaxes these restrictions. Specifically, an adaptive parameter is designed to remove the requirement of an initial stabilizing control policy. In addition, an online data calculation scheme is proposed, which not only replaces stored historical data with online data but also relaxes the PE condition to the interval excitation condition. Simulation results demonstrate the efficacy of the proposed algorithm, and a comparison with existing algorithms demonstrates its superiority.

Original language	English
Pages (from-to)	4896-4903
Number of pages	8
Journal	IEEE Transactions on Automatic Control
Volume	70
Issue number	7
DOI	http://doi.org/10.1109/TAC.2025.3542328
Publication status	Published - 2025
Externally published	Yes

Cite this

@article{f380a4fd86b64737b95303c11718ddf5,

title = "Adaptive Dynamic Programming for Optimal Control of Unknown LTI System via Interval Excitation",

abstract = "In this article, we investigate the optimal control problem for an unknown linear time-invariant system. To solve this problem, a novel composite policy iteration algorithm based on adaptive dynamic programming is developed to adaptively learn the optimal control policy from system data. Existing methods require an initial stabilizing control policy, the persistence of excitation (PE) condition, and data storage to ensure convergence of the algorithm. In contrast, the proposed method relaxes these restrictions. Specifically, an adaptive parameter is designed to remove the requirement of an initial stabilizing control policy. In addition, an online data calculation scheme is proposed, which not only replaces stored historical data with online data but also relaxes the PE condition to the interval excitation condition. Simulation results demonstrate the efficacy of the proposed algorithm, and a comparison with existing algorithms demonstrates its superiority.",

keywords = "Adaptive dynamic programming (ADP), optimal control, policy iteration (PI)",

author = "Ma, {Yong Sheng} and Jian Sun and Yong Xu and Cui, {Shi Sheng} and Wu, {Zheng Guang}",

note = "Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2025",

doi = "10.1109/TAC.2025.3542328",

language = "English",

volume = "70",

pages = "4896--4903",

journal = "IEEE Transactions on Automatic Control",

issn = "0018-9286",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "7",

}

TY - JOUR

T1 - Adaptive Dynamic Programming for Optimal Control of Unknown LTI System via Interval Excitation

AU - Ma, Yong Sheng

AU - Sun, Jian

AU - Xu, Yong

AU - Cui, Shi Sheng

AU - Wu, Zheng Guang

PY - 2025

Y1 - 2025

AB - In this article, we investigate the optimal control problem for an unknown linear time-invariant system. To solve this problem, a novel composite policy iteration algorithm based on adaptive dynamic programming is developed to adaptively learn the optimal control policy from system data. Existing methods require an initial stabilizing control policy, the persistence of excitation (PE) condition, and data storage to ensure convergence of the algorithm. In contrast, the proposed method relaxes these restrictions. Specifically, an adaptive parameter is designed to remove the requirement of an initial stabilizing control policy. In addition, an online data calculation scheme is proposed, which not only replaces stored historical data with online data but also relaxes the PE condition to the interval excitation condition. Simulation results demonstrate the efficacy of the proposed algorithm, and a comparison with existing algorithms demonstrates its superiority.

KW - Adaptive dynamic programming (ADP)

KW - optimal control

KW - policy iteration (PI)

UR - http://www.scopus.com/pages/publications/85217909788

U2 - 10.1109/TAC.2025.3542328

DO - 10.1109/TAC.2025.3542328

M3 - Article

AN - SCOPUS:85217909788

SN - 0018-9286

VL - 70

SP - 4896

EP - 4903

JO - IEEE Transactions on Automatic Control

JF - IEEE Transactions on Automatic Control

IS - 7

ER -
