TY - JOUR
T1 - Adaptive Dynamic Programming for Optimal Control of Unknown LTI System via Interval Excitation
AU - Ma, Yong Sheng
AU - Sun, Jian
AU - Xu, Yong
AU - Cui, Shi Sheng
AU - Wu, Zheng Guang
N1 - Publisher Copyright: © 1963-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - In this article, we investigate the optimal control problem for an unknown linear time-invariant system. To solve this problem, a novel composite policy iteration algorithm based on adaptive dynamic programming is developed to adaptively learn the optimal control policy from system data. Existing methods require an initial stabilizing control policy, the persistence of excitation (PE) condition, and data storage to ensure algorithm convergence. In contrast, the proposed method relaxes these restrictions. Specifically, an adaptive parameter is designed to remove the requirement of an initial stabilizing control policy. In addition, an online data calculation scheme is proposed, which can not only replace the stored historical data with online data, but also relax the PE condition to the interval excitation condition. The simulation results demonstrate the efficacy of the proposed algorithm, and a comparison with existing algorithms demonstrates its superiority.
AB - In this article, we investigate the optimal control problem for an unknown linear time-invariant system. To solve this problem, a novel composite policy iteration algorithm based on adaptive dynamic programming is developed to adaptively learn the optimal control policy from system data. Existing methods require an initial stabilizing control policy, the persistence of excitation (PE) condition, and data storage to ensure algorithm convergence. In contrast, the proposed method relaxes these restrictions. Specifically, an adaptive parameter is designed to remove the requirement of an initial stabilizing control policy. In addition, an online data calculation scheme is proposed, which can not only replace the stored historical data with online data, but also relax the PE condition to the interval excitation condition. The simulation results demonstrate the efficacy of the proposed algorithm, and a comparison with existing algorithms demonstrates its superiority.
KW - Adaptive dynamic programming (ADP)
KW - optimal control
KW - policy iteration (PI)
UR - http://www.scopus.com/pages/publications/85217909788
U2 - 10.1109/TAC.2025.3542328
DO - 10.1109/TAC.2025.3542328
M3 - Article
AN - SCOPUS:85217909788
SN - 0018-9286
VL - 70
SP - 4896
EP - 4903
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
IS - 7
ER -