Academic Paper

Safe Adaptive Dynamic Programming for Multiplayer Systems With Static and Moving No-Entry Regions
Document Type
Periodical
Author
Source
IEEE Transactions on Artificial Intelligence (IEEE Trans. Artif. Intell.), 5(5):2079-2092, May 2024
Subject
Computing and Processing
Dynamic programming
Games
Adaptive systems
Differential games
Kernel
Heuristic algorithms
Safety
Barrier function
multiplayer differential game
optimal avoidance control
reinforcement learning (RL)
safe adaptive dynamic programming (SADP)
state extrapolation
Language
ISSN
2691-4581
Abstract
In recent years, the use of adaptive dynamic programming algorithms to solve the Nash equilibrium problem of multiplayer differential games has received extensive attention. Although approximate solutions can be obtained, current algorithms assume that the operating domain of the system is completely safe, and they require probing noise to excite the system during learning. To address these limitations, this article considers the optimal avoidance control problem, in which the system must avoid multiple static or dynamic no-entry regions while reaching the target point, and proposes a safe adaptive dynamic programming approach. First, the optimal avoidance control problem is formulated, and the multiple no-entry regions are encoded into each player's cost function using a barrier function. Then, a safe adaptive dynamic programming approach is proposed with several novel features, including actor–critic neural networks composed of state-following kernel functions, state extrapolation for achieving virtual excitation, and weight tuning laws for executing adaptive learning. Next, this approach is extended to the case of moving regions, and some theoretical results are provided. Finally, the proposed safe learning scheme is demonstrated on three simulation examples and compared with other control methods.
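As a rough illustration of how no-entry regions might be encoded into a player's cost function with a barrier function, the following minimal sketch adds a reciprocal barrier term to a quadratic running cost. The abstract does not give the article's actual barrier function or cost structure, so the circular regions, the reciprocal form, the scalar weights `q` and `r`, and the names `barrier_penalty` and `running_cost` are all assumptions made for this example.

```python
import math

def barrier_penalty(x, center, radius):
    """Hypothetical reciprocal barrier for one circular no-entry region.

    The penalty grows without bound as the state x approaches the region
    boundary from outside and decays toward zero far away. This form is an
    assumption for illustration, not the barrier used in the article.
    """
    margin = math.dist(x, center) - radius  # signed distance to the boundary
    if margin <= 0.0:
        return math.inf  # on or inside the no-entry region
    return 1.0 / margin

def running_cost(x, u, regions, q=1.0, r=1.0):
    """Assumed quadratic state and control cost plus barrier terms.

    regions is a list of (center, radius) pairs. For a moving region, the
    caller would pass the center evaluated at the current time.
    """
    state_cost = q * sum(xi * xi for xi in x)
    control_cost = r * sum(ui * ui for ui in u)
    avoidance_cost = sum(barrier_penalty(x, c, rad) for c, rad in regions)
    return state_cost + control_cost + avoidance_cost

# One static circular region of radius 1 centered at the origin.
regions = [((0.0, 0.0), 1.0)]
cost_outside = running_cost((3.0, 0.0), (1.0,), regions)
cost_inside = running_cost((0.5, 0.0), (1.0,), regions)
```

Because the barrier term is infinite on and inside a region, any policy that minimizes the accumulated cost is pushed away from the boundary, which is the intuition behind encoding avoidance in the cost rather than as a hard constraint.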