Academic Paper

A Deep Reinforcement Learning Approach Combined With Model-Based Paradigms for Multiagent Formation Control With Collision Avoidance
Document Type
Periodical
Source
IEEE Transactions on Systems, Man, and Cybernetics: Systems (IEEE Trans. Syst. Man Cybern, Syst.). 53(7):4189-4204 Jul, 2023
Subject
Signal Processing and Analysis
Robotics and Control Systems
Power, Energy and Industry Applications
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
General Topics for Engineers
Collision avoidance
Task analysis
Navigation
Maintenance engineering
Adaptation models
Shape
Predictive models
combined model-based and data-driven
deep reinforcement learning (DRL)
formation control
multiagent system (MAS)
Language
ISSN
2168-2216
2168-2232
Abstract
Generating a collision-free formation control strategy for multiagent systems is highly challenging in collaborative navigation tasks, especially in highly dynamic and uncertain environments. Two typical methodologies for this problem are the conventional model-based paradigm and the data-driven paradigm, particularly the widely used deep reinforcement learning (DRL) method. However, both paradigms have inherent drawbacks. In this paper, we present two novel general schemes that combine the two paradigms in an online mode: one in a parallel structure and the other in a serial structure. In the parallel scheme, the outputs of the model-based and DRL-based controllers are lumped together. In the serial scheme, the output of the model-based controller is fed as an input to the DRL-based controller. Both schemes are interpreted from a control-oriented perspective, in which the parallel DRL controller is viewed as a complementary uncertainty compensator and the serial DRL controller as an inverse dynamics estimator. Finally, comprehensive simulations demonstrate the superiority of the proposed schemes, and their effectiveness is further verified by deploying them on a physical experiment platform based on a set of three-wheeled omnidirectional robots.
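The structural difference between the parallel and serial schemes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the proportional law in `model_based_controller`, the linear-plus-tanh stand-in `drl_policy` (in place of a trained DRL actor), the function names, the 2-D state, and the reading of "lumped together" as a simple sum.

```python
import numpy as np


def model_based_controller(position, target, gain=1.0):
    """Hypothetical proportional law steering an agent toward its formation slot."""
    return gain * (target - position)


def drl_policy(observation, weights):
    """Stand-in for a trained DRL actor: a fixed linear map followed by tanh."""
    return np.tanh(weights @ observation)


def parallel_scheme(position, target, weights):
    # Parallel: both controllers act on the same state and their outputs are
    # combined. Summation is an assumption about what "lumped together" means.
    u_model = model_based_controller(position, target)
    u_drl = drl_policy(np.concatenate([position, target]), weights)
    return u_model + u_drl


def serial_scheme(position, target, weights):
    # Serial: the model-based output is appended to the DRL policy's input,
    # and the DRL policy alone produces the final command.
    u_model = model_based_controller(position, target)
    observation = np.concatenate([position, target, u_model])
    return drl_policy(observation, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    position = np.array([0.0, 0.0])
    target = np.array([1.0, 2.0])
    # Random weights play the role of learned parameters (illustrative only).
    print(parallel_scheme(position, target, 0.1 * rng.standard_normal((2, 4))))
    print(serial_scheme(position, target, 0.1 * rng.standard_normal((2, 6))))
```

In this sketch the parallel policy only has to supply a correction on top of the model-based command, which matches the abstract's "uncertainty compensator" reading, whereas the serial policy receives the model-based command as one more observation and must map it to the final action.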