Academic Journal Article

Safe Reinforcement Learning-Based Motion Planning for Functional Mobile Robots Suffering Uncontrollable Mobile Robots
Document Type
Periodical
Source
IEEE Transactions on Intelligent Transportation Systems (IEEE Trans. Intell. Transport. Syst.). 25(5):4346-4363, May 2024
Subject
Transportation
Aerospace
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Safety
Planning
Mobile robots
Robot kinematics
Task analysis
Reinforcement learning
Collision avoidance
Autonomous mobile robots
control barrier functions
safe reinforcement learning
safe motion planning
uncontrollable robots
Language
ISSN
1524-9050
1558-0016
Abstract
An increasing number of Autonomous Mobile Robots (AMRs) have been deployed in warehouses and factories in recent years. The risk that some of these AMRs go out of control is rising sharply. Although Reinforcement Learning (RL)-based approaches have achieved dramatic success in motion planning for large numbers of AMRs, the available RL-based motion planning approaches cannot provide a safety guarantee for the remaining functional AMRs when some AMRs are out of control. To this end, this paper develops a scalable algorithm that combines Multi-agent RL (MARL) with Control Barrier Function (CBF)-based shields. The algorithm addresses complex high-level tasks through MARL and handles the safety of each functional AMR through a low-level CBF-based shield. A CBF-based shield is designed for each functional AMR to ensure that its action is safe, even if an uncontrollable AMR is pursuing it. Experiments are conducted in simulated warehouse environments to evaluate the effectiveness and scalability of a safe RL-based motion planning approach developed according to the MARL with CBF-based shields algorithm. (The approach is demonstrated in a video: https://youtu.be/I7ja5nFVpY4)
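The general idea of a low-level CBF-based shield can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes a 2-D single-integrator robot (the action is a velocity), a single uncontrollable robot whose position and velocity are known, no actuator limits, and the distance barrier h = ||p - q||^2 - d_safe^2. The function name `cbf_shield` and the parameters `d_safe` and `alpha` are hypothetical. The shield leaves the nominal (for example, RL-proposed) action untouched when it satisfies the CBF condition dh/dt + alpha * h >= 0, and otherwise applies the smallest correction that restores it.

```python
def cbf_shield(p, q, v_obs, u_nom, d_safe=1.0, alpha=1.0):
    """Return the velocity closest to u_nom that satisfies a distance CBF.

    Illustrative assumptions (not taken from the paper):
      p, q   : (x, y) positions of the functional and uncontrollable robot
      v_obs  : (vx, vy) velocity of the uncontrollable robot, assumed known
      u_nom  : (vx, vy) nominal action, e.g. proposed by an RL policy
      d_safe : required separation distance
      alpha  : gain of the linear class-K function in the CBF condition
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    h = dx * dx + dy * dy - d_safe ** 2          # barrier value, h >= 0 is safe

    # dh/dt = 2 (p - q) . (u - v_obs); the CBF condition dh/dt + alpha*h >= 0
    # is the half-space constraint a . u >= b on the action u.
    a = (2.0 * dx, 2.0 * dy)
    b = a[0] * v_obs[0] + a[1] * v_obs[1] - alpha * h

    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        return u_nom                              # nominal action is already safe

    norm_sq = a[0] ** 2 + a[1] ** 2
    if norm_sq == 0.0:
        return u_nom                              # degenerate case: robots coincide

    # Closed-form projection of u_nom onto the half-space a . u >= b.
    scale = -slack / norm_sq
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


if __name__ == "__main__":
    # Nominal action drives straight at a stationary robot: the shield slows it.
    print(cbf_shield((2.0, 0.0), (0.0, 0.0), (0.0, 0.0), (-5.0, 0.0)))
    # Uncontrollable robot approaches a stationary one: the shield moves it away.
    print(cbf_shield((2.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0)))
```

With one linear constraint the minimal correction has this closed form. With several uncontrollable robots or with actuator limits, the same condition would typically be enforced by a small quadratic program instead.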