Journal Article

Learning Adaptive Control of a UUV Using a Bio-Inspired Experience Replay Mechanism
Document Type
Periodical
Source
IEEE Access, vol. 11, pp. 123505-123518, 2023
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Task analysis
Process control
Training
Current measurement
Adaptive control
Adaptation models
Vehicle dynamics
Deep learning
Reinforcement learning
Autonomous underwater vehicles
Deep reinforcement learning
machine learning
adaptive control
underwater robotics
Language
English
ISSN
2169-3536
Abstract
Deep Reinforcement Learning (DRL) methods are increasingly being applied to Unmanned Underwater Vehicles (UUVs), providing adaptive control responses to environmental disturbances. However, on physical platforms, these methods are hindered by their inherent data inefficiency and by performance degradation under unforeseen process variations. This is particularly pronounced in UUV manoeuvring tasks, where process observability is limited by the complex dynamics of the environment in which these vehicles operate. To overcome these limitations, this paper proposes a novel Biologically-Inspired Experience Replay (BIER) method, which maintains two memory buffers: one that stores incomplete (but recent) trajectories of state-action pairs, and another that emphasises positive rewards. The ability of BIER to generalise was assessed by training neural network controllers on tasks from the Gym-based MuJoCo continuous control benchmark, namely inverted pendulum stabilisation, hopping, walking, and half-cheetah running. BIER was then combined with the Soft Actor-Critic (SAC) method on UUV manoeuvring tasks to stabilise the vehicle at a given velocity and pose under unknown environment dynamics. The proposed method was evaluated through simulated scenarios of progressively increasing complexity in a ROS-based UUV Simulator, varying the target velocity values and the intensity of current disturbances. The results showed that BIER outperformed standard Experience Replay (ER) methods, achieving optimal performance twice as fast as the latter in the assumed UUV domain.
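The abstract describes BIER as maintaining two replay buffers: one holding recent, possibly incomplete trajectories, and one emphasising positively rewarded experience. The Python sketch below is a minimal illustration of such a dual-buffer design, not the authors' implementation; the class name, the reward_threshold parameter, and the 50/50 sampling split are all assumptions made for the example.

```python
import random
from collections import deque


class BIERStyleReplayBuffer:
    """Illustrative dual-buffer replay, loosely following the abstract's
    description of BIER: a 'recent' buffer of possibly incomplete
    trajectories plus a buffer that emphasises positive rewards.
    Parameter names and the half-and-half sampling split are assumptions."""

    def __init__(self, capacity=100_000, reward_threshold=0.0):
        self.recent = deque(maxlen=capacity)    # recent, possibly incomplete trajectories
        self.positive = deque(maxlen=capacity)  # transitions with favourable rewards
        self.reward_threshold = reward_threshold

    def add(self, state, action, reward, next_state, done):
        transition = (state, action, reward, next_state, done)
        self.recent.append(transition)
        if reward > self.reward_threshold:      # emphasise positive experiences
            self.positive.append(transition)

    def sample(self, batch_size):
        # Draw half the batch from the positive buffer when possible,
        # filling the remainder from the recent buffer.
        n_pos = min(batch_size // 2, len(self.positive))
        batch = random.sample(self.positive, n_pos)
        batch += random.sample(self.recent,
                               min(batch_size - n_pos, len(self.recent)))
        return batch
```

In an SAC training loop, minibatches drawn from such a buffer would simply replace the uniform samples of a standard ER buffer; the abstract attributes BIER's reported two-fold speed-up over standard ER in the UUV tasks to this kind of biased sampling.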