Academic Article

Bidirectional Progressive Neural Networks With Episodic Return Progress for Emergent Task Sequencing and Robotic Skill Transfer
Document Type
Periodical
Source
IEEE Access, 12:69690-69699, 2024
Subject
Aerospace
Bioengineering
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Engineered Materials, Dielectrics and Plasmas
Engineering Profession
Fields, Waves and Electromagnetics
General Topics for Engineers
Geoscience
Nuclear Engineering
Photonics and Electrooptics
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Task analysis
Robots
Multitasking
Training
Switches
Reinforcement learning
Sequential analysis
Robot learning
Knowledge transfer
Deep learning
Transfer learning
Multi-task learning
reinforcement learning
robot learning
intrinsic motivation
knowledge transfer
deep learning
cognitive robotics
transfer learning
Language
ISSN
2169-3536
Abstract
The human brain and human behavior are a rich source of inspiration for novel control and learning methods in robotics. To exemplify such a development, and drawing on how humans acquire knowledge and transfer skills among tasks, we introduce a novel multi-task reinforcement learning framework named Episodic Return Progress with Bidirectional Progressive Neural Networks (ERP-BPNN). The proposed ERP-BPNN model (1) learns in a human-like interleaved manner through (2) autonomous task switching based on a novel intrinsic motivation signal and, in contrast to existing methods, (3) allows bidirectional skill transfer among tasks. ERP-BPNN is a general architecture applicable to several multi-task learning settings; in this paper, we present the details of its neural architecture and show that it enables effective learning and skill transfer among morphologically different robots in a reaching task. The Bidirectional Progressive Neural Network (BPNN) architecture enables bidirectional skill transfer without requiring incremental training and integrates seamlessly with online task arbitration. The task arbitration mechanism is based on soft Episodic Return Progress (ERP), a novel intrinsic motivation (IM) signal. To evaluate our method, we use quantifiable robotics metrics such as ‘expected distance to goal’ and ‘path straightness’ in addition to episodic return, the usual reward-based measure in reinforcement learning. In simulation experiments with morphologically different robots, ERP-BPNN achieves faster cumulative convergence than the baselines and improves performance in all metrics considered. Overall, our method provides a human-inspired and efficient multi-task reinforcement learning approach with interleaved learning, making it highly suitable for lifelong learning applications.
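The abstract does not give the formula behind soft ERP-based task arbitration, so the following is only a rough sketch of one way such a mechanism could look, not the authors' method. It assumes that "return progress" is the difference between the mean of the more recent and the mean of the older episodic returns in a sliding window, and that "soft" means sampling the next task from a softmax over those progress values. The class name `ReturnProgressArbiter` and the parameters `window`, `temperature`, and `seed` are hypothetical.

```python
import math
import random
from collections import deque


class ReturnProgressArbiter:
    """Hypothetical task selector: softmax over per-task return progress.

    Assumption (not from the paper): progress is the mean of the newer half
    of a task's recent episodic returns minus the mean of the older half.
    """

    def __init__(self, n_tasks, window=10, temperature=1.0, seed=0):
        self.temperature = temperature
        self.rng = random.Random(seed)
        # Keep at most the last 2 * window episodic returns per task.
        self.returns = [deque(maxlen=2 * window) for _ in range(n_tasks)]

    def record(self, task, episodic_return):
        """Store the return of a finished episode for the given task."""
        self.returns[task].append(episodic_return)

    def progress(self, task):
        """Signed change in mean return between the older and newer half."""
        hist = list(self.returns[task])
        if len(hist) < 2:
            return 0.0  # not enough data yet; treat as no measurable progress
        half = len(hist) // 2
        older, recent = hist[:half], hist[half:]
        return sum(recent) / len(recent) - sum(older) / len(older)

    def probabilities(self):
        """Softmax over progress values (shifted by the max for stability)."""
        scores = [self.progress(t) / self.temperature
                  for t in range(len(self.returns))]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def select(self):
        """Sample the next task to train on from the softmax distribution."""
        probs = self.probabilities()
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Usage: task 0 is still improving, task 1 has plateaued.
arbiter = ReturnProgressArbiter(n_tasks=2)
for r in [0.0, 0.0, 1.0, 1.0]:
    arbiter.record(0, r)
for r in [1.0, 1.0, 1.0, 1.0]:
    arbiter.record(1, r)
probs = arbiter.probabilities()
next_task = arbiter.select()
```

Under these assumptions the improving task receives the larger selection probability, while the plateaued task is still sampled occasionally, which gives an interleaved rather than strictly sequential training schedule. A lower `temperature` makes the choice greedier; a higher one makes it closer to uniform.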