Academic Paper

Neural reinforcement learning to swing-up and balance a real pole
Document Type
Conference
Source
2005 IEEE International Conference on Systems, Man and Cybernetics. 4:3191-3196 Vol. 4 2005
Subject
Robotics and Control Systems
Computing and Processing
Components, Circuits, Devices and Systems
Learning
Multilayer perceptrons
Neural networks
Stochastic systems
Multi-layer neural network
State-space methods
Acceleration
Regression tree analysis
Stress
Algorithm design and analysis
Language
English
ISSN
1062-922X
Abstract
This paper proposes a neural-network-based reinforcement learning controller that learns control policies in a highly data-efficient manner. This makes it possible to apply reinforcement learning directly to real plants: neither a transition model nor a simulation model of the plant is needed for training. The only training information provided to the controller is the set of transition experiences collected from interactions with the real plant. Because these transition experiences are stored explicitly, they can be reused to update the neural Q-function in every training step, which results in a stable learning process for the neural Q-value function. The algorithm is applied to the highly nonlinear and noisy task of swinging up and balancing a real inverted pendulum. Less than 14 minutes of real-time interaction were needed to learn a highly effective policy from scratch.
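The idea of storing every transition and reusing the whole set in each training step can be illustrated with a minimal sketch. It is not the paper's implementation: the toy two-dimensional plant, the reward convention (maximizing a reward), the discount factor, the network size, and the names `TinyQNet`, `q_input`, and `fitted_q_step` are all assumptions chosen for illustration.

```python
import numpy as np

class TinyQNet:
    """Hypothetical one-hidden-layer MLP mapping (state, action) to a scalar Q-value."""
    def __init__(self, in_dim, hidden=16, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.5, (hidden, 1))
        self.b2 = np.zeros(1)
        self.lr = lr

    def predict(self, X):
        H = np.tanh(X @ self.W1 + self.b1)
        return (H @ self.W2 + self.b2).ravel()

    def fit(self, X, y, epochs=50):
        # Plain batch gradient descent on half the mean squared error.
        n = len(X)
        for _ in range(epochs):
            H = np.tanh(X @ self.W1 + self.b1)
            err = (H @ self.W2 + self.b2).ravel() - y
            gW2 = H.T @ err[:, None] / n
            gb2 = err.mean(keepdims=True)
            dH = (err[:, None] @ self.W2.T) * (1.0 - H ** 2)
            self.W1 -= self.lr * (X.T @ dH / n)
            self.b1 -= self.lr * dH.mean(axis=0)
            self.W2 -= self.lr * gW2
            self.b2 -= self.lr * gb2

def q_input(states, actions):
    # Network input is the state concatenated with the action.
    return np.hstack([states, np.asarray(actions, dtype=float).reshape(-1, 1)])

def fitted_q_step(net, transitions, actions, gamma=0.95):
    """One training step: rebuild targets from ALL stored transitions, then refit."""
    s = np.array([t[0] for t in transitions])
    a = np.array([t[1] for t in transitions])
    r = np.array([t[2] for t in transitions])
    s2 = np.array([t[3] for t in transitions])
    # Best next-state value over the discrete action set.
    q_next = np.max(
        [net.predict(q_input(s2, np.full(len(s2), u))) for u in actions], axis=0
    )
    targets = r + gamma * q_next
    net.fit(q_input(s, a), targets)
    return targets

# Collect transitions (s, a, r, s') from a made-up plant; no model is given to the learner.
rng = np.random.default_rng(1)
actions = [-1.0, 1.0]
transitions = []
for _ in range(200):
    s = rng.uniform(-1, 1, size=2)
    a = actions[rng.integers(len(actions))]
    s2 = np.clip(s + 0.1 * np.array([s[1], a]), -1, 1)
    transitions.append((s, a, -abs(s2[0]), s2))

net = TinyQNet(in_dim=3)
for _ in range(10):  # every step reconsiders the full stored set
    targets = fitted_q_step(net, transitions, actions)
```

In this sketch the learner only ever sees the stored tuples, so the same limited batch of plant interaction is reused across all training steps, which is the data-efficiency argument the abstract makes.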