Academic Paper

Estimating Model Utility for Deformable Object Manipulation Using Multiarmed Bandit Methods
Document Type
Periodical
Source
IEEE Transactions on Automation Science and Engineering (IEEE Trans. Automat. Sci. Eng.), 15(3):967-979, Jul. 2018
Subject
Robotics and Control Systems
Power, Energy and Industry Applications
Components, Circuits, Devices and Systems
Task analysis
Deformable models
Robots
Data models
Computational modeling
Training
Jacobian matrices
Deformable objects
manipulation
robot learning
Language
ISSN
1545-5955
1558-3783
Abstract
We present a novel approach to deformable object manipulation that does not rely on highly accurate modeling. The key contribution of this paper is to formulate the task as a multiarmed bandit problem, with each arm representing a model of the deformable object. To “pull” an arm and evaluate its utility, we use the arm’s model to generate a velocity command for the gripper(s) holding the object and execute it. As the task proceeds and the object deforms, the utility of each model can change. Our framework estimates these changes and balances exploration of the model set with exploitation of high-utility models. We also propose an approach based on Kalman filtering for nonstationary multiarmed normal bandits that leverages the coupling between the models to learn more from each arm pull. We demonstrate that our method outperforms previous methods on synthetic trials and performs competitively on several manipulation tasks in simulation.

Note to Practitioners: This paper is motivated by the problem of how to choose appropriate parameters for models used in deformable object manipulation. Existing approaches use time-consuming data collection and analysis methods to generate models for a given object and/or task. In contrast, our approach can be applied to multiple deformable objects or tasks without this time-consuming model generation step, learning which models are useful during the execution of the task itself. This enables faster development of systems for a wide range of tasks, including those where a priori data collection is not possible. In this paper, we consider only the modeling and control aspects of deformable object manipulation. These ideas can then be integrated with sensing and planning systems for commercial/industrial applications. We briefly discuss some methods and considerations for such an integration at the end of this paper.
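The bandit formulation described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes independent arms (so it ignores the coupling between models that the paper exploits), a scalar random-walk model for each arm's utility, and Thompson sampling for arm selection. The names `KalmanArm`, `select_arm`, and `run_bandit`, and the parameters `process_var` and `obs_var`, are hypothetical, and the reward here is simulated Gaussian noise rather than the outcome of an executed gripper command.

```python
import random


class KalmanArm:
    """Scalar Kalman filter tracking one arm's (possibly drifting) utility."""

    def __init__(self, mean=0.0, var=1.0):
        self.mean = mean  # current estimate of the arm's utility
        self.var = var    # uncertainty of that estimate

    def predict(self, process_var):
        # Random-walk assumption: utility may drift, so uncertainty grows.
        self.var += process_var

    def update(self, reward, obs_var):
        # Standard scalar Kalman measurement update.
        gain = self.var / (self.var + obs_var)
        self.mean += gain * (reward - self.mean)
        self.var *= 1.0 - gain

    def sample(self, rng):
        # Draw a plausible utility from the current Gaussian belief.
        return rng.gauss(self.mean, self.var ** 0.5)


def select_arm(arms, rng):
    """Thompson sampling: pick the arm whose sampled utility is largest."""
    samples = [arm.sample(rng) for arm in arms]
    return max(range(len(arms)), key=lambda i: samples[i])


def run_bandit(true_utils, steps, process_var=0.01, obs_var=0.25, seed=0):
    """Simulate the loop with fixed, hypothetical true utilities per arm."""
    rng = random.Random(seed)
    arms = [KalmanArm() for _ in true_utils]
    pulls = [0] * len(arms)
    for _ in range(steps):
        for arm in arms:
            arm.predict(process_var)   # every arm's belief ages each step
        i = select_arm(arms, rng)      # explore/exploit via sampling
        # Simulated noisy reward standing in for executing model i's command.
        reward = rng.gauss(true_utils[i], obs_var ** 0.5)
        arms[i].update(reward, obs_var)
        pulls[i] += 1
    return arms, pulls


if __name__ == "__main__":
    arms, pulls = run_bandit([0.0, 5.0], steps=200)
    print("pull counts:", pulls)
    print("estimated utilities:", [round(a.mean, 2) for a in arms])
```

Because the prediction step inflates the variance of arms that have not been pulled recently, a neglected arm is eventually sampled again. This is one simple way to keep tracking utilities that change as the task proceeds.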