Academic Journal Article

Programmatic Imitation Learning From Unlabeled and Noisy Demonstrations
Document Type
Periodical
Source
IEEE Robotics and Automation Letters, 9(6):4894-4901, Jun. 2024
Subject
Robotics and Control Systems
Computing and Processing
Components, Circuits, Devices and Systems
Task analysis
Trajectory
Probabilistic logic
Noise measurement
Training data
Noise
Approximation algorithms
Computer science
formal languages
machine learning
programming
representation learning
robot learning
robot programming
Language
English
ISSN
2377-3766
2377-3774
Abstract
Imitation Learning (IL) is a promising paradigm for teaching robots to perform novel tasks using demonstrations. Most existing approaches for IL utilize neural networks (NNs); however, these methods suffer from several well-known limitations: they 1) require large amounts of training data, 2) are hard to interpret, and 3) are hard to refine and adapt. There is an emerging interest in Programmatic Imitation Learning (PIL), which offers significant promise in addressing the above limitations. In PIL, the learned policy is represented in a programming language, making it amenable to interpretation and adaptation to novel settings. However, state-of-the-art PIL algorithms assume access to action labels and struggle to learn from noisy real-world demonstrations. In this paper, we propose Plunder, a novel PIL algorithm that addresses these shortcomings by synthesizing probabilistic programmatic policies that are particularly well-suited for modeling the uncertainties inherent in real-world demonstrations. Our approach leverages an EM loop to simultaneously infer the missing action labels and the most likely probabilistic policy. We benchmark Plunder against several established IL techniques, and demonstrate its superiority across five challenging imitation learning tasks under noise. Plunder policies outperform the next-best baseline by 19% and 17% in matching the given demonstrations and successfully completing the tasks, respectively.
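The abstract describes an EM loop that alternates between inferring the missing action labels (E-step) and fitting the most likely probabilistic policy (M-step). The Python sketch below illustrates that general structure on a toy 1-D problem. It is not the authors' Plunder implementation: the function names, the logistic policy form, the toy dynamics, and the noise model are all illustrative assumptions.

```python
# Minimal EM sketch for learning a probabilistic policy from unlabeled,
# noisy demonstrations. Illustrative only; not the Plunder algorithm.
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: 1-D velocity, hidden actions ACC(+1) / DEC(-1) (assumed).
ACTIONS = np.array([+1.0, -1.0])
DT, OBS_NOISE = 0.1, 0.05            # assumed known dynamics step and sensor noise

def step(v, a):
    """Assumed known dynamics: next velocity under action a."""
    return v + a * DT

def make_demo(T=200, v_target=1.0):
    """Generate a noisy demonstration from a hidden threshold 'expert'."""
    v, vs, acts = 0.0, [], []
    for _ in range(T):
        a = ACTIONS[0] if v < v_target else ACTIONS[1]   # hidden expert rule
        vs.append(v + rng.normal(0, OBS_NOISE))          # noisy observation
        acts.append(a)
        v = step(v, a)
    return np.array(vs), np.array(acts)

obs, true_actions = make_demo()

def policy_prob_acc(v, theta, w=20.0):
    """Probabilistic policy: P(a = ACC | v) = sigmoid(w * (theta - v))."""
    return 1.0 / (1.0 + np.exp(-w * (theta - v)))

def infer_labels(obs, theta):
    """E-step: posterior responsibility of ACC at each timestep."""
    v_t, v_next = obs[:-1], obs[1:]
    prior_acc = policy_prob_acc(v_t, theta)
    # Gaussian likelihood of the observed transition under each action.
    lik_acc = np.exp(-0.5 * ((v_next - step(v_t, ACTIONS[0])) / OBS_NOISE) ** 2)
    lik_dec = np.exp(-0.5 * ((v_next - step(v_t, ACTIONS[1])) / OBS_NOISE) ** 2)
    post_acc = prior_acc * lik_acc
    post_dec = (1.0 - prior_acc) * lik_dec
    return post_acc / (post_acc + post_dec + 1e-12)

def fit_policy(obs, resp_acc):
    """M-step: refit the policy threshold by weighted maximum likelihood."""
    candidates = np.linspace(obs.min(), obs.max(), 200)  # crude 1-D search
    def nll(theta):
        p = np.clip(policy_prob_acc(obs[:-1], theta), 1e-9, 1 - 1e-9)
        return -np.sum(resp_acc * np.log(p) + (1 - resp_acc) * np.log(1 - p))
    return candidates[np.argmin([nll(t) for t in candidates])]

theta = 0.2                                              # poor initial guess
for _ in range(10):
    resp = infer_labels(obs, theta)                      # E-step: label posteriors
    theta = fit_policy(obs, resp)                        # M-step: policy fitting

labels = np.where(infer_labels(obs, theta) > 0.5, ACTIONS[0], ACTIONS[1])
acc = np.mean(labels == true_actions[:-1])
print(f"recovered threshold ~ {theta:.2f}, label accuracy = {acc:.2%}")
```

In this sketch the M-step fits a single threshold parameter; the paper instead synthesizes programmatic policies, but the alternation between label inference and policy estimation follows the same EM pattern.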