Academic Paper

ParPER: A Partitioned Prioritized Experience Replay in Multi-Agent Setting of Reinforcement Learning
Document Type
Conference
Source
2023 5th International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 932-938, Jan. 2023
Subject
Communication, Networking and Broadcast Technologies
Components, Circuits, Devices and Systems
Computing and Processing
Training
Reinforcement learning
Stability analysis
Multi-Agent Reinforcement Learning (MARL)
Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
Prioritized Experience Replay (PER)
Partitioned PER (ParPER)
ISSN
2832-3017
Abstract
Multi-Agent Reinforcement Learning (MARL) is nowadays used to model numerous real-life applications, particularly scenarios in which only very limited information about the environment is available. Off-policy reinforcement learning, including off-policy MARL, stores the experiences (transitions) of every agent in a replay buffer and later samples from it for training. Prioritized Experience Replay (PER) samples transitions according to the magnitude of their training error: the larger the error, the higher the probability that the transition is sampled for the next update. Though effective, this method has a drawback: a transition with a low error might never be sampled, even though sampling that particular transition may be important for the model to learn an optimal policy. We propose Partitioned PER (ParPER), which partitions the memory into fixed-length segments and samples an equal number of transitions from each segment independently. We test the proposed method on the Multi-Agent Particle Environment (MPE), and the results are promising.
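The partitioned sampling scheme the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; the class name, the per-partition prioritization by stored priority, and all parameter names are assumptions for the example.

```python
import random


class PartitionedReplayBuffer:
    """Sketch of a ParPER-style buffer: the memory is split into
    fixed-length partitions, and each sampled batch draws an equal
    number of transitions from every partition independently, so
    low-error transitions in one region cannot be starved by
    high-error transitions elsewhere. (Illustrative only.)"""

    def __init__(self, capacity, num_partitions):
        assert capacity % num_partitions == 0
        self.capacity = capacity
        self.num_partitions = num_partitions
        self.part_len = capacity // num_partitions
        self.storage = []   # list of (transition, priority) pairs
        self.pos = 0        # next write index (circular overwrite)

    def add(self, transition, priority=1.0):
        # Store the transition together with its priority (e.g. |TD error|).
        if len(self.storage) < self.capacity:
            self.storage.append((transition, priority))
        else:
            self.storage[self.pos] = (transition, priority)
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        """Draw batch_size transitions, split equally across partitions,
        prioritized *within* each partition only."""
        per_part = batch_size // self.num_partitions
        batch = []
        for p in range(self.num_partitions):
            start = p * self.part_len
            part = self.storage[start:start + self.part_len]
            if not part:
                continue  # partition not yet filled
            total = sum(pri for _, pri in part)
            weights = [pri / total for _, pri in part]
            picks = random.choices(part, weights=weights,
                                   k=min(per_part, len(part)))
            batch.extend(t for t, _ in picks)
        return batch
```

With two partitions and a batch of four, two transitions come from the first half of the memory and two from the second half, regardless of how the priorities are distributed across the halves.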