Journal Article

GCEN: Multiagent Deep Reinforcement Learning With Grouped Cognitive Feature Representation
Document Type
Periodical
Source
IEEE Transactions on Cognitive and Developmental Systems, 16(2):458-473, Apr. 2024
Subject
Computing and Processing
Signal Processing and Analysis
Training
Mutual information
Feature extraction
Observability
Deep learning
Markov processes
Representation learning
Attention mechanism
cognitive feature extraction
cooperative multiagent reinforcement learning (RL)
deep learning
Language
ISSN
2379-8920
2379-8939
Abstract
In recent years, cooperative multiagent deep reinforcement learning (MADRL) has received increasing research interest and has been widely applied to domains such as computer games and coordinated multirobot systems. However, it remains challenging to achieve high solution quality and learning efficiency for MADRL under incomplete and noisy observations. To this end, this article proposes an MADRL approach with grouped cognitive feature representation (GCEN), following the paradigm of centralized training and decentralized execution (CTDE). Different from previous works, GCEN incorporates a new cognitive feature representation that combines a grouped attention mechanism with a training approach based on mutual information (MI). The grouped attention mechanism selectively extracts entity features within each agent's observation field while avoiding the influence of irrelevant observations. The MI regularization term guides the agents to learn grouped cognitive features based on global information, aiming to mitigate the influence of partial observations. The proposed GCEN approach can be extended as a feature representation module for different MADRL methods. Extensive experiments on the challenging level-based foraging and StarCraft II micromanagement benchmarks illustrate the effectiveness and advantages of the proposed approach. Compared with seven representative MADRL algorithms, the proposed approach achieves state-of-the-art performance in winning rates and training efficiency. Experimental results further demonstrate that GCEN has improved generalization ability across varying sight ranges.
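The abstract's grouped attention mechanism, which attends to entity features inside an agent's observation field while suppressing irrelevant (out-of-range) entities, can be illustrated with a minimal masked-attention sketch. This is not the paper's implementation; all function and variable names here (`masked_entity_attention`, `visible_mask`, etc.) are illustrative assumptions, and the sketch uses plain NumPy dot-product attention rather than the learned, grouped projections described in the paper.

```python
import numpy as np

def masked_entity_attention(query, entity_feats, visible_mask):
    """Attend over entity features, zeroing out entities outside the sight range.

    query: (d,) agent embedding; entity_feats: (n, d) one row per entity;
    visible_mask: (n,) bool, True for entities inside the observation field.
    Returns the attended context vector and the attention weights.
    """
    d = query.shape[0]
    # Scaled dot-product scores between the agent and each entity.
    scores = entity_feats @ query / np.sqrt(d)
    # Mask out-of-range entities so they receive zero attention weight.
    scores = np.where(visible_mask, scores, -np.inf)
    weights = np.exp(scores - scores[visible_mask].max())
    weights = np.where(visible_mask, weights, 0.0)
    weights = weights / weights.sum()
    return weights @ entity_feats, weights
```

In the full approach, such per-agent attention outputs would feed the grouped cognitive representation, with the MI regularizer aligning them with global information during centralized training.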