Academic Paper

Multi-Agent Reinforcement Learning in Dynamic Industrial Context
Document Type
Conference
Source
2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 448-457, Jun. 2023
Subject
Computing and Processing
Engineering Profession
General Topics for Engineers
Training
Adaptation models
Heuristic algorithms
Mission critical systems
Reinforcement learning
Quality of service
Network architecture
Reinforcement Learning
Multi-Agent
Machine Learning
Software Engineering
Emergency Communication Network
Language
English
Abstract
Deep reinforcement learning has advanced significantly in recent years and is now applied in embedded systems in addition to simulators and games. Reinforcement Learning (RL) algorithms are currently used to enhance device operation so that devices can learn on their own and offer clients better services, and RL has recently been studied in a variety of industrial applications. However, reinforcement learning has been shown to be unstable and unable to adapt to realistic situations when deployed in real-world settings, especially when controlling a large number of agents in an industrial environment. To address this problem, the goal of this study is to enable multiple reinforcement learning agents to independently learn control policies in dynamic industrial contexts. We propose a dynamic multi-agent reinforcement learning (dynamic multi-RL) method, together with adaptive exploration (AE) and vector-based action selection (VAS) techniques, to accelerate model convergence and adapt to a complex industrial environment. The proposed algorithm is validated in emergency scenarios in the telecommunications industry, in which three unmanned aerial vehicle base stations (UAV-BSs) provide temporary coverage to mission-critical (MC) customers in disaster zones after the original serving base station (BS) is destroyed by a natural disaster. The algorithm automatically directs the participating agents to enhance service quality. Our findings demonstrate that the proposed dynamic multi-RL algorithm can proficiently manage the learning of multiple agents and adjust to dynamic industrial environments, while improving learning speed and quality of service.
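To make the setting in the abstract concrete, the sketch below shows a generic multi-agent setup of the same flavor: three independent learners placing UAV base stations on a toy grid to cover mission-critical users, with a simple reward-driven exploration decay standing in for adaptive exploration. This is not the authors' dynamic multi-RL, AE, or VAS algorithm; the grid size, reward function, hyperparameters, and decay rule are all illustrative assumptions.

```python
# Minimal sketch (assumed toy setup, not the paper's implementation):
# three independent Q-learning agents position UAV-BSs on a 5x5 grid
# to minimize user-to-BS distance, with exploration decayed only when
# the episode return improves (a crude stand-in for adaptive exploration).
import numpy as np

GRID = 5                                               # toy deployment area
N_AGENTS = 3                                           # three UAV-BSs, as in the scenario
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]   # move or hover

rng = np.random.default_rng(0)
users = rng.integers(0, GRID, size=(12, 2))            # hypothetical MC user locations

def reward(positions):
    """Coverage proxy: negative mean distance from each user to its nearest UAV-BS."""
    d = np.linalg.norm(users[:, None, :] - np.array(positions)[None, :, :], axis=-1)
    return -d.min(axis=1).mean()

# One Q-table per agent over its own position (independent learners).
Q = [np.zeros((GRID, GRID, len(ACTIONS))) for _ in range(N_AGENTS)]
eps = [1.0] * N_AGENTS                                 # per-agent exploration rate
prev_return = -np.inf

for episode in range(300):
    pos = [(0, 0), (0, GRID - 1), (GRID - 1, 0)]       # reset UAV-BS positions
    ep_return = 0.0
    for step in range(20):
        # epsilon-greedy action selection for each agent
        acts = []
        for i, (x, y) in enumerate(pos):
            if rng.random() < eps[i]:
                acts.append(int(rng.integers(len(ACTIONS))))
            else:
                acts.append(int(np.argmax(Q[i][x, y])))
        # apply joint action, clipped to the grid
        new_pos = [(int(np.clip(pos[i][0] + ACTIONS[a][0], 0, GRID - 1)),
                    int(np.clip(pos[i][1] + ACTIONS[a][1], 0, GRID - 1)))
                   for i, a in enumerate(acts)]
        r = reward(new_pos)
        ep_return += r
        # shared-reward Q-learning update for each agent
        for i, a in enumerate(acts):
            x, y = pos[i]
            nx, ny = new_pos[i]
            Q[i][x, y, a] += 0.1 * (r + 0.9 * Q[i][nx, ny].max() - Q[i][x, y, a])
        pos = new_pos
    # assumed adaptive-exploration rule: decay epsilon only when returns improve,
    # so agents keep exploring while performance is stuck
    if ep_return > prev_return:
        eps = [max(0.05, e * 0.95) for e in eps]
    prev_return = max(prev_return, ep_return)

print("final exploration rates:", eps)
print("final coverage reward:", reward(pos))
```

The shared-reward, independent-learner structure shown here is only one of several common multi-agent designs; the paper's contribution lies in how exploration and action selection are adapted for dynamic industrial environments, which this toy loop does not reproduce.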