Academic Paper

Physics-Shielded Multi-Agent Deep Reinforcement Learning for Safe Active Voltage Control With Photovoltaic/Battery Energy Storage Systems
Document Type
Periodical
Source
IEEE Transactions on Smart Grid (IEEE Trans. Smart Grid). 14(4):2656-2667, Jul. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Power, Energy and Industry Applications
Automatic voltage control
Reactive power
Safety
Training
Distribution networks
Fluctuations
Voltage fluctuations
Active voltage control
distribution systems
multi-agent deep reinforcement learning
shield
battery energy storage systems
Language
ISSN
1949-3053
1949-3061
Abstract
While many multi-agent deep reinforcement learning (MADRL) algorithms have been implemented for active voltage control (AVC) in power distribution systems, the safety of the electrical components involved in the operation of these algorithms is mostly ignored. In this work, a safe MADRL control scheme is proposed to regulate the reactive and active power of photovoltaics (PVs), alleviating power congestion and improving voltage quality by coordinating battery energy storage systems (BESSs) and static var compensators (SVCs). Uniquely, the learning algorithm designed in this paper can limit an agent's action when it approaches a dangerous state, ensuring the safety of BESSs during the training process. This is realized by developing a multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm with a physics-based shielding mechanism. Specifically, actions that lead to dangerous states, in which the state of charge (SoC) of a BESS is fully charged or fully drained, are replaced by the shielding mechanism with safe actions while system stability is maintained. Furthermore, each PV node in the power distribution network is treated as an agent in the MATD3 algorithm, based on the reactive and active power sensitivities to voltage, which is beneficial for improving scalability. Training, testing, and comparative results on the IEEE 33-bus and 141-bus systems with real-world data are provided to demonstrate the effectiveness and superiority of the proposed algorithm.
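The idea of replacing an unsafe battery action with a safe one can be illustrated with a minimal sketch. This is not the paper's shielding mechanism, whose details the abstract does not give; it is a simple one-step SoC projection under stated assumptions: a lossless battery, positive power meaning charging, and hypothetical SoC limits of 0.1 and 0.9. The names `shield_action` and `next_soc` and all parameters are illustrative.

```python
def shield_action(p_cmd_kw, soc, capacity_kwh, dt_h, soc_min=0.1, soc_max=0.9):
    """Return a battery power (kW) that keeps the next SoC within limits.

    Assumptions (not from the paper): lossless battery, positive power
    means charging, and soc_min/soc_max are hypothetical safety limits.
    """
    # Largest charging power that leaves SoC at or below soc_max after one step.
    p_hi = (soc_max - soc) * capacity_kwh / dt_h
    # Most negative (discharging) power that leaves SoC at or above soc_min.
    p_lo = (soc_min - soc) * capacity_kwh / dt_h
    # Clip the agent's proposed action into the safe interval [p_lo, p_hi].
    return min(max(p_cmd_kw, p_lo), p_hi)


def next_soc(soc, p_kw, capacity_kwh, dt_h):
    """Lossless SoC update for one time step of length dt_h hours."""
    return soc + p_kw * dt_h / capacity_kwh


if __name__ == "__main__":
    # The agent asks to charge at 20 kW while the battery is nearly full.
    safe_p = shield_action(p_cmd_kw=20.0, soc=0.85, capacity_kwh=100.0, dt_h=1.0)
    print(safe_p, next_soc(0.85, safe_p, 100.0, 1.0))
```

In this sketch an action that is already safe passes through unchanged, so the shield only intervenes near the SoC limits, which matches the behavior the abstract describes at a high level.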