Abstract
This paper proposes a human-in-the-loop deep reinforcement learning (HL-DRL)-based Volt/Var control (VVC) strategy to simultaneously reduce power losses, mitigate voltage violations, and compensate for voltage unbalance in three-phase unbalanced distribution networks. Instead of fully trusting the DRL actions produced by deep neural networks, a human intervention module is proposed to modify dangerous actions that violate operating constraints during offline training. This module relies on well-designed human guidance rules based on voltage-reactive power sensitivities, which regulate PV reactive power to sequentially resolve local voltage violations and unbalance, thereby yielding safe transitions. To learn the optimal control policy efficiently and safely from these training samples, a human-in-the-loop soft actor-critic (HL-SAC) solution method is then developed. Unlike the standard SAC algorithm, it incorporates an online switch mechanism between action exploration and human intervention. The actor network loss function is modified to include human guidance terms, which alleviates the inconsistency between the update directions of the actor and critic networks. A hybrid experience replay buffer containing both dangerous and safe transitions is also used to steer the learning process towards human actions. Comparative simulation results on a modified IEEE 123-bus unbalanced distribution system demonstrate the effectiveness and superiority of the proposed method for voltage control.
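The minimal sketch below illustrates the mechanism described in the abstract under simplifying assumptions: a rule-based intervention that overrides voltage-violating actions during data collection, a hybrid replay buffer that flags human-corrected transitions, and an actor loss augmented with a term pulling the policy towards human actions. The voltage model, network sizes, constants (`V_MIN`, `V_MAX`, `LAMBDA_HG`), and the critic stand-in are illustrative placeholders, not the paper's actual HL-SAC implementation.

```python
# Hedged sketch of the HL-SAC idea (all numbers and models are assumptions).
import random
import torch
import torch.nn as nn

V_MIN, V_MAX = 0.95, 1.05          # per-unit voltage limits (assumed)
LAMBDA_HG = 1.0                    # weight of the human-guidance loss term (assumed)

class Actor(nn.Module):
    """Tiny policy network mapping a grid state to PV reactive-power setpoints."""
    def __init__(self, state_dim=8, action_dim=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim), nn.Tanh())
    def forward(self, s):
        return self.net(s)

def human_guidance(state, action):
    """Placeholder for the sensitivity-based rules: if the predicted voltage
    leaves [V_MIN, V_MAX], return a corrected (safe) action, otherwise None."""
    v = 1.0 + 0.05 * float(action.mean())          # toy voltage response (assumed)
    if v < V_MIN or v > V_MAX:
        return torch.clamp(action - 0.5 * (v - 1.0), -1.0, 1.0)
    return None

actor = Actor()
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

# Hybrid replay buffer: each transition is tagged as human-corrected or not.
buffer = []   # entries: (state, executed_action, human_flag)

# --- data collection with the online intervention switch ---
for _ in range(256):
    s = torch.randn(8)
    a = actor(s).detach()
    a_safe = human_guidance(s, a)
    if a_safe is not None:                 # dangerous action: human module overrides it
        buffer.append((s, a_safe, True))
    else:                                  # safe exploration action is kept
        buffer.append((s, a, False))

# --- actor update with the added human-guidance term ---
batch = random.sample(buffer, 64)
states = torch.stack([b[0] for b in batch])
actions = torch.stack([b[1] for b in batch])
flags = torch.tensor([b[2] for b in batch], dtype=torch.float32)

pred = actor(states)
q_proxy = -(pred ** 2).sum(dim=1)          # stand-in for the critic value (assumed)
sac_loss = -q_proxy.mean()                 # standard policy-improvement term
hg_loss = (flags * ((pred - actions) ** 2).sum(dim=1)).mean()  # pull towards human actions
loss = sac_loss + LAMBDA_HG * hg_loss
opt.zero_grad(); loss.backward(); opt.step()
print(f"actor loss: {loss.item():.4f}")
```

In this reading, the guidance term is only active on transitions flagged as human-corrected, so the policy is nudged towards the rule-based safe actions without distorting updates on ordinary exploration samples.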
| Original language | English |
|---|---|
| Pages (from-to) | 2639-2651 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Smart Grid |
| Volume | 15 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 1 May 2024 |
Keywords
- human-in-the-loop
- safe deep reinforcement learning
- soft actor-critic
- unbalanced distribution networks
- Volt/Var control
ASJC Scopus subject areas
- General Computer Science