Hierarchical correlated Q-learning for multi-layer optimal generation command dispatch

T. Yu, X. S. Zhang, B. Zhou, Ka Wing Chan

Research output: Journal article publication > Journal article > Academic research > peer-review

29 Citations (Scopus)

Abstract

This paper presents a novel hierarchical correlated Q-learning (HCEQ) algorithm to solve the dynamic optimization of generation command dispatch (GCD) in automatic generation control (AGC). The GCD problem is to dynamically allocate the total AGC generation command from the central to each individual AGC generator. The proposed HCEQ is a multi-agent Q-learning algorithm based on the concept of the correlated equilibrium point: each AGC generator is assigned an agent that optimizes its regulation participation factor and coordinates its decision with the others to enhance overall GCD performance. To cope with the curse of dimensionality that arises in the GCD problem as the number of AGC plants increases, a multi-layer optimum GCD framework is developed in this paper. In this hierarchical framework, a multiobjective design and a time-varying coordination factor are formulated into the reward functions to improve the optimization efficiency and convergence of HCEQ. The proposed approach has been fully verified on the China southern power grid (CSG) model to demonstrate its superior performance and dynamic optimization capability in various power system scenarios.
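
As a rough illustration of the allocation step described in the abstract, the sketch below splits a total AGC command among generators in proportion to their regulation participation factors. It is not the HCEQ algorithm: in the paper the agents learn the factors, whereas here they are simply given. The function name `dispatch_command`, the megawatt unit, and the example values are assumptions made for illustration, and generator capacity and ramp-rate limits are ignored.

```python
from typing import List, Sequence


def dispatch_command(total_command_mw: float, factors: Sequence[float]) -> List[float]:
    """Split a total AGC command in proportion to participation factors.

    Hypothetical helper: the factors are taken as inputs here, whereas the
    paper's agents learn them. Capacity and ramp limits are not modelled.
    """
    if not factors:
        raise ValueError("at least one participation factor is required")
    if any(f < 0 for f in factors):
        raise ValueError("participation factors must be non-negative")
    total = sum(factors)
    if total <= 0:
        raise ValueError("participation factors must not all be zero")
    # Normalise so the individual commands always sum to the total command.
    return [total_command_mw * f / total for f in factors]


# Example values are made up: a 100 MW command shared by three generators.
shares = dispatch_command(100.0, [0.5, 0.25, 0.25])
print(shares)  # [50.0, 25.0, 25.0]
```

Normalising inside the function means the factors need not sum to one beforehand, which keeps the individual commands consistent with the total whatever scale the factors are expressed in.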
Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: International Journal of Electrical Power and Energy Systems
Volume: 78
Publication status: Published - 1 Jun 2016

Keywords

  • Automatic generation control
  • Control performance standards
  • Correlated equilibrium
  • Dynamic generation allocation
  • Hierarchical multi-agent reinforcement learning

ASJC Scopus subject areas

  • Energy Engineering and Power Technology
  • Electrical and Electronic Engineering
