A Pose-Aware Global Representation Network for Human Parsing

Yanghong Zhou, P. Y. Mok

Research output: Journal article publication › Journal article › Academic research › peer-review

1 Citation (Scopus)


Many recognition tasks, including image/video classification, segmentation and object detection, can be improved by integrating global information. Although global information may be better represented in some recognition tasks than in others, it is worth exploring how global information from related tasks can be used effectively to improve the performance of a target task. Pose estimation predicts the locations of human joints and thus provides global information about the human body. In this paper, we propose a pose-aware global representation network model (PAGRnet) that exploits global information from pose estimation to enhance feature learning in human parsing. In our PAGRnet model, a novel learning module with three integrated parts is used to learn global information. The first part generates a global joint representation, while the second part learns the relationship between the pixels and joints. By integrating the global joint representation with the pixel-joint relationship, the resulting pose-aware global representation is augmented for the parsing task. Our experimental results show competitive performance of our method on the LIP, Pascal-Person-Part and ATR datasets, with reduced computational cost in comparison to other global information fusion approaches. We also demonstrate the advantages of our feature fusion model over concatenation, pixel-wise and channel-wise relation models.
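
The abstract gives no equations for the three-part module, so the following is only a minimal sketch of one way such a fusion could work, not the authors' implementation. It assumes that the global joint representation is a heatmap-weighted average of pixel features, that the pixel-joint relationship is a softmax over dot-product similarities, and that the fused result is added back to the pixel features. The function name `pose_aware_fusion` and its input layout are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pose_aware_fusion(features, heatmaps):
    """Hypothetical pose-aware fusion (illustrative assumption only).

    features: N pixels x C channels (list of lists of floats).
    heatmaps: K joints x N pixels of raw joint scores.
    Returns N x C features augmented with a pose-aware global term.
    """
    n_pixels, n_channels = len(features), len(features[0])

    # Part 1 (assumed form): one global descriptor per joint, computed as
    # a heatmap-weighted average of all pixel features.
    joint_repr = []
    for scores in heatmaps:
        weights = softmax(scores)  # spatial attention over pixels
        joint_repr.append([
            sum(weights[p] * features[p][c] for p in range(n_pixels))
            for c in range(n_channels)
        ])

    fused = []
    for pixel in features:
        # Part 2 (assumed form): pixel-joint relationship as a softmax
        # over dot products between the pixel and each joint descriptor.
        sims = [sum(a * b for a, b in zip(pixel, joint)) for joint in joint_repr]
        relation = softmax(sims)

        # Part 3 (assumed form): mix joint descriptors by the relation
        # weights, then add the result to the original pixel feature.
        context = [
            sum(relation[k] * joint_repr[k][c] for k in range(len(joint_repr)))
            for c in range(n_channels)
        ]
        fused.append([f + g for f, g in zip(pixel, context)])
    return fused


# Toy input: 3 pixels with 2 channels, and 2 joints.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_heatmaps = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
toy_fused = pose_aware_fusion(toy_features, toy_heatmaps)
```

In this sketch the per-pixel cost scales with the number of joints K rather than the number of pixels N, which is one plausible reading of why a joint-based global representation could be cheaper than pixel-wise relation models; the abstract itself does not state the mechanism.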

Original language: English
Pages (from-to): 1710-1724
Number of pages: 15
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Issue number: 4
Publication status: Published - 1 Apr 2023


  • Global information fusion
  • human parsing
  • human pose estimation
  • pose-aware information fusion
  • semantic segmentation

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering

