Ergonomic posture recognition using 3D view-invariant features from single ordinary camera

Hong Zhang, Xuzhong Yan, Heng Li

Research output: Journal article › Academic research › peer-review

90 Citations (Scopus)


Manual construction tasks are physically demanding and often require prolonged awkward postures that can cause pain and injury. Person posture recognition (PPR) is essential in postural ergonomic hazard assessment. This paper proposes an ergonomic posture recognition method using 3D view-invariant features from a single 2D camera, which is non-intrusive and widely installed on construction sites. From the detected 2D skeletons, view-invariant relative 3D joint positions (R3DJP) and joint angles are extracted as classification features by a multi-stage convolutional neural network (CNN) architecture, so that the learned classifier is not sensitive to camera viewpoint. Three posture classifiers, for the arms, back, and legs, are trained so that all three body parts can be classified simultaneously in one video frame. The posture recognition accuracies for the three body parts are 98.6%, 99.5%, and 99.8%, respectively; in the generalization test they are 94.9%, 93.9%, and 94.6%, respectively. Both the classification accuracy and the generalization ability of the method outperform previous vision-based methods in construction. The proposed method enables reliable and accurate postural ergonomic assessment for improving construction workers' safety and health.
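The two feature types named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation (which lifts 2D skeletons to 3D with a multi-stage CNN before computing features); it only shows, under our own assumptions, what a relative 3D joint position and a joint angle look like once 3D joint coordinates are available. The function names, the root-joint normalization, and the toy arm coordinates are all illustrative.

```python
import numpy as np

def relative_3d_joint_positions(joints, root_index=0):
    """Express 3D joint positions relative to a chosen root joint and
    normalise by the skeleton's extent, so the feature does not depend
    on where the person stands relative to the camera (illustrative)."""
    rel = joints - joints[root_index]          # translate root to origin
    scale = np.linalg.norm(rel, axis=1).max()  # largest root-to-joint distance
    return rel / scale if scale > 0 else rel

def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by segments b->a and b->c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Toy example: shoulder, elbow, wrist of an arm bent at a right angle.
shoulder = np.array([0.0, 0.0, 0.0])
elbow    = np.array([0.0, -0.3, 0.0])
wrist    = np.array([0.3, -0.3, 0.0])
print(joint_angle(shoulder, elbow, wrist))  # 90.0
```

Because both features are computed from 3D geometry rather than raw image coordinates, they are unchanged under camera rotation and translation, which is the sense in which the paper's features are view-invariant.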
Original language: English
Pages (from-to): 1-10
Number of pages: 10
Journal: Automation in Construction
Publication status: Published - 1 Oct 2018


Keywords

  • 3D view-invariant
  • Construction worker
  • Convolutional neural network
  • Ergonomics
  • Joint angle
  • Person posture recognition
  • Relative 3D joint position
  • Safety and health

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Civil and Structural Engineering
  • Building and Construction

