TY - JOUR
T1 - Effects of dataset characteristics on the performance of fatigue detection for crane operators using hybrid deep neural networks
AU - Liu, Pengkun
AU - Chi, Hung Lin
AU - Li, Xiao
AU - Guo, Jingjing
N1 - Funding Information:
The authors would like to thank the Research Grants Council of Hong Kong for their Early Career Scheme funding support (PolyU 25221519). Parts of the research, including the interviews and experiments in which crane operators participated, were sponsored by the Start-up Research Project Scheme (No. P0001224) of the Hong Kong Polytechnic University. The authors also thank You-Pin Construction Co., Ltd. for recruiting licensed crane operators to participate in the experiments of this research.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/12
Y1 - 2021/12
AB - Operator fatigue caused by intensive workloads and long working hours is a significant constraint that leads to inefficient crane operations and an increased risk of safety incidents. It can potentially be prevented through early fatigue warnings that allow appropriate work shift arrangements. Many deep neural networks have recently been developed for detecting fatigue in vehicle drivers by training on facial image or video data from public driver datasets. However, these datasets are difficult to use directly for fatigue detection in crane operation scenarios because facial features and head movement patterns differ between crane operators and vehicle drivers. Furthermore, there is no representative public dataset containing facial information of crane operators in construction scenarios. Therefore, this study explores and analyses the features of multi-source datasets and the corresponding data acquisition methods that are suitable for crane operators' fatigue detection, and provides guidelines for collecting a crane operator dataset. Variations among public datasets, such as real or pretended facial expressions, the segment level of human-verified labelling, camera positions, acquisition scenarios, and illumination conditions, are analysed. A hybrid learning architecture combining convolutional neural networks (CNN) and long short-term memory (LSTM) is proposed for fatigue detection. To establish a unified evaluation criterion, the study relabels three public vehicle driver datasets, NTHU-DDD, UTA-RLDD, and YawnDD, with human-verified labels at the frame and minute segment levels, and trains the corresponding hybrid fatigue detection models. The average detection accuracies and losses are identified for the models trained on UTA-RLDD, NTHU-DDD, and YawnDD individually. The trained models are then used to evaluate the fatigue status in facial videos of licensed crane operators under simulated crane operation scenarios. The results suggest the factors that must be considered when establishing a large public fatigue dataset for crane operators.
KW - Construction safety
KW - Convolutional neural network (CNN)
KW - Fatigue detection
KW - Long short-term memory network (LSTM)
KW - Multi-sources datasets
KW - Tower crane operator
UR - http://www.scopus.com/inward/record.url?scp=85114986468&partnerID=8YFLogxK
U2 - 10.1016/j.autcon.2021.103901
DO - 10.1016/j.autcon.2021.103901
M3 - Journal article
AN - SCOPUS:85114986468
SN - 0926-5805
VL - 132
JO - Automation in Construction
JF - Automation in Construction
M1 - 103901
ER -