Operator fatigue caused by intensive workloads and long working hours is one of the significant constraints that lead to inefficient crane operations and safety issues. It can potentially be prevented through early fatigue warnings that inform appropriate work shift arrangements. Recently, many deep neural networks have been developed for fatigue detection in vehicle drivers, trained on facial image or video data of drivers from available datasets. However, these datasets are difficult to use directly for fatigue detection in crane operation scenarios because facial features and movement patterns differ between crane operators and vehicle drivers. Furthermore, there is no representative public dataset containing facial information of crane operators in construction scenarios. Therefore, this study explores and analyzes the features of available datasets and the corresponding data acquisition methods suitable for crane operators' fatigue detection, and provides guidelines for collecting a crane operator dataset for early warning system development. A hybrid learning architecture that combines convolutional neural networks (CNN) and long short-term memory (LSTM) is proposed for fatigue detection. To establish a unified evaluation criterion, three available vehicle driver datasets, NTHU-DDD, UTA-RLDD, and YawnDD, are relabeled with human-verified labels at the minute-segment level, and three hybrid fatigue detection models are trained separately. The three trained models are then used to evaluate fatigue status in facial videos of licensed crane operators under simulated crane operation scenarios. The results show that the average test losses are 0.78458, 0.32191, and 0.20294, respectively.
The model trained on the dataset with pretended (acted) facial fatigue features detects operators' status comparatively more accurately than the models trained on the datasets with subtle facial fatigue features. Further comparisons in terms of labeling interval, environment, and accuracy are discussed to inform future dataset collection.
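
To illustrate what labeling at the minute-segment level might look like in practice, the following minimal sketch groups per-frame labels into one-minute segments and proposes one label per segment. It is not the procedure used in this study: the function name `segment_labels`, the per-frame input format, and the majority-vote rule are all assumptions made for illustration. In a workflow like the one described above, a human annotator would still verify each proposed label.

```python
from collections import Counter


def segment_labels(frame_labels, fps, segment_seconds=60):
    """Propose one label per fixed-length segment by majority vote.

    Hypothetical helper: `frame_labels` is assumed to be a sequence of
    per-frame labels (e.g. 0 = alert, 1 = fatigued) and `fps` the video
    frame rate. Majority voting is an illustrative assumption, not the
    labeling rule of the study.
    """
    frames_per_segment = int(fps * segment_seconds)
    if frames_per_segment <= 0:
        raise ValueError("fps * segment_seconds must be at least one frame")

    proposed = []
    for start in range(0, len(frame_labels), frames_per_segment):
        # The final chunk may be shorter than a full segment; it is kept.
        chunk = frame_labels[start:start + frames_per_segment]
        label, _count = Counter(chunk).most_common(1)[0]
        proposed.append(label)
    return proposed


# Toy example at 1 frame per second: 125 frames give three segments.
toy_labels = [0] * 40 + [1] * 20 + [1] * 50 + [0] * 10 + [1] * 5
print(segment_labels(toy_labels, fps=1))
```

The labeling interval is exposed as `segment_seconds` so that different intervals can be compared, in line with the comparison of labeling intervals discussed for future dataset collection.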