Self-supervised multi-task learning framework for safety and health-oriented road environment surveillance based on connected vehicle visual perception

Shaocheng Jia, Wei Yao

Research output: Journal article publication › Journal article › Academic research › peer-review

Abstract

Cutting-edge connected vehicle (CV) technologies have drawn much attention in recent years. Real-time traffic data captured by a CV can be shared with other CVs and data centers, opening new possibilities for solving diverse transportation problems. CV trajectory data have been well studied and widely used. However, image data captured by onboard cameras in a connected environment, a fundamental data source, have not been sufficiently investigated, especially for safety- and health-oriented visual perception. This paper proposes a bidirectional process of image synthesis and decomposition (BPISD) approach and, building on it, a novel self-supervised multi-task learning framework that simultaneously estimates the depth map, atmospheric visibility, airlight, and PM2.5 mass concentration. The depth map and visibility are considered highly associated with traffic safety, while airlight and PM2.5 mass concentration are directly correlated with human health. Both the training and testing phases of the proposed system require only a single image as input. Owing to the innovative training pipeline, the depth estimation network can automatically handle various visibility conditions and overcome diverse inherent problems of current image-synthesis-based self-supervised depth estimation, thereby generating high-quality depth maps even in low-visibility situations, which in turn benefits accurate estimation of visibility, airlight, and PM2.5 mass concentration.
Extensive experiments on original and synthesized data from the KITTI dataset and on real-world data collected in Beijing demonstrate that the proposed method can (1) achieve self-supervised depth estimation performance comparable to that of other state-of-the-art methods when taking clear images as input; (2) predict vivid depth maps for images contaminated by various levels of haze, where a network trained with the previous framework fails; and (3) accurately estimate visibility, airlight, and PM2.5 mass concentration. Applications built on this work could contribute to high-precision, dynamic geoinformation reconstruction, transportation, meteorology, and smart cities.
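
As a rough illustration of how depth, visibility, and airlight can be tied together in a single hazy image, the sketch below uses the standard atmospheric scattering model, I = J·t + A·(1 − t) with t = exp(−β·d), together with the Koschmieder relation V = −ln(0.02)/β. This choice of model is an assumption made for illustration; the abstract does not state which haze model BPISD uses, and the function names, the 2% contrast threshold, and the grayscale nested-list image format are all hypothetical. PM2.5 estimation is not modeled here.

```python
import math

# Assumption: Koschmieder relation with a 2% contrast threshold,
# V = -ln(0.02) / beta (about 3.912 / beta). Not taken from the paper.
CONTRAST_THRESHOLD = 0.02


def extinction_from_visibility(visibility_m):
    """Return the extinction coefficient beta (1/m) for a visibility in metres."""
    return -math.log(CONTRAST_THRESHOLD) / visibility_m


def synthesize_haze(clear, depth, visibility_m, airlight):
    """Synthesis direction: clear image + depth map -> hazy image.

    clear: 2D list of intensities in [0, 1]; depth: 2D list of metres.
    """
    beta = extinction_from_visibility(visibility_m)
    hazy = []
    for clear_row, depth_row in zip(clear, depth):
        out_row = []
        for j, d in zip(clear_row, depth_row):
            t = math.exp(-beta * d)  # transmission along the line of sight
            out_row.append(j * t + airlight * (1.0 - t))
        hazy.append(out_row)
    return hazy


def decompose_haze(hazy, depth, visibility_m, airlight, t_min=1e-3):
    """Decomposition direction: hazy image + depth map -> clear image."""
    beta = extinction_from_visibility(visibility_m)
    clear = []
    for hazy_row, depth_row in zip(hazy, depth):
        out_row = []
        for i, d in zip(hazy_row, depth_row):
            # Clamp transmission to avoid dividing by values near zero.
            t = max(math.exp(-beta * d), t_min)
            out_row.append((i - airlight * (1.0 - t)) / t)
        clear.append(out_row)
    return clear


if __name__ == "__main__":
    clear = [[0.2, 0.8], [0.5, 0.1]]
    depth = [[5.0, 20.0], [50.0, 80.0]]
    hazy = synthesize_haze(clear, depth, visibility_m=200.0, airlight=0.9)
    restored = decompose_haze(hazy, depth, visibility_m=200.0, airlight=0.9)
    print(hazy)
    print(restored)
```

Under this model the two directions are inverses of each other wherever the transmission stays above `t_min`, so a round trip through synthesis and decomposition returns the original intensities up to floating-point error. Lower visibility pushes distant pixels toward the airlight value, which is why depth, visibility, and airlight are coupled in a single image.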

Original language: English
Article number: 103753
Journal: International Journal of Applied Earth Observation and Geoinformation
Volume: 128
Publication status: Published - Apr 2024

Keywords

  • Airlight estimation
  • Bidirectional process of image synthesis and decomposition (BPISD)
  • Depth estimation
  • PM2.5 mass concentration estimation
  • Self-supervised learning
  • Visibility estimation

ASJC Scopus subject areas

  • Global and Planetary Change
  • Earth-Surface Processes
  • Computers in Earth Sciences
  • Management, Monitoring, Policy and Law
