TY - JOUR
T1 - FuseSeg: Semantic Segmentation of Urban Scenes Based on RGB and Thermal Data Fusion
AU - Sun, Yuxiang
AU - Zuo, Weixun
AU - Yun, Peng
AU - Wang, Hengli
AU - Liu, Ming
N1 - Funding Information:
Manuscript received January 9, 2020; revised April 1, 2020; accepted May 4, 2020. Date of publication June 4, 2020; date of current version July 2, 2021. This article was recommended for publication by Associate Editor C. Yang and Editor D. O. Popa upon evaluation of the reviewers’ comments. This work was supported in part by the National Natural Science Foundation of China under Project U1713211, and in part by the Research Grants Council of Hong Kong under Project 11210017. (Corresponding author: Ming Liu.) Yuxiang Sun, Weixun Zuo, Hengli Wang, and Ming Liu are with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]).
Publisher Copyright:
© 2021 IEEE.
PY - 2021/7
Y1 - 2021/7
N2 - Semantic segmentation of urban scenes is an essential component in various applications of autonomous driving, and it has made great progress with the rise of deep learning technologies. Most current semantic segmentation networks use single-modal sensory data, usually RGB images produced by visible-light cameras. However, the segmentation performance of these networks is prone to degrade under unsatisfactory lighting conditions, such as dim light or darkness. We find that thermal images produced by thermal imaging cameras are robust to such challenging lighting conditions. Therefore, in this article, we propose a novel RGB and thermal data fusion network named FuseSeg to achieve superior semantic segmentation performance in urban scenes. The experimental results demonstrate that our network outperforms the state-of-the-art networks. Note to Practitioners - This article investigates the problem of semantic segmentation of urban scenes under unsatisfactory lighting conditions. We provide a solution to this problem via information fusion of RGB and thermal data. We build an end-to-end deep neural network that takes as input a pair of RGB and thermal images and outputs pixel-wise semantic labels. Our network can be used for urban scene understanding, which serves as a fundamental component of many autonomous driving tasks, such as environment modeling, obstacle avoidance, motion prediction, and planning. Moreover, the simple design of our network allows it to be easily implemented in various deep learning frameworks, which facilitates applications on different hardware and software platforms.
AB - Semantic segmentation of urban scenes is an essential component in various applications of autonomous driving, and it has made great progress with the rise of deep learning technologies. Most current semantic segmentation networks use single-modal sensory data, usually RGB images produced by visible-light cameras. However, the segmentation performance of these networks is prone to degrade under unsatisfactory lighting conditions, such as dim light or darkness. We find that thermal images produced by thermal imaging cameras are robust to such challenging lighting conditions. Therefore, in this article, we propose a novel RGB and thermal data fusion network named FuseSeg to achieve superior semantic segmentation performance in urban scenes. The experimental results demonstrate that our network outperforms the state-of-the-art networks. Note to Practitioners - This article investigates the problem of semantic segmentation of urban scenes under unsatisfactory lighting conditions. We provide a solution to this problem via information fusion of RGB and thermal data. We build an end-to-end deep neural network that takes as input a pair of RGB and thermal images and outputs pixel-wise semantic labels. Our network can be used for urban scene understanding, which serves as a fundamental component of many autonomous driving tasks, such as environment modeling, obstacle avoidance, motion prediction, and planning. Moreover, the simple design of our network allows it to be easily implemented in various deep learning frameworks, which facilitates applications on different hardware and software platforms.
KW - Autonomous driving
KW - information fusion
KW - semantic segmentation
KW - thermal images
KW - urban scenes
UR - http://www.scopus.com/inward/record.url?scp=85112695465&partnerID=8YFLogxK
U2 - 10.1109/TASE.2020.2993143
DO - 10.1109/TASE.2020.2993143
M3 - Journal article
AN - SCOPUS:85112695465
SN - 1545-5955
VL - 18
SP - 1000
EP - 1011
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
IS - 3
M1 - 9108585
ER -