TY - GEN
T1 - NeRF-FCM: Feature Calibration Mechanisms for NeRF-based 3D Object Detection
AU - Goshu, Hana Lebeta
AU - Xiao, Jun
AU - Chan, Kin Chung
AU - Zhang, Cong
AU - Gemeda, Mulugeta Tegegn
AU - Lam, Kin Man
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/12
Y1 - 2024/12
N2 - With the rapid development of 3D vision, 3D object detection from posed RGB images has become increasingly popular and has attracted significant attention from researchers in recent years. Given the remarkable performance of Neural Radiance Fields (NeRF) in modeling 3D scenes, recent 3D detection methods utilizing posed RGB images generated by NeRF models have achieved promising results. However, NeRF-based models often suffer from poor generalization and are prone to generating inconsistent image content for unseen views, which inevitably degrades the performance of existing NeRF-based 3D detectors. In this paper, we propose an effective feature calibration method to enhance the performance of 3D detection models based on posed RGB images produced by NeRF models. Specifically, our proposed method efficiently recalibrates the 3D features extracted from the backbone network and adaptively computes the fusion weights based on the statistical properties of the features. Experiments show that our method significantly outperforms the baseline model, achieving improvements of +8.6, +5.5, and +5.1 mAP on the Hypersim, 3D-FRONT, and ScanNet benchmarks, respectively, with anchor-free heads. Notably, compared with the baseline model, our method predicts 3D bounding boxes in 3D space more accurately, even when objects are poorly reconstructed by NeRF, while keeping computational costs low with only a minimal increase in model complexity.
KW - 3D Object Detection
KW - Channel Attention
KW - Multi-View
KW - NeRF
UR - https://www.scopus.com/pages/publications/85218201083
U2 - 10.1109/APSIPAASC63619.2025.10849009
DO - 10.1109/APSIPAASC63619.2025.10849009
M3 - Conference article published in proceeding or book
AN - SCOPUS:85218201083
T3 - APSIPA ASC 2024 - Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2024
SP - 1
EP - 6
BT - APSIPA ASC 2024 - Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024
Y2 - 3 December 2024 through 6 December 2024
ER -