TY - JOUR
T1 - Confidence-based Large-scale Dense Multi-view Stereo
AU - Li, Zhaoxin
AU - Zuo, Wangmeng
AU - Wang, Zhaoqi
AU - Zhang, Lei
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/6
Y1 - 2020/6
N2 - Although remarkable progress has been made in improving the accuracy and completeness of multi-view stereo (MVS), existing methods still suffer from either sparse reconstructions of low-textured surfaces or a heavy computational burden. In this paper, we propose a Confidence-based Large-scale Dense Multi-view Stereo (CLD-MVS) method for high-resolution imagery. First, we formulate MVS as a multi-view depth estimation problem and employ a normal-aware, efficient PatchMatch stereo to estimate an initial depth and normal map for each reference view. A self-supervised deep learning method is then developed to predict the spatial confidence of the multi-view depth maps, which is combined with cross-view consistency to generate ground control points. Subsequently, a confidence-driven, boundary-aware interpolation scheme using static and dynamic guidance is adopted to synthesize dense depth and normal maps. Finally, a refinement procedure that leverages the synthesized depth and normal maps as priors is conducted to estimate a cross-view consistent surface. Experiments show that the proposed CLD-MVS method achieves high geometric completeness while preserving fine-scale details. In particular, it ranked No. 1 on the ETH3D high-resolution MVS benchmark in terms of F1-score.
AB - Although remarkable progress has been made in improving the accuracy and completeness of multi-view stereo (MVS), existing methods still suffer from either sparse reconstructions of low-textured surfaces or a heavy computational burden. In this paper, we propose a Confidence-based Large-scale Dense Multi-view Stereo (CLD-MVS) method for high-resolution imagery. First, we formulate MVS as a multi-view depth estimation problem and employ a normal-aware, efficient PatchMatch stereo to estimate an initial depth and normal map for each reference view. A self-supervised deep learning method is then developed to predict the spatial confidence of the multi-view depth maps, which is combined with cross-view consistency to generate ground control points. Subsequently, a confidence-driven, boundary-aware interpolation scheme using static and dynamic guidance is adopted to synthesize dense depth and normal maps. Finally, a refinement procedure that leverages the synthesized depth and normal maps as priors is conducted to estimate a cross-view consistent surface. Experiments show that the proposed CLD-MVS method achieves high geometric completeness while preserving fine-scale details. In particular, it ranked No. 1 on the ETH3D high-resolution MVS benchmark in terms of F1-score.
KW - confidence
KW - interpolation
KW - large-scale
KW - Multi-view stereo
KW - refinement
KW - static and dynamic guidance
UR - http://www.scopus.com/inward/record.url?scp=85088031885&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.2999853
DO - 10.1109/TIP.2020.2999853
M3 - Journal article
SN - 1057-7149
VL - 29
SP - 7176
EP - 7191
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9112642
ER -