Deep Surface Normal Estimation with Hierarchical RGB-D Fusion
Beijing Institute of Technology
Abstract
The growing availability of commodity RGB-D cameras has boosted applications in scene understanding. However, surface normal estimation from RGB-D data, a fundamental scene understanding task, has not been thoroughly investigated. In this paper, we propose a hierarchical fusion network with adaptive feature re-weighting for surface normal estimation from a single RGB-D image. Specifically, features from the color image and the depth map are successively integrated at multiple scales to ensure global surface smoothness while preserving visually salient details. Meanwhile, the depth features are re-weighted with a confidence map estimated from the depth input before being merged into the color branch, which avoids artifacts caused by corrupted input depth. Additionally, a hybrid multi-scale loss function is designed to learn accurate normal estimation from a noisy ground-truth dataset. Extensive experimental results validate the effectiveness of the fusion strategy and the loss design, and the proposed method outperforms state-of-the-art normal estimation schemes.
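The re-weighting and multi-scale fusion described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' network: the function names (`reweight_and_fuse`, `downsample2x`, `hierarchical_fuse`), the channel-wise concatenation, and the use of average pooling to obtain coarser scales are all assumptions made for illustration. In the paper the features and the confidence map are produced by learned layers.

```python
import numpy as np


def reweight_and_fuse(color_feat, depth_feat, confidence):
    """Scale depth features by a per-pixel confidence, then merge with color.

    color_feat, depth_feat: arrays of shape (C, H, W).
    confidence: array of shape (H, W), expected in [0, 1].
    Concatenation along channels is an assumed merge operator.
    """
    conf = np.clip(confidence, 0.0, 1.0)[None, :, :]  # broadcast over channels
    weighted_depth = depth_feat * conf  # low-confidence depth is suppressed
    return np.concatenate([color_feat, weighted_depth], axis=0)


def downsample2x(x):
    """Average-pool the last two axes by a factor of 2 (H and W must be even)."""
    *lead, h, w = x.shape
    return x.reshape(*lead, h // 2, 2, w // 2, 2).mean(axis=(-3, -1))


def hierarchical_fuse(color_feat, depth_feat, confidence, num_scales=3):
    """Fuse color and re-weighted depth features at several scales.

    Pooling stands in for the encoder stages that would produce
    multi-scale features in a real network (an assumption of this sketch).
    Returns a list of fused arrays, finest scale first.
    """
    fused = []
    for _ in range(num_scales):
        fused.append(reweight_and_fuse(color_feat, depth_feat, confidence))
        color_feat = downsample2x(color_feat)
        depth_feat = downsample2x(depth_feat)
        confidence = downsample2x(confidence)
    return fused


if __name__ == "__main__":
    color = np.ones((4, 8, 8))
    depth = np.full((4, 8, 8), 2.0)
    conf = np.zeros((8, 8))
    conf[:4, :] = 1.0  # pretend only the top half of the depth map is reliable
    for level in hierarchical_fuse(color, depth, conf):
        print(level.shape)
```

Where the confidence is zero, the depth channels contribute nothing to the fused features, so the result in those regions is driven by color alone. That is the intuition behind re-weighting depth before it is merged into the color branch.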
Architecture

Visual Results
ScanNet Dataset
Matterport Dataset
Materials
Paper
Supplementary
Code
Source code
Citation
@inproceedings{zeng2019deep,
  title={Deep Surface Normal Estimation with Hierarchical RGB-D Fusion},
  author={Zeng, Jin and Tong, Yanfeng and Huang, Yunmu and Yan, Qiong and Sun, Wenxiu and Chen, Jing and Wang, Yongtian},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}
Contact
Jin Zeng, zengjin@sensetime.com