Deep Surface Normal Estimation with Hierarchical RGB-D Fusion

Jin Zeng¹ Yanfeng Tong¹,²& Yunmu Huang¹& Qiong Yan¹ Wenxiu Sun¹ Jing Chen² Yongtian Wang²

¹SenseTime Research
²Beijing Institute of Technology

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019)

Abstract

The growing availability of commodity RGB-D cameras has boosted applications in scene understanding. However, surface normal estimation from RGB-D data, a fundamental scene understanding task, has not been thoroughly investigated. In this paper, a hierarchical fusion network with adaptive feature re-weighting is proposed for surface normal estimation from a single RGB-D image. Specifically, features from the color image and the depth map are successively integrated at multiple scales to ensure global surface smoothness while preserving visually salient details. Meanwhile, the depth features are re-weighted with a confidence map estimated from depth before being merged into the color branch, to avoid artifacts caused by corrupted input depth. Additionally, a hybrid multi-scale loss function is designed to learn accurate normal estimation from a noisy ground-truth dataset. Extensive experimental results validate the effectiveness of the fusion strategy and the loss design, and the method outperforms state-of-the-art normal estimation schemes.

Architecture
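
The fusion idea described in the abstract (depth features re-weighted by a confidence map, then merged into the color branch at several scales) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the nearest-neighbour resizing, the channel concatenation, and especially the confidence map (here a simple validity mask where depth > 0) are assumptions made for illustration. The page states only that the confidence map is estimated from depth.

```python
import numpy as np

def resize_nearest(x, h, w):
    """Nearest-neighbour resize of an (H, W, C) array to (h, w, C)."""
    rows = np.arange(h) * x.shape[0] // h
    cols = np.arange(w) * x.shape[1] // w
    return x[rows][:, cols]

def depth_confidence(depth):
    """Placeholder confidence map (assumption): 1 where depth is valid, else 0."""
    return (depth > 0).astype(np.float32)[..., None]

def fuse(color_feat, depth_feat, conf):
    """Down-weight depth features by confidence, then append them to the color features."""
    return np.concatenate([color_feat, conf * depth_feat], axis=-1)

def hierarchical_fusion(color_feats, depth_feats, depth):
    """Fuse per-scale color and depth features, resizing the confidence map to each scale."""
    conf_full = depth_confidence(depth)
    fused = []
    for c, d in zip(color_feats, depth_feats):
        h, w = c.shape[:2]
        fused.append(fuse(c, d, resize_nearest(conf_full, h, w)))
    return fused
```

With a depth map that has a hole (zeros) in one corner, the depth channels of the fused features are suppressed in that corner at every scale, while the color channels pass through unchanged.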

Visual Results

ScanNet Dataset

Figure panels: RGB input, Depth input, Ground-truth, Skip-Net, Zhang’s, Levin’s, DC, GeoNet-D, GFMM, HFM-Net

Matterport Dataset

Figure panels: RGB input, Depth input, Ground-truth, Skip-Net, Zhang’s, Levin’s, DC, GeoNet-D, GFMM, HFM-Net

Materials

Paper

Supplementary

Code


Citation

@inproceedings{zeng2019deep,
  title={Deep Surface Normal Estimation with Hierarchical RGB-D Fusion},
  author={Zeng, Jin and Tong, Yanfeng and Huang, Yunmu and Yan, Qiong and Sun, Wenxiu and Chen, Jing and Wang, Yongtian},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Contact

Jin Zeng, zengjin@sensetime.com