MonoRelief V2: Leveraging Real Data for High-Fidelity Monocular Relief Recovery
The paper addresses monocular relief recovery, a problem relevant to computer vision applications, but its contribution is incremental: it extends its predecessor mainly by adding real training data.
This paper tackles the problem of recovering 2.5D reliefs from single images under complex conditions by introducing MonoRelief V2, which leverages real data to achieve state-of-the-art performance in depth and normal predictions.
This paper presents MonoRelief V2, an end-to-end model designed for directly recovering 2.5D reliefs from single images under complex material and illumination variations. In contrast to its predecessor, MonoRelief V1 [1], which was trained solely on synthetic data, MonoRelief V2 incorporates real data to achieve improved robustness, accuracy and efficiency. To overcome the difficulty of acquiring a large-scale real-world dataset, we generate approximately 15,000 pseudo-real images using a text-to-image generative model, and derive the corresponding depth pseudo-labels by fusing depth and normal predictions. Furthermore, we construct a small-scale real-world dataset (800 samples) via multi-view reconstruction and detail refinement. MonoRelief V2 is then progressively trained on the pseudo-real and real-world datasets. Comprehensive experiments demonstrate its state-of-the-art performance in both depth and normal prediction, highlighting its strong potential for a range of downstream applications. Code is at: https://github.com/glp1001/MonoreliefV2.
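The abstract does not say how depth and normal predictions are fused into pseudo-labels. As a rough illustration of the general idea only, the sketch below works on a single 1D scanline: it finds the height profile that stays close to a coarse depth estimate while matching slopes derived from normals, by solving a small least-squares problem. The function name `fuse_depth_with_gradients`, the weight `lam`, and the 1D simplification are all assumptions made for this example and are not taken from the paper or its code.

```python
def fuse_depth_with_gradients(depth, grad, lam=0.1):
    """Illustrative 1D depth/normal fusion (NOT the paper's method).

    Minimises  lam * sum_i (z[i] - depth[i])**2
             + sum_i (z[i+1] - z[i] - grad[i])**2
    where grad[i] is the slope between samples i and i+1. For a 2D normal
    (nx, nz) of a height profile, that slope would be -nx / nz.
    """
    n = len(depth)
    if n == 0 or len(grad) != n - 1:
        raise ValueError("grad must have exactly len(depth) - 1 entries")
    if lam <= 0:
        raise ValueError("lam must be positive so the system is well posed")

    # Normal equations form a tridiagonal system with off-diagonals of -1.
    diag, rhs = [], []
    for i in range(n):
        has_left, has_right = i > 0, i < n - 1
        diag.append(lam + has_left + has_right)
        r = lam * depth[i]
        if has_left:
            r += grad[i - 1]
        if has_right:
            r -= grad[i]
        rhs.append(r)

    # Thomas algorithm: forward elimination ...
    c = [0.0] * n
    d = [0.0] * n
    c[0] = -1.0 / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] + c[i - 1]  # diag - sub * c_prev, with sub = -1
        c[i] = -1.0 / m if i < n - 1 else 0.0
        d[i] = (rhs[i] + d[i - 1]) / m  # rhs - sub * d_prev, with sub = -1

    # ... then back substitution.
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


# Coarse depth is flat, but the slopes say the surface rises steadily.
fused = fuse_depth_with_gradients([0.0, 0.0, 0.0], [1.0, 1.0], lam=0.01)
```

With a small `lam`, the fused profile follows the slope information and uses the coarse depth mainly to fix the overall offset; a larger `lam` pulls the result back toward the coarse depth. A 2D version would solve the analogous sparse system over image pixels.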