Pygmalion Effect in Vision: Image-to-Clay Translation for Reflective Geometry Reconstruction
This work addresses a long-standing problem in computer vision, the 3D reconstruction of reflective objects, which matters for applications such as robotics and AR/VR, though it is an incremental advance in reflection-handling techniques.
The paper tackles 3D reconstruction from multi-view images with complex reflections by introducing a framework that translates reflective objects into clay-like forms to suppress specular cues. The authors report substantial improvements in normal accuracy and mesh completeness over existing reflection-handling methods.
Understanding reflection remains a long-standing challenge in 3D reconstruction due to the entanglement of appearance and geometry under view-dependent reflections. In this work, we present the Pygmalion Effect in Vision, a novel framework that metaphorically "sculpts" reflective objects into clay-like forms through image-to-clay translation. Inspired by the myth of Pygmalion, our method learns to suppress specular cues while preserving intrinsic geometric consistency, enabling robust reconstruction from multi-view images containing complex reflections. Specifically, we introduce a dual-branch network in which a BRDF-based reflective branch is complemented by a clay-guided branch that stabilizes geometry and refines surface normals. The two branches are trained jointly using the synthesized clay-like images, which provide a neutral, reflection-free supervision signal that complements the reflective views. Experiments on both synthetic and real datasets demonstrate substantial improvement in normal accuracy and mesh completeness over existing reflection-handling methods. Beyond technical gains, our framework reveals that seeing by unshining, translating radiance into neutrality, can serve as a powerful inductive bias for reflective object geometry learning.
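To make the joint training idea concrete, here is a minimal sketch of how a two-branch objective and a normal-accuracy metric could look. It is not the authors' implementation: the abstract does not state the loss terms, so the function `joint_loss`, its weights `w_clay` and `w_normal`, the normal-consistency term, and the choice of mean angular error as the accuracy metric are all assumptions made for illustration.

```python
import numpy as np


def normalize(v, eps=1e-8):
    """Scale each vector along the last axis to unit length."""
    return v / np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)


def mean_angular_error_deg(pred_normals, ref_normals):
    """Mean angle in degrees between predicted and reference normals.

    A common way to score normal accuracy; assumed here, since the
    abstract does not name the metric it uses.
    """
    cos = np.sum(normalize(pred_normals) * normalize(ref_normals), axis=-1)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())


def joint_loss(refl_render, refl_image, clay_render, clay_image,
               refl_normals, clay_normals, w_clay=1.0, w_normal=0.1):
    """Hypothetical joint objective for the two branches.

    refl_*: rendering of the reflective branch vs. the captured image.
    clay_*: rendering of the clay-guided branch vs. the synthesized
            clay-like image (the reflection-free supervision signal).
    The weights and the normal-consistency term are assumptions.
    """
    # Photometric error of the reflective branch against the real view.
    l_refl = np.mean((refl_render - refl_image) ** 2)
    # Photometric error of the clay branch against the clay-like image.
    l_clay = np.mean((clay_render - clay_image) ** 2)
    # Encourage both branches to agree on surface normals.
    cos = np.sum(normalize(refl_normals) * normalize(clay_normals), axis=-1)
    l_normal = np.mean(1.0 - cos)
    return float(l_refl + w_clay * l_clay + w_normal * l_normal)
```

In this sketch the clay term carries the reflection-free supervision, while the normal-consistency term is one plausible way for the clay-guided branch to stabilize the geometry of the reflective branch. A real system would compute these terms on differentiable renderings inside an autodiff framework rather than on NumPy arrays.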