Mask-Conditioned Voxel Diffusion for Joint Geometry and Color Inpainting
This work addresses the digital restoration of cultural heritage artifacts and represents an incremental improvement in 3D inpainting methods.
The paper tackles joint geometry and color inpainting for damaged 3D objects, such as cultural heritage artifacts. It introduces a two-stage framework that uses mask-conditioned diffusion and, at a fixed 32^3 resolution, achieves more complete geometry and more coherent color reconstructions than symmetry-based baselines.
We present a lightweight two-stage framework for joint geometry and color inpainting of damaged 3D objects, motivated by the digital restoration of cultural heritage artifacts. The pipeline separates damage localization from reconstruction. In the first stage, a 2D convolutional network predicts damage masks on RGB slices extracted from a voxelized object, and these predictions are aggregated into a volumetric mask. In the second stage, a diffusion-based 3D U-Net performs mask-conditioned inpainting directly on voxel grids, reconstructing geometry and color while preserving observed regions. The model jointly predicts occupancy and color using a composite objective that combines occupancy reconstruction with masked color reconstruction and perceptual regularization. We evaluate the approach on a curated set of textured artifacts with synthetically generated damage using standard geometric and color metrics. Compared to symmetry-based baselines, our method produces more complete geometry and more coherent color reconstructions at a fixed 32^3 resolution. Overall, the results indicate that explicit mask conditioning is a practical way to guide volumetric diffusion models for joint 3D geometry and color inpainting.
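The three mechanisms named in the abstract (aggregating slice predictions into a volumetric mask, preserving observed regions during inpainting, and combining occupancy and masked color terms in one objective) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the single slicing axis, the 0.5 threshold, binary cross-entropy for occupancy, an L1 color term restricted to the damaged region, and the `color_weight` parameter. The perceptual regularization term and the diffusion model itself are omitted.

```python
import numpy as np


def stack_slice_masks(slice_probs, threshold=0.5):
    """Stack per-slice 2D damage probabilities into a boolean volume mask.

    Assumption: slices were extracted along one axis, in order, so
    stacking them along axis 0 restores the voxel grid layout.
    """
    volume_probs = np.stack(slice_probs, axis=0)  # (D, H, W)
    return volume_probs >= threshold


def composite_inpainting(observed, generated, mask):
    """Keep observed voxels outside the mask and generated voxels inside it.

    observed, generated: (C, D, H, W) arrays; mask: (D, H, W) boolean.
    """
    return np.where(mask[None], generated, observed)


def composite_loss(occ_pred, occ_true, rgb_pred, rgb_true, mask,
                   color_weight=1.0):
    """Occupancy BCE plus L1 color error averaged over masked voxels only.

    Assumption: the specific loss forms and the weighting are
    placeholders; the perceptual term is left out.
    """
    eps = 1e-7
    p = np.clip(occ_pred, eps, 1.0 - eps)
    bce = -np.mean(occ_true * np.log(p) + (1.0 - occ_true) * np.log(1.0 - p))

    m = mask[None].astype(float)                   # (1, D, H, W)
    n_terms = max(m.sum() * rgb_pred.shape[0], 1.0)  # masked voxels x channels
    masked_l1 = np.sum(np.abs(rgb_pred - rgb_true) * m) / n_terms
    return bce + color_weight * masked_l1
```

In a sampling loop, a compositing step like `composite_inpainting` would typically be applied so that voxels outside the predicted damage mask stay equal to the input; whether the paper applies it at every denoising step or only at the end is not stated in the abstract.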