CV · GR · Jun 19, 2024

Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images

arXiv:2406.13393v3 · 20 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of applying artistic styles to 3D scenes, which is relevant to practitioners in graphics and AI. The advance is incremental because it builds on existing NeRF and diffusion methods.

The paper tackles 3D style transfer by using a style-aligned diffusion model to generate multi-view images and refining a NeRF model with a sliced Wasserstein loss, achieving competitive quality in transferring diverse artistic styles to real-world 3D scenes.

We propose a simple yet effective pipeline for stylizing a 3D scene, harnessing the power of 2D image diffusion models. Given a NeRF model reconstructed from a set of multi-view images, we perform 3D style transfer by refining the source NeRF model using stylized images generated by a style-aligned image-to-image diffusion model. Given a target style prompt, we first generate perceptually similar multi-view images by leveraging a depth-conditioned diffusion model with an attention-sharing mechanism. Next, based on the stylized multi-view images, we propose to guide the style transfer process with the sliced Wasserstein loss based on the feature maps extracted from a pre-trained CNN model. Our pipeline consists of decoupled steps, allowing users to test various prompt ideas and preview the stylized 3D result before proceeding to the NeRF fine-tuning stage. We demonstrate that our method can transfer diverse artistic styles to real-world 3D scenes with competitive quality. Result videos are also available on our project page: https://haruolabs.github.io/style-n2n/
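The sliced Wasserstein loss mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `sliced_wasserstein_loss`, the `num_projections` parameter, and the `(N, C)` array layout are assumptions made for illustration, and plain arrays stand in for the CNN feature maps (the abstract does not say which pre-trained CNN or layers are used). The idea is to project both feature sets onto random unit directions, sort each 1D projection, and compare the sorted values.

```python
import numpy as np


def sliced_wasserstein_loss(feat_a, feat_b, num_projections=64, rng=None):
    """Approximate squared sliced Wasserstein-2 distance between feature sets.

    feat_a, feat_b: arrays of shape (N, C), e.g. N spatial locations of a
    feature map with C channels. Both must have the same shape
    (an assumption of this sketch, so sorted samples can be paired up).
    """
    rng = np.random.default_rng() if rng is None else rng
    feat_a = np.asarray(feat_a, dtype=np.float64)
    feat_b = np.asarray(feat_b, dtype=np.float64)
    if feat_a.ndim != 2 or feat_a.shape != feat_b.shape:
        raise ValueError("expected two arrays with the same (N, C) shape")

    channels = feat_a.shape[1]
    # Draw random directions and normalize each column to unit length.
    directions = rng.normal(size=(channels, num_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # Project to 1D along each direction, then sort each projection.
    # In 1D, optimal transport pairs the i-th smallest values of both sets.
    proj_a = np.sort(feat_a @ directions, axis=0)
    proj_b = np.sort(feat_b @ directions, axis=0)

    # Mean squared difference over samples and projections.
    return float(np.mean((proj_a - proj_b) ** 2))


if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    rendered = demo_rng.normal(size=(256, 8))            # stand-in for features of a NeRF render
    stylized = demo_rng.normal(loc=2.0, size=(256, 8))   # stand-in for features of a stylized image
    print(sliced_wasserstein_loss(rendered, stylized, rng=demo_rng))
```

Because each projection is sorted before comparison, the loss ignores where a feature occurs spatially and compares only the distribution of feature values, which is why this kind of loss suits style matching. In an actual fine-tuning loop the same computation would need to be written in a differentiable framework so gradients can reach the NeRF parameters.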

