CV · Nov 11, 2025

Twist and Compute: The Cost of Pose in 3D Generative Diffusion

Kyle Fogarty, Jack Foster, Boqiao Zhang, Jing Yang, Cengiz Öztireli

arXiv:2511.08203v1 · 3.6 · h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses a robustness issue in image-conditioned 3D generation that matters for applications requiring multi-view consistency. The advance is incremental, since it patches an existing model rather than redesigning it.

The paper identifies a canonical view bias in image-conditioned 3D generative models, shows that the performance of the Hunyuan3D 2.0 model degrades under rotated inputs, and mitigates this with a lightweight CNN that detects and corrects input orientation, restoring performance without modifying the generative backbone.

Despite their impressive results, large-scale image-to-3D generative models remain opaque in their inductive biases. We identify a significant limitation in image-conditioned 3D generative models: a strong canonical view bias. Through controlled experiments using simple 2D rotations, we show that the state-of-the-art Hunyuan3D 2.0 model can struggle to generalize across viewpoints, with performance degrading under rotated inputs. We show that this failure can be mitigated by a lightweight CNN that detects and corrects input orientation, restoring model performance without modifying the generative backbone. Our findings raise an important open question: Is scale enough, or should we pursue modular, symmetry-aware designs?
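The detect-and-correct idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes rotations are restricted to multiples of 90 degrees, represents an image as a plain nested list, and replaces the lightweight CNN with a toy heuristic (`predict_quarter_turns`) that assumes canonical images are brightest along the bottom edge. The names `rotate90`, `correct_orientation`, `generate_with_correction`, and the `backbone` callable are all hypothetical.

```python
from typing import Any, Callable, List

# Row-major grayscale grid; a stand-in for a real image tensor (assumption).
Image = List[List[float]]


def rotate90(img: Image, k: int) -> Image:
    """Rotate img counter-clockwise by k quarter turns."""
    out = [row[:] for row in img]
    for _ in range(k % 4):
        # One counter-clockwise quarter turn: transpose, then reverse row order.
        out = [list(col) for col in zip(*out)][::-1]
    return out


def predict_quarter_turns(img: Image) -> int:
    """Toy stand-in for the orientation-detection CNN (hypothetical).

    Assumes a canonical image has its brightest edge at the bottom and
    returns how many counter-clockwise quarter turns were applied to it.
    """
    edge_sums = {
        0: sum(img[-1]),                 # bottom edge: canonical pose
        1: sum(row[-1] for row in img),  # right edge: one turn applied
        2: sum(img[0]),                  # top edge: two turns applied
        3: sum(row[0] for row in img),   # left edge: three turns applied
    }
    return max(edge_sums, key=edge_sums.get)


def correct_orientation(img: Image, predictor: Callable[[Image], int]) -> Image:
    """Undo the predicted rotation so the image is back in canonical pose."""
    return rotate90(img, -predictor(img))


def generate_with_correction(
    img: Image,
    backbone: Callable[[Image], Any],
    predictor: Callable[[Image], int],
) -> Any:
    """Run an unmodified generative backbone on the orientation-corrected input."""
    return backbone(correct_orientation(img, predictor))
```

The point of the design is that the correction step wraps the generator rather than altering it: `backbone` is called as-is, and only its input changes.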
