Memorization in 3D Shape Generation: An Empirical Study
This paper addresses data privacy and output diversity in 3D shape generation and is aimed at researchers and practitioners. Its contribution is incremental: it builds on existing methods and adds new analysis.
The paper tackles the problem of quantifying memorization in 3D generative models, with the aim of preventing training data leakage and improving diversity. It finds that memorization depends on data modality and on modeling factors such as guidance scale, and that longer Vecsets and rotation augmentation reduce memorization without degrading generation quality.
Generative models are increasingly used in 3D vision to synthesize novel shapes, yet it remains unclear whether their generation relies on memorizing training shapes. Understanding their memorization could help prevent training data leakage and improve the diversity of generated results. In this paper, we design an evaluation framework to quantify memorization in 3D generative models and study the influence of different data and modeling designs on memorization. We first apply our framework to quantify memorization in existing methods. Next, through controlled experiments with a latent vector-set (Vecset) diffusion model, we find that, on the data side, memorization depends on data modality, and increases with data diversity and finer-grained conditioning; on the modeling side, it peaks at a moderate guidance scale and can be mitigated by longer Vecsets and simple rotation augmentation. Together, our framework and analysis provide an empirical understanding of memorization in 3D generative models and suggest simple yet effective strategies to reduce it without degrading generation quality. Our code is available at https://github.com/zlab-princeton/3d_mem.
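To make the idea of quantifying memorization concrete, the following is a minimal sketch of one common way to flag near-copies: compare each generated shape with its nearest training shape. It is not the framework proposed in the paper, whose metric is not specified in this abstract. The point-cloud representation, the use of Chamfer distance, the function names, and the threshold value are all assumptions made for illustration.

```python
import math
import random


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point clouds (lists of (x, y, z))."""
    def one_sided(src, dst):
        # Mean distance from each point in src to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_sided(a, b) + one_sided(b, a)


def nearest_train_distance(sample, train_set):
    """Distance from a generated shape to its closest training shape."""
    return min(chamfer_distance(sample, t) for t in train_set)


def memorized_fraction(generated, train_set, threshold):
    """Fraction of generated shapes closer than `threshold` to some training shape.

    The threshold is a hypothetical hyperparameter, not a value from the paper.
    """
    flagged = sum(nearest_train_distance(g, train_set) < threshold for g in generated)
    return flagged / len(generated)


def random_cloud(rng, n=32):
    """Toy stand-in for a shape: n points drawn uniformly from the unit cube."""
    return [(rng.random(), rng.random(), rng.random()) for _ in range(n)]


# Toy data: five "training" shapes, one near-copy of the first, one unrelated shape.
rng = random.Random(0)
train = [random_cloud(rng) for _ in range(5)]
copied = [tuple(c + rng.gauss(0, 1e-3) for c in p) for p in train[0]]
novel = random_cloud(rng)

print(memorized_fraction([copied, novel], train, threshold=0.05))
```

In this toy setup the near-copy sits far closer to the training set than the unrelated shape, so a distance threshold separates the two. A real evaluation would need a shape representation and distance suited to the model under study, and brute-force search as written scales poorly with dataset size.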