CVAIGRLGNov 28, 2023

Generative Models: What Do They Know? Do They Know Things? Let's Find Out!

arXiv:2311.17137v350 citationsh-index: 16
Originality Incremental advance
AI Analysis

This work addresses the challenge of extracting interpretable knowledge from generative models for researchers in computer vision and AI, offering a lightweight and efficient method, though it is incremental in building on existing LoRA techniques.

The paper tackled the problem of understanding what intrinsic scene properties generative models encode and developed a framework using Low-Rank Adaptors (LoRA) to recover intrinsic images like depth, normals, albedo, and shading across different generative architectures, achieving this with as few as 0.04% of model weights and 250 labeled images.

Generative models excel at mimicking real scenes, suggesting they might inherently encode important intrinsic scene properties. In this paper, we aim to explore the following key questions: (1) What intrinsic knowledge do generative models like GANs, Autoregressive models, and Diffusion models encode? (2) Can we establish a general framework to recover intrinsic representations from these models, regardless of their architecture or model type? (3) How minimal can the required learnable parameters and labeled data be to successfully recover this knowledge? (4) Is there a direct link between the quality of a generative model and the accuracy of the recovered scene intrinsics? Our findings indicate that a small Low-Rank Adaptators (LoRA) can recover intrinsic images-depth, normals, albedo and shading-across different generators (Autoregressive, GANs and Diffusion) while using the same decoder head that generates the image. As LoRA is lightweight, we introduce very few learnable parameters (as few as 0.04% of Stable Diffusion model weights for a rank of 2), and we find that as few as 250 labeled images are enough to generate intrinsic images with these LoRA modules. Finally, we also show a positive correlation between the generative model's quality and the accuracy of the recovered intrinsics through control experiments.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes