Information-Regularized Constrained Inversion for Stable Avatar Editing from Sparse Supervision
This work addresses the challenge of stable avatar editing for applications in animation and virtual reality, though it appears incremental because it builds on existing avatar reconstruction methods.
The paper tackles the identity leakage and temporal flicker that arise when animatable human avatars are edited from sparse supervision, such as a few edited keyframes. It proposes a conditioning-guided framework that performs editing as constrained inversion in a structured latent space, which improves stability under limited supervision.
Editing animatable human avatars typically relies on sparse supervision, often a few edited keyframes, yet naively fitting a reconstructed avatar to these edits frequently causes identity leakage and pose-dependent temporal flicker. We argue that these failures are best understood as an ill-conditioned inversion: the available edited constraints do not sufficiently determine the latent directions responsible for the intended edit. We propose a conditioning-guided edited reconstruction framework that performs editing as a constrained inversion in a structured avatar latent space, restricting updates to a low-dimensional, part-specific edit subspace to prevent unintended identity changes. Crucially, we design the editing constraints during inversion by optimizing a conditioning objective derived from a local linearization of the full decoding-and-rendering pipeline, yielding an edit-subspace information matrix whose spectrum predicts stability and drives frame reweighting / keyframe activation. The resulting method operates on small subspace matrices and can be implemented efficiently (e.g., via Hessian-vector products), and improves stability under limited edited supervision.
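To make the idea of an edit-subspace information matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each frame's linearized decode-and-render Jacobian is available as an explicit array (the abstract suggests Hessian-vector products would be used in practice), that the edit subspace is given by a basis matrix, and that frames are activated greedily to maximize the smallest eigenvalue of the accumulated matrix. The function names `subspace_information` and `greedy_keyframes`, the `damping` parameter, and the smallest-eigenvalue criterion are all illustrative assumptions.

```python
import numpy as np


def subspace_information(jacobians, basis, weights):
    """Weighted Gauss-Newton information matrix restricted to the edit subspace.

    jacobians: list of (m_i, d) arrays, one per frame (assumed given explicitly).
    basis:     (d, k) array whose columns span the edit subspace.
    weights:   per-frame nonnegative weights.
    """
    k = basis.shape[1]
    info = np.zeros((k, k))
    for J, w in zip(jacobians, weights):
        JB = J @ basis                 # project the frame Jacobian onto the subspace
        info += w * (JB.T @ JB)        # accumulate the small k x k matrix
    return info


def greedy_keyframes(jacobians, basis, budget, damping=1e-6):
    """Greedily activate frames that most raise the smallest eigenvalue.

    This criterion is an assumption chosen for illustration; the source only
    states that the spectrum drives frame reweighting / keyframe activation.
    """
    k = basis.shape[1]
    per_frame = [(J @ basis).T @ (J @ basis) for J in jacobians]
    info = damping * np.eye(k)         # small damping keeps the matrix positive definite
    chosen = []
    for _ in range(min(budget, len(per_frame))):
        best, best_val = None, -np.inf
        for i, H in enumerate(per_frame):
            if i in chosen:
                continue
            val = np.linalg.eigvalsh(info + H)[0]   # eigenvalues are sorted ascending
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
        info = info + per_frame[best]
    return chosen, info


# Toy usage: a 4-D latent space with a 2-D edit subspace.
basis = np.eye(4)[:, :2]
jacobians = [
    np.array([[1.0, 0.0, 0.0, 0.0]]),  # frame 0 observes edit direction 0
    np.array([[1.0, 0.0, 0.0, 0.0]]),  # frame 1 is redundant with frame 0
    np.array([[0.0, 1.0, 0.0, 0.0]]),  # frame 2 observes edit direction 1
]
info_all = subspace_information(jacobians, basis, weights=[1.0, 1.0, 1.0])
chosen, info_sel = greedy_keyframes(jacobians, basis, budget=2)
```

In this toy setup, a budget of two frames is enough to constrain both edit directions only if frame 2 is among those selected. Picking the two redundant frames would leave one direction determined by the damping term alone, which is the ill-conditioned case the abstract describes.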