Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction
This work addresses domain-specific attributes, such as camera parameters and occlusions, that degrade human mesh reconstruction in new domains, and it represents an incremental improvement over existing methods.
The paper tackles the problem of adapting a pre-trained human mesh reconstruction model to out-of-domain streaming videos. It dynamically fine-tunes the model with temporal constraints to mitigate domain gaps without overfitting, and it achieves state-of-the-art results on two benchmarks.
This paper considers a new problem: adapting a pre-trained human mesh reconstruction model to out-of-domain streaming videos. Most previous methods based on the parametric SMPL model \cite{loper2015smpl} underperform in new domains with unexpected, domain-specific attributes, such as camera parameters, bone lengths, backgrounds, and occlusions. Our general idea is to dynamically fine-tune the source model on test video streams with additional temporal constraints, so that it mitigates the domain gaps without overfitting to the 2D information of individual test frames. A subsequent challenge is how to avoid conflicts between the 2D and temporal constraints. We tackle this problem with a new training algorithm named Bilevel Online Adaptation (BOA), which divides the overall multi-objective optimization into two steps within each training iteration: weight probe and weight update. We demonstrate that BOA leads to state-of-the-art results on two human mesh reconstruction benchmarks.
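The two-step iteration described above can be illustrated with a minimal sketch. The abstract does not say which loss drives each step, so the sketch assumes the following: the probe step takes a trial gradient step on the per-frame 2D loss, and the update step evaluates the gradient of the combined (2D plus temporal) loss at the probed weights and applies it to the original weights. The function names, the toy quadratic losses, and the learning rates are all hypothetical and stand in for a real mesh reconstruction model.

```python
from typing import Callable, List

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def boa_step(
    weights: Vector,
    grad_frame: GradFn,
    grad_total: GradFn,
    lr_probe: float,
    lr_update: float,
) -> Vector:
    """One hypothetical bilevel iteration: weight probe, then weight update."""
    # Step 1 (weight probe, assumed): trial step on the per-frame 2D loss only.
    probed = [w - lr_probe * g for w, g in zip(weights, grad_frame(weights))]
    # Step 2 (weight update, assumed): gradient of the combined loss is
    # evaluated at the probed weights but applied to the original weights.
    g_total = grad_total(probed)
    return [w - lr_update * g for w, g in zip(weights, g_total)]


def make_quadratic_grad(target: Vector) -> GradFn:
    """Gradient of the toy loss 0.5 * ||w - target||^2 (illustration only)."""
    def grad(w: Vector) -> Vector:
        return [wi - ti for wi, ti in zip(w, target)]
    return grad


# Toy stand-ins for the 2D and temporal objectives.
grad_frame = make_quadratic_grad([1.0])
grad_temporal = make_quadratic_grad([3.0])


def grad_total(w: Vector) -> Vector:
    # Combined objective: sum of the 2D and temporal toy losses.
    return [gf + gt for gf, gt in zip(grad_frame(w), grad_temporal(w))]


one_step = boa_step([0.0], grad_frame, grad_total, lr_probe=0.5, lr_update=0.1)

# Streaming-style loop: the same weights are adapted frame after frame.
weights = [0.0]
for _ in range(200):
    weights = boa_step(weights, grad_frame, grad_total, lr_probe=0.5, lr_update=0.1)
```

In a real setting the weight vectors would be network parameters and the gradients would come from automatic differentiation; the point of the sketch is only the ordering of the two steps within one iteration.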