CV · Apr 23

MeshLAM: Feed-Forward One-Shot Animatable Textured Mesh Avatar Reconstruction

arXiv:2604.22865 · 86.1

Predicted impact: top 20% in CV · last 90 days

Originality: Incremental advance

AI Analysis

For computer vision and graphics, this enables fast, one-shot animatable avatar creation without test-time optimization, addressing a key bottleneck in efficiency.

MeshLAM reconstructs high-fidelity, animatable 3D head avatars from a single image in a single forward pass, outperforming state-of-the-art methods in reconstruction quality, animation capability, and computational efficiency.

We introduce MeshLAM, a feed-forward framework for one-shot animatable mesh head reconstruction that generates high-fidelity, animatable 3D head avatars from a single image. Unlike previous work that relies on time-consuming test-time optimization or extensive multi-view data, our method produces complete mesh representations with inherent animatability from a single image in a single forward pass. Our approach employs a dual shape and texture map architecture that simultaneously processes the mesh vertices and the texture map with image features extracted by a shared transformer backbone, allowing for coherent shape carving and appearance modeling. To prevent mesh collapse and ensure topological integrity during feed-forward deformation, we propose an iterative GRU-based decoding mechanism with progressive geometry deformation and texture refinement, coupled with a novel reprojection-based texture guidance mechanism that anchors appearance learning to the input image. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches in reconstruction quality, animation capability, and computational efficiency. Project page at https://meshlam.github.io.
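
The progressive deformation idea in the abstract can be illustrated with a toy loop. The following is only a minimal sketch under stated assumptions, not MeshLAM's implementation: the GRU cell is a generic one with random weights, and the names `TinyGRUCell`, `iterative_deform`, `feats`, and `max_step` are hypothetical. Bounding each step with `tanh` is an illustrative choice for keeping per-iteration offsets small; the abstract does not say how the paper prevents mesh collapse in detail.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyGRUCell:
    """Generic GRU cell with random weights (illustrative, not trained)."""

    def __init__(self, in_dim, hid_dim, rng):
        shape = (in_dim + hid_dim, hid_dim)
        self.Wz = rng.normal(0.0, 0.1, shape)  # update gate weights
        self.Wr = rng.normal(0.0, 0.1, shape)  # reset gate weights
        self.Wh = rng.normal(0.0, 0.1, shape)  # candidate state weights

    def __call__(self, x, h):
        xh = np.concatenate([x, h], axis=-1)
        z = sigmoid(xh @ self.Wz)
        r = sigmoid(xh @ self.Wr)
        h_tilde = np.tanh(np.concatenate([x, r * h], axis=-1) @ self.Wh)
        return (1.0 - z) * h + z * h_tilde


def iterative_deform(template, feats, steps=4, max_step=0.01, seed=0):
    """Refine template vertices (V, 3) using per-vertex features (V, F).

    Each iteration feeds the current vertices and features to the GRU and
    applies a small, bounded offset, so the mesh moves progressively.
    """
    rng = np.random.default_rng(seed)
    hid_dim = 16
    cell = TinyGRUCell(3 + feats.shape[1], hid_dim, rng)
    W_out = rng.normal(0.0, 0.1, (hid_dim, 3))  # hidden state -> xyz offset

    verts = template.copy()
    h = np.zeros((template.shape[0], hid_dim))
    history = [verts.copy()]
    for _ in range(steps):
        x = np.concatenate([verts, feats], axis=-1)
        h = cell(x, h)
        delta = max_step * np.tanh(h @ W_out)  # each coordinate moves < max_step
        verts = verts + delta
        history.append(verts.copy())
    return verts, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    template = rng.normal(size=(100, 3))  # stand-in for a template head mesh
    feats = rng.normal(size=(100, 8))     # stand-in for sampled image features
    verts, history = iterative_deform(template, feats)
    print(verts.shape, len(history))
```

In a trained system the weights would be learned and the features would come from the image backbone; here they are random placeholders so the loop structure can be run and inspected.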
