CV · Jan 5

DiffProxy: Multi-View Human Mesh Recovery via Diffusion-Generated Dense Proxies

arXiv:2601.02267v1 · 1 citation · h-index: 18
Originality: Highly original
AI Analysis

This work addresses accurate human mesh recovery for computer vision applications, particularly in scenarios with occlusions and partial views. It offers a novel method for a known bottleneck.

The paper tackles multi-view human mesh recovery, where imperfect real-world annotations bias training and synthetic data suffers from a domain gap. It proposes DiffProxy, which uses diffusion priors to generate multi-view consistent human proxies, and reports state-of-the-art performance on five real-world benchmarks with strong zero-shot generalization.

Human mesh recovery from multi-view images faces a fundamental challenge: real-world datasets contain imperfect ground-truth annotations that bias the models' training, while synthetic data with precise supervision suffers from domain gap. In this paper, we propose DiffProxy, a novel framework that generates multi-view consistent human proxies for mesh recovery. Central to DiffProxy is leveraging the diffusion-based generative priors to bridge the synthetic training and real-world generalization. Its key innovations include: (1) a multi-conditional mechanism for generating multi-view consistent, pixel-aligned human proxies; (2) a hand refinement module that incorporates flexible visual prompts to enhance local details; and (3) an uncertainty-aware test-time scaling method that increases robustness to challenging cases during optimization. These designs ensure that the mesh recovery process effectively benefits from the precise synthetic ground truth and generative advantages of the diffusion-based pipeline. Trained entirely on synthetic data, DiffProxy achieves state-of-the-art performance across five real-world benchmarks, demonstrating strong zero-shot generalization particularly on challenging scenarios with occlusions and partial views. Project page: https://wrk226.github.io/DiffProxy.html
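The abstract does not spell out how the uncertainty-aware test-time scaling in item (3) is computed. As a rough illustration only, and not the authors' implementation, one generic way to use extra test-time samples is to draw several stochastic predictions of the same proxy, treat their per-element spread as uncertainty, and down-weight uncertain elements in the fitting loss. The function names, the inverse-variance weighting, and the `eps` parameter below are all assumptions made for this sketch.

```python
from statistics import fmean, pvariance


def aggregate_samples(samples, eps=1e-6):
    """Combine K stochastic predictions of the same quantity.

    samples: list of K equally long lists of floats (hypothetical layout,
             e.g. one proxy value per pixel for each of K sampling runs).
    Returns (means, weights): per-element mean, and inverse-variance
    weights normalised to sum to 1.
    """
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same length")

    # Group the K predictions for each element together.
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]

    # Assumption: elements where the samples disagree get a lower weight.
    raw = [1.0 / (v + eps) for v in variances]
    total = sum(raw)
    weights = [r / total for r in raw]
    return means, weights


def weighted_residual(pred, target, weights):
    """Weighted squared error between a fitted value and the proxy mean."""
    return sum(w * (p - t) ** 2 for p, t, w in zip(pred, target, weights))


# Three sampled predictions for two elements: the first is stable,
# the second varies a lot between samples.
samples = [[1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
means, weights = aggregate_samples(samples)
loss = weighted_residual([1.0, 3.0], means, weights)
```

In this toy input the first element is identical across samples, so it receives almost all of the weight, and an error on the second, unstable element contributes very little to the loss.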

