CV Mar 26

Unleashing Guidance Without Classifiers for Human-Object Interaction Animation

Ziyin Wang, Sirui Xu, Chuan Guo, Bing Zhou, Jiangshan Gong, Jian Wang, Yu-Xiong Wang, Liang-Yan Gui

arXiv:2603.25734 77.71 citations h-index: 7

AI Analysis

This work addresses the problem of realistic human-object interaction animation for computer graphics and robotics applications, representing an incremental improvement over prior diffusion-based approaches.

The paper tackles the challenge of generating realistic human-object interaction animations by proposing LIGHT, a data-driven diffusion method that eliminates the need for hand-crafted contact priors through pace-induced guidance, achieving higher contact fidelity and stronger generalization to unseen objects and tasks.

Generating realistic human-object interaction (HOI) animations remains challenging because it requires jointly modeling dynamic human actions and diverse object geometries. Prior diffusion-based approaches often rely on hand-crafted contact priors or human-imposed kinematic constraints to improve contact quality. We propose LIGHT, a data-driven alternative in which guidance emerges from the denoising pace itself, reducing dependence on manually designed priors. Building on diffusion forcing, we factor the representation into modality-specific components and assign individualized noise levels with asynchronous denoising schedules. In this paradigm, cleaner components guide noisier ones through cross-attention, yielding guidance without auxiliary classifiers. We find that this data-driven guidance is inherently contact-aware, and can be enhanced when training is augmented with a broad spectrum of synthetic object geometries, encouraging invariance of contact semantics to geometric diversity. Extensive experiments show that pace-induced guidance more effectively mirrors the benefits of contact priors than conventional classifier-free guidance, while achieving higher contact fidelity, more realistic HOI generation, and stronger generalization to unseen objects and tasks.
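The idea of individualized noise levels with asynchronous denoising schedules can be illustrated with a small sketch. This is not the authors' implementation: the function name `async_noise_schedule`, the linear schedule shape, and the `lags` parameter are all assumptions made for illustration. It only shows how one component can be kept cleaner than another at every step, so that the cleaner one is available to condition the noisier one.

```python
import numpy as np

def async_noise_schedule(num_steps, lags):
    """Per-component noise levels over a shared denoising run (illustrative only).

    Hypothetical setup: every component follows a linear schedule from
    noise level 1 (pure noise) to 0 (clean), but a component with lag L
    only starts denoising once a fraction L of the run has elapsed.
    Each lag must be in [0, 1).

    Returns an array of shape (num_steps + 1, num_components).
    """
    lags = np.asarray(lags, dtype=float)
    # Global progress through the run, as a column vector for broadcasting.
    t = np.linspace(0.0, 1.0, num_steps + 1)[:, None]
    # Local progress of each component: 0 until its lag, then linear up to 1.
    local = np.clip((t - lags) / (1.0 - lags), 0.0, 1.0)
    return 1.0 - local

# Two hypothetical components: the first starts immediately,
# the second waits until half of the run has elapsed.
schedule = async_noise_schedule(num_steps=4, lags=[0.0, 0.5])
print(schedule)
```

In this sketch the first component is never noisier than the second at any step, which is the condition under which a cleaner component could guide a noisier one. Which component should lead, and how the pace gap is chosen, are design decisions this example does not address.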
