CV · GR · Dec 4, 2023

Semantics-aware Motion Retargeting with Vision-Language Models

arXiv:2312.01964v3 · 10 citations · h-index: 12 · CVPR
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of preserving motion semantics in character animation for creators, though the advance appears incremental because it builds on existing vision-language models.

The paper tackles the problem of preserving motion semantics when retargeting motion between animation characters. According to its reported experiments, the proposed method produces high-quality retargeting results while maintaining motion semantics.

Capturing and preserving motion semantics is essential to motion retargeting between animation characters. However, most of the previous works neglect the semantic information or rely on human-designed joint-level representations. Here, we present a novel Semantics-aware Motion reTargeting (SMT) method with the advantage of vision-language models to extract and maintain meaningful motion semantics. We utilize a differentiable module to render 3D motions. Then the high-level motion semantics are incorporated into the motion retargeting process by feeding the vision-language model with the rendered images and aligning the extracted semantic embeddings. To ensure the preservation of fine-grained motion details and high-level semantics, we adopt a two-stage pipeline consisting of skeleton-aware pre-training and fine-tuning with semantics and geometry constraints. Experimental results show the effectiveness of the proposed method in producing high-quality motion retargeting results while accurately preserving motion semantics.
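The abstract describes aligning the semantic embeddings that a vision-language model extracts from rendered source and retargeted motions, but it does not give the loss formula. The following is a minimal sketch of one common way to express such an alignment, assuming one embedding vector per rendered frame and a cosine-distance objective. The function names `cosine_similarity` and `semantic_alignment_loss` are hypothetical and are not taken from the paper; the rendering and embedding steps are out of scope here and the embeddings are assumed to be given as plain lists of floats.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def semantic_alignment_loss(
    source_embeddings: Sequence[Sequence[float]],
    target_embeddings: Sequence[Sequence[float]],
) -> float:
    """Mean cosine distance (1 - cosine similarity) over paired frames.

    Assumption: each list holds one embedding per rendered frame, with the
    source and retargeted motions rendered at matching frames. This is an
    illustrative objective, not the loss defined in the paper.
    """
    if len(source_embeddings) != len(target_embeddings):
        raise ValueError("both motions must have the same number of frames")
    if not source_embeddings:
        raise ValueError("at least one frame is required")
    distances = [
        1.0 - cosine_similarity(src, tgt)
        for src, tgt in zip(source_embeddings, target_embeddings)
    ]
    return sum(distances) / len(distances)
```

Under this sketch the loss is 0 when every pair of embeddings points in the same direction and grows toward 2 as they point in opposite directions. In a training setup it would be computed with a differentiable framework so that gradients can flow back through the renderer to the retargeting network, as the abstract's use of a differentiable rendering module suggests.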
