CVDec 12, 2025

Kinetic Mining in Context: Few-Shot Action Synthesis via Text-to-Motion Distillation

arXiv:2512.11654v2h-index: 5
Originality Incremental advance
AI Analysis

This work addresses the high acquisition cost of annotated motion datasets for HAR researchers, though it is incremental as it builds on existing T2M models with a novel adaptation method.

The paper tackled the problem of generating synthetic motion data for Human Activity Recognition (HAR) by adapting a Text-to-Motion (T2M) model to produce kinematically precise actions, resulting in a +23.1% accuracy improvement in HAR classification.

The acquisition cost for large, annotated motion datasets remains a critical bottleneck for skeletal-based Human Activity Recognition (HAR). Although Text-to-Motion (T2M) generative models offer a compelling, scalable source of synthetic data, their training objectives, which emphasize general artistic motion, and dataset structures fundamentally differ from HAR's requirements for kinematically precise, class-discriminative actions. This disparity creates a significant domain gap, making generalist T2M models ill-equipped for generating motions suitable for HAR classifiers. To address this challenge, we propose KineMIC (Kinetic Mining In Context), a transfer learning framework for few-shot action synthesis. KineMIC adapts a T2M diffusion model to an HAR domain by hypothesizing that semantic correspondences in the text encoding space can provide soft supervision for kinematic distillation. We operationalize this via a kinetic mining strategy that leverages CLIP text embeddings to establish correspondences between sparse HAR labels and T2M source data. This process guides fine-tuning, transforming the generalist T2M backbone into a specialized few-shot Action-to-Motion generator. We validate KineMIC using HumanML3D as the source T2M dataset and a subset of NTU RGB+D 120 as the target HAR domain, randomly selecting just 10 samples per action class. Our approach generates significantly more coherent motions, providing a robust data augmentation source that delivers a +23.1% accuracy points improvement. Animated illustrations and supplementary materials are available at https://lucazzola.github.io/publications/kinemic.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes