CV · Apr 10, 2025

On Model and Data Scaling for Skeleton-based Self-Supervised Gait Recognition

arXiv:2504.07598v1 · 2 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work provides practical insights into resource allocation and performance estimation for real-world gait recognition systems, addressing a domain-specific problem in computer vision biometrics.

The authors conduct an empirical study of scaling in skeleton-based self-supervised gait recognition, quantifying how data quantity, model size, and compute affect performance. They find predictable power-law improvements with increased scale, and that data and compute scaling significantly influence downstream accuracy.

Gait recognition from video streams is a challenging problem in computer vision biometrics due to the subtle differences between gaits and numerous confounding factors. Recent advancements in self-supervised pretraining have led to the development of robust gait recognition models that are invariant to walking covariates. While neural scaling laws have transformed model development in other domains by linking performance to data, model size, and compute, their applicability to gait remains unexplored. In this work, we conduct the first empirical scaling study on skeleton-based self-supervised gait recognition to quantify the effect of data quantity, model size, and compute on downstream gait recognition performance. We pretrain multiple variants of GaitPT, a transformer-based architecture, on a dataset of 2.7 million walking sequences collected in the wild. We evaluate zero-shot performance across four benchmark datasets to derive scaling laws for data, model size, and compute. Our findings demonstrate predictable power-law improvements in performance with increased scale and confirm that data and compute scaling significantly influence downstream accuracy. We further isolate architectural contributions by comparing GaitPT with GaitFormer under controlled compute budgets. These results provide practical insights into resource allocation and performance estimation for real-world gait recognition systems.
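To make the idea of a power-law scaling fit concrete, here is a minimal sketch in Python. It assumes a generic form, error = a * x^(-b), where x is a scale variable such as the number of pretraining sequences; this form, the function names `fit_power_law` and `predict`, and all numeric values are illustrative assumptions, not the paper's actual parameterization or results. The data points are synthetic and generated from a known law so that the fit can be checked.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**(-b) by linear least squares in log-log space.

    Returns (a, b). Assumes every x and y is strictly positive.
    """
    # log y = log a - b * log x, so a degree-1 fit gives slope = -b.
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return float(np.exp(intercept)), float(-slope)


def predict(x, a, b):
    """Evaluate the fitted power law at scale(s) x."""
    return a * np.asarray(x, dtype=float) ** (-b)


# Synthetic, noise-free illustration (NOT results from the paper):
# hypothetical dataset sizes and errors drawn from a known power law.
sizes = np.array([1e4, 1e5, 1e6, 2.7e6])
true_a, true_b = 5.0, 0.25
errors = true_a * sizes ** (-true_b)

a_hat, b_hat = fit_power_law(sizes, errors)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")

# Extrapolate to a larger, hypothetical dataset size.
print(f"predicted error at 1e7 sequences: {predict(1e7, a_hat, b_hat):.4f}")
```

Fitting in log-log space turns the power law into a straight line, which is why such trends are usually plotted on logarithmic axes. With real measurements the fit would be noisy, and an additive irreducible-error term may be needed; that variant would call for a nonlinear fit instead of `np.polyfit`.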

