Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance
Pose Splatter addresses the need for scalable analysis of animal behavior in research, enabling large-scale, longitudinal studies at high resolution. The contribution appears incremental in that it builds on existing 3D Gaussian splatting techniques.
The authors address the problem of quantifying animal pose and appearance accurately and at scale for behavior studies. They propose Pose Splatter, a framework that models complete pose and appearance without prior knowledge of animal geometry, per-frame optimization, or manual annotations. The method yields better low-dimensional pose embeddings than state-of-the-art methods, as evaluated by humans, and generalizes to unseen data.
Accurate and scalable quantification of animal pose and appearance is crucial for studying behavior. Current 3D pose estimation techniques, such as keypoint- and mesh-based methods, often face challenges including limited representational detail, labor-intensive annotation requirements, and expensive per-frame optimization. These limitations hinder the study of subtle movements and can make large-scale analyses impractical. We propose Pose Splatter, a novel framework leveraging shape carving and 3D Gaussian splatting to model the complete pose and appearance of laboratory animals without prior knowledge of animal geometry, per-frame optimization, or manual annotations. We also propose a novel rotation-invariant visual embedding technique for encoding pose and appearance, designed to be a plug-in replacement for 3D keypoint data in downstream behavioral analyses. Experiments on datasets of mice, rats, and zebra finches show that Pose Splatter learns accurate 3D animal geometries. Notably, Pose Splatter represents subtle variations in pose, provides better low-dimensional pose embeddings than state-of-the-art methods as evaluated by humans, and generalizes to unseen data. By eliminating annotation and per-frame optimization bottlenecks, Pose Splatter enables the analysis of large-scale, longitudinal behavior needed to map genotype, neural activity, and micro-behavior at unprecedented resolution.
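
For readers unfamiliar with shape carving, the general idea can be illustrated with a minimal sketch of silhouette-based carving (a visual hull): a candidate 3D point is kept only if it projects inside the animal's silhouette in every camera view. This is a generic textbook version, not the authors' implementation. The function name `carve_visual_hull`, the toy orthographic cameras, and the 8x8 square masks are all assumptions made for illustration.

```python
import numpy as np


def carve_visual_hull(masks, projections, grid_points):
    """Return a boolean array marking grid points inside every silhouette.

    masks:        list of (H, W) boolean silhouette images, one per camera
    projections:  list of (3, 4) camera projection matrices
    grid_points:  (N, 3) array of candidate 3D points
    """
    n = len(grid_points)
    homogeneous = np.hstack([grid_points, np.ones((n, 1))])  # (N, 4)
    occupied = np.ones(n, dtype=bool)
    for mask, P in zip(masks, projections):
        uvw = homogeneous @ P.T                      # (N, 3) homogeneous pixels
        in_front = uvw[:, 2] > 1e-9                  # ignore points behind the camera
        safe_w = np.where(in_front, uvw[:, 2], 1.0)  # avoid division by zero
        u = np.round(uvw[:, 0] / safe_w).astype(int)  # pixel column
        v = np.round(uvw[:, 1] / safe_w).astype(int)  # pixel row
        h, w = mask.shape
        in_image = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(n, dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]
        occupied &= hit                              # carve away points missed by this view
    return occupied


# Toy setup (hypothetical): two orthographic cameras and square silhouettes.
axis = np.arange(8)
grid_points = np.stack(
    np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1
).reshape(-1, 3)

square = np.zeros((8, 8), dtype=bool)
square[2:6, 2:6] = True

# Camera looking along z: pixel (u, v) = (x, y).
P_along_z = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 0, 1.0]])
# Camera looking along x: pixel (u, v) = (z, y).
P_along_x = np.array([[0, 0, 1.0, 0], [0, 1.0, 0, 0], [0, 0, 0, 1.0]])

occupied = carve_visual_hull([square, square], [P_along_z, P_along_x], grid_points)
carved_points = grid_points[occupied]
```

In this toy case the two square silhouettes carve the grid down to a cube. With real data, the masks would come from a segmentation step and the projection matrices from camera calibration, and the carved volume would serve only as a coarse starting shape.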