Towards Texture- And Shape-Independent 3D Keypoint Estimation in Birds
This work addresses pose estimation for birds, particularly pigeons and potentially other species, but it is incremental because it builds directly on an existing framework.
The paper tackles 3D keypoint estimation in birds by extending the 3D-MuPPET framework with a texture-independent segmentation method that uses silhouettes to estimate 2D keypoints. The approach achieves accuracy comparable to the original texture-dependent framework and shows promising preliminary results on four other bird species without fine-tuning.
In this paper, we present a texture-independent approach to estimate and track 3D joint positions of multiple pigeons. For this purpose, we build upon the existing 3D-MuPPET framework, which estimates and tracks the 3D poses of up to 10 pigeons using a multi-view camera setup. We extend this framework by using a segmentation method that generates silhouettes of the individuals, which are then used to estimate 2D keypoints. Following 3D-MuPPET, these 2D keypoints are triangulated to infer 3D poses, and identities are matched in the first frame and tracked in 2D across subsequent frames. Our proposed texture-independent approach achieves accuracy comparable to that of the original texture-dependent 3D-MuPPET framework. Additionally, we explore our approach's applicability to other bird species. To this end, we infer the 2D joint positions of four bird species without additionally fine-tuning the model trained on pigeons and obtain promising preliminary results. Thus, we think that our approach serves as a solid foundation and can inspire the development of more robust and accurate texture-independent pose estimation frameworks.
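The triangulation step mentioned above, in which 2D keypoints from several calibrated views are combined into one 3D position, can be illustrated with a minimal sketch. The text does not say which triangulation method is used, so the linear (DLT) formulation below is an assumption chosen for illustration, not necessarily the authors' implementation. The camera matrices `P1` and `P2`, the intrinsics `K`, and the helper names `triangulate_point` and `project` are hypothetical.

```python
import numpy as np

def triangulate_point(projections, points_2d):
    """Linear (DLT) triangulation of one keypoint seen in N >= 2 views.

    projections: list of 3x4 camera projection matrices (assumed known
                 from calibration).
    points_2d:   list of (u, v) pixel coordinates, one per view.
    Returns the 3D point as a length-3 array.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point into a view with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical two-camera setup: shared intrinsics, second camera shifted along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# Synthetic keypoint: project a known 3D point, then recover it from the 2D views.
X_true = np.array([0.1, -0.2, 3.0])
obs = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_point([P1, P2], obs)
```

In a multi-view pipeline of the kind described, a routine like this would be applied per keypoint and per individual once the 2D detections have been matched across views. With noisy detections, the linear estimate is commonly used as the starting point for a reprojection-error refinement.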