AROS: Affordance Recognition with One-Shot Human Stances
AROS addresses the challenge of quickly adapting to new affordances in 3D environments for robotics and AR/VR applications, though the contribution appears incremental, since it builds on existing one-shot and pose-based methods.
The paper tackles affordance recognition in 3D scenes with a one-shot learning approach that leverages human poses, and reports outperforming data-intensive baselines by up to 80% in crowdsourced evaluations.
We present AROS, a one-shot learning approach that uses an explicit representation of interactions between highly articulated human poses and 3D scenes. The approach is one-shot in that it does not require re-training to add new affordance instances. Furthermore, only one or a small handful of examples of the target pose are needed to describe the interaction. Given a 3D mesh of a previously unseen scene, we can predict affordance locations that support the interactions and generate corresponding articulated 3D human bodies around them. We evaluate on three public datasets of scans of real environments with varied degrees of noise. A rigorous statistical analysis of crowdsourced evaluations shows that our one-shot approach outperforms data-intensive baselines by up to 80%.
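To make the "no re-training" property concrete, the following is a toy sketch, not the AROS descriptor itself. It assumes a deliberately simple stand-in representation: a handful of sample points around the body, each storing its distance to the nearest scene point in the single example. The class `OneShotAffordances`, its methods, and the distance-based score are all hypothetical names and choices introduced only for illustration. Adding a new affordance amounts to storing one more descriptor, and candidate locations in an unseen scene are ranked by how closely they reproduce the stored distances.

```python
import numpy as np


def nearest_distances(samples, scene_points):
    """Distance from each sample point to its nearest scene point (brute force)."""
    diffs = samples[:, None, :] - scene_points[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


class OneShotAffordances:
    """Hypothetical registry: one stored example per affordance, no training step."""

    def __init__(self):
        self.descriptors = {}

    def add(self, name, body_samples, example_scene):
        # A single example is enough: record how far each body sample
        # lies from the example scene (pose origin at the world origin).
        ref = nearest_distances(body_samples, example_scene)
        self.descriptors[name] = (body_samples, ref)

    def score(self, name, scene_points, location):
        # Lower is better: mean mismatch between stored and observed distances
        # when the pose is translated to `location` in the new scene.
        body_samples, ref = self.descriptors[name]
        observed = nearest_distances(body_samples + location, scene_points)
        return float(np.abs(observed - ref).mean())

    def best_location(self, name, scene_points, candidates):
        scores = [self.score(name, scene_points, c) for c in candidates]
        return candidates[int(np.argmin(scores))]


# Example: a flat "seat" patch sampled on a 7x7 grid at height z = 0.
g = np.linspace(-0.3, 0.3, 7)
seat = np.array([[x, y, 0.0] for x in g for y in g])
sit_samples = np.array([[0.0, 0.0, 0.05], [0.0, 0.0, 0.5], [0.1, 0.0, 0.05]])

registry = OneShotAffordances()
registry.add("sit", sit_samples, seat)

# Unseen scene: the same seat moved elsewhere; three candidate locations.
new_scene = seat + np.array([2.0, 1.0, 0.4])
candidates = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 0.4], [2.0, 1.0, 1.0]])
best = registry.best_location("sit", new_scene, candidates)
```

The sketch only handles translation and ignores rotation, collisions, and body articulation, all of which a real system would need to address. Its purpose is limited to showing why a descriptor-matching formulation lets new affordances be added from one example without any re-training.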