Pose Embeddings: A Deep Architecture for Learning to Match Human Poses
The work addresses human pose comparison for computer vision applications, but its contribution is incremental: it builds on existing embedding and triplet-based methods.
The paper tackles the problem of comparing human poses in images by learning an embedding that places similar poses nearby, which avoids the challenges of estimating body joint positions, and it demonstrates the method's potential on pose matching and retrieval from video data.
We present a method for learning an embedding that places images of humans in similar poses nearby. This embedding can be used as a direct method of comparing images based on human pose, avoiding potential challenges of estimating body joint positions. Pose embedding learning is formulated under a triplet-based distance criterion. A deep architecture is used to allow learning of a representation capable of making distinctions between different poses. Experiments on human pose matching and retrieval from video data demonstrate the potential of the method.
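To make the triplet-based distance criterion concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a hinge-style triplet loss with squared Euclidean distance and a margin of 1.0; the abstract does not specify the exact loss form, distance, or margin. The function names `sq_dist`, `triplet_hinge_loss`, and `retrieve` are hypothetical, and the embeddings are plain lists of floats standing in for the output of the deep network.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value; not given in the source
) -> float:
    """Hinge triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the similar-pose embedding (positive) is closer
    to the anchor than the dissimilar-pose embedding (negative) by at
    least `margin`.
    """
    return max(0.0, margin + sq_dist(anchor, positive) - sq_dist(anchor, negative))


def retrieve(
    query: Sequence[float], gallery: List[Sequence[float]], k: int = 3
) -> List[int]:
    """Indices of the k gallery embeddings nearest to the query embedding."""
    order = sorted(range(len(gallery)), key=lambda i: sq_dist(query, gallery[i]))
    return order[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings: the positive is near the anchor, the negative is far.
    print(triplet_hinge_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0]))
    # Pose retrieval as nearest-neighbor search in the embedding space.
    print(retrieve([0.9, 0.0], [[0.0, 0.0], [5.0, 5.0], [1.0, 0.0]], k=2))
```

Under these assumptions, training would push the loss toward zero over many triplets, and retrieval from video reduces to a nearest-neighbor search over frame embeddings, with no explicit joint positions involved.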