CV AI Dec 2, 2024

Object Agnostic 3D Lifting in Space and Time

arXiv:2412.01166v2 h-index: 43DV
Originality: Incremental advance
AI Analysis

This work addresses 3D reconstruction from 2D videos across multiple object categories without category-specific training. Its advance is incremental: it combines temporal consistency with object-agnostic lifting.

The paper lifts 2D keypoints from video sequences to 3D in a category-agnostic manner and reports state-of-the-art performance on per-frame and per-sequence metrics for a variety of animal categories.

We present a spatio-temporal perspective on category-agnostic 3D lifting of 2D keypoints over a temporal sequence. Our approach differs from existing state-of-the-art methods that are either: (i) object-agnostic, but can only operate on individual frames, or (ii) can model space-time dependencies, but are only designed to work with a single object category. Our approach is grounded in two core principles. First, general information about similar objects can be leveraged to achieve better performance when there is little object-specific training data. Second, a temporally-proximate context window is advantageous for achieving consistency throughout a sequence. These two principles allow us to outperform current state-of-the-art methods on per-frame and per-sequence metrics for a variety of animal categories. Lastly, we release a new synthetic dataset containing 3D skeletons and motion sequences for a variety of animal categories.
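To make the idea of a temporally-proximate context window concrete, here is a minimal sketch of how a sequence of 2D keypoints could be sliced into per-frame windows before being passed to a lifting model. This is an illustration under stated assumptions, not the authors' implementation: the function name `context_windows`, the `radius` parameter, the `(frames, keypoints, 2)` array layout, and the edge-padding choice at sequence boundaries are all hypothetical.

```python
import numpy as np


def context_windows(keypoints_2d, radius):
    """Gather, for every frame t, the frames t - radius .. t + radius.

    keypoints_2d: array of shape (frames, keypoints, 2) with 2D coordinates.
    radius: number of neighbouring frames taken on each side (assumed parameter).
    Returns an array of shape (frames, 2 * radius + 1, keypoints, 2).
    """
    keypoints_2d = np.asarray(keypoints_2d, dtype=float)
    if keypoints_2d.ndim != 3 or keypoints_2d.shape[-1] != 2:
        raise ValueError("expected shape (frames, keypoints, 2)")
    if radius < 0:
        raise ValueError("radius must be non-negative")

    num_frames = keypoints_2d.shape[0]

    # Assumption: repeat the first and last frame so that frames near the
    # sequence boundaries still receive a full-length window.
    padded = np.pad(
        keypoints_2d, ((radius, radius), (0, 0), (0, 0)), mode="edge"
    )

    # Row t of `index` lists the padded positions t .. t + 2 * radius,
    # which correspond to original frames t - radius .. t + radius.
    offsets = np.arange(2 * radius + 1)
    index = np.arange(num_frames)[:, None] + offsets[None, :]

    return padded[index]
```

With `radius=1`, a 5-frame sequence of 3 keypoints yields an array of shape `(5, 3, 3, 2)`, where the middle slot of each window is the frame being lifted. A model that consumes such windows sees the neighbouring frames of each target frame, which is the kind of context the abstract argues helps consistency across a sequence.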

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
