PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments
PACE gives computer vision researchers a new benchmark for evaluating pose estimation methods in cluttered scenarios, but the contribution is incremental: it focuses on dataset creation rather than algorithmic innovation.
The authors tackle pose estimation in cluttered environments by introducing PACE, a large-scale dataset with 55K real-world frames and 258K annotations, along with a simulated version, and use it to benchmark state-of-the-art methods and reveal the challenges they face.
We introduce PACE (Pose Annotations in Cluttered Environments), a large-scale benchmark designed to advance the development and evaluation of pose estimation methods in cluttered scenarios. PACE provides a large-scale real-world benchmark for both instance-level and category-level settings. The benchmark consists of 55K frames with 258K annotations across 300 videos, covering 238 objects from 43 categories and featuring a mix of rigid and articulated items in cluttered scenes. To annotate the real-world data efficiently, we develop an innovative annotation system with a calibrated 3-camera setup. Additionally, we offer PACE-Sim, which contains 100K photo-realistic simulated frames with 2.4M annotations across 931 objects. We test state-of-the-art algorithms on PACE along two tracks, pose estimation and object pose tracking, revealing the benchmark's challenges and research opportunities. Our benchmark code and data are available at https://github.com/qq456cvb/PACE.
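To put the headline counts in perspective, the short sketch below derives a few averages from the figures quoted in the abstract. The dictionary names and the `ratio` helper are hypothetical, introduced only for this illustration. Because the abstract reports rounded counts ("55K", "258K", "2.4M"), the results are approximate.

```python
# Headline counts as quoted in the abstract (rounded figures: K = thousand, M = million).
# The names PACE_REAL, PACE_SIM and ratio are illustrative, not part of the PACE codebase.
PACE_REAL = {
    "frames": 55_000,
    "annotations": 258_000,
    "videos": 300,
    "objects": 238,
    "categories": 43,
}
PACE_SIM = {
    "frames": 100_000,
    "annotations": 2_400_000,
    "objects": 931,
}


def ratio(total: float, units: float) -> float:
    """Average number of `total` items per unit."""
    return total / units


# Average pose annotations per frame (a rough proxy for scene clutter).
real_density = ratio(PACE_REAL["annotations"], PACE_REAL["frames"])
sim_density = ratio(PACE_SIM["annotations"], PACE_SIM["frames"])

# Average video length in frames and average objects per category (real data).
frames_per_video = ratio(PACE_REAL["frames"], PACE_REAL["videos"])
objects_per_category = ratio(PACE_REAL["objects"], PACE_REAL["categories"])

print(f"real: {real_density:.2f} annotations per frame")
print(f"sim:  {sim_density:.2f} annotations per frame")
print(f"real: {frames_per_video:.0f} frames per video")
print(f"real: {objects_per_category:.2f} objects per category")
```

Under these rounded figures, the real-world split averages roughly 4.7 annotations per frame and PACE-Sim averages 24, so the simulated scenes are annotated considerably more densely per frame.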