Pavel Burget

h-index: 7

5 papers

63 citations

Novelty: 36%

AI Score: 41

Ranked #66,300 of 194,257 authors (top 34%); #22,669 in CV (top 38%)

5 Papers

6.5 CV Aug 15, 2024 Code

Comparative Evaluation of 3D Reconstruction Methods for Object Pose Estimation

Varun Burde, Assia Benbihi, Pavel Burget et al.

Object pose estimation is essential to many industrial applications involving robotic manipulation, navigation, and augmented reality. Current generalizable object pose estimators, i.e., approaches that do not need to be trained per object, rely on accurate 3D models. Predominantly, CAD models are used, which can be hard to obtain in practice. At the same time, it is often possible to acquire images of an object. Naturally, this leads to the question of whether 3D models reconstructed from images are sufficient to facilitate accurate object pose estimation. We aim to answer this question by proposing a novel benchmark for measuring the impact of 3D reconstruction quality on pose estimation accuracy. Our benchmark provides calibrated images for object reconstruction registered with the test images of the YCB-V dataset for pose evaluation under the BOP benchmark format. Detailed experiments with multiple state-of-the-art 3D reconstruction and object pose estimation approaches show that the geometry produced by modern reconstruction methods is often sufficient for accurate pose estimation. Our experiments lead to interesting observations: (1) Standard metrics for measuring 3D reconstruction quality are not necessarily indicative of pose estimation accuracy, which shows the need for dedicated benchmarks such as ours. (2) Classical, non-learning-based approaches can perform on par with modern learning-based reconstruction techniques and can even offer a better reconstruction time-pose accuracy tradeoff. (3) There is still a sizable gap between performance with reconstructed and with CAD models. To foster research on closing this gap, our benchmark is publicly available at https://github.com/VarunBurde/reconstruction_pose_benchmark.

2.2 RO Feb 19

Benchmarking the Effects of Object Pose Estimation and Reconstruction on Robotic Grasping Success

Varun Burde, Pavel Burget, Torsten Sattler

3D reconstruction serves as the foundational layer for numerous robotic perception tasks, including 6D object pose estimation and grasp pose generation. Modern 3D reconstruction methods for objects can produce visually and geometrically impressive meshes from multi-view images, yet standard geometric evaluations do not reflect how reconstruction quality influences downstream tasks such as robotic manipulation performance. This paper addresses this gap by introducing a large-scale, physics-based benchmark that evaluates 6D pose estimators and 3D mesh models based on their functional efficacy in grasping. We analyze the impact of model fidelity by generating grasps on various reconstructed 3D meshes and executing them on the ground-truth model, simulating how grasp poses generated with an imperfect model affect interaction with the real object. This assesses the combined impact of pose error, grasp robustness, and geometric inaccuracies from 3D reconstruction. Our results show that reconstruction artifacts significantly decrease the number of grasp pose candidates but have a negligible effect on grasping performance given an accurately estimated pose. Our results also reveal that the relationship between grasp success and pose error is dominated by spatial error, and even a simple translation error provides insight into the success of the grasping pose of symmetric objects. This work provides insight into how perception systems relate to object manipulation using robots.
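The abstract above relates grasp success to pose error and singles out translation error. As a minimal sketch of how these two error components are commonly computed (this is not the authors' evaluation code), the following assumes poses given as a 3x3 rotation matrix in nested lists plus a translation vector; the function names are hypothetical.

```python
import math


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_gt)))


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))
```

Note that for symmetric objects the rotation error above is ambiguous, because several rotations describe the same visible pose, while the translation error remains well defined.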

3.7 CV Oct 17, 2024

Object Pose Estimation Using Implicit Representation For Transparent Objects

Varun Burde, Artem Moroz, Vit Zeman et al.

Object pose estimation is a prominent task in computer vision. The object pose gives the orientation and translation of the object in real-world space, which enables applications such as manipulation and augmented reality. Objects interact with light in different ways, for example through reflection and absorption, which makes it challenging to infer an object's structure from the RGB and depth channels. Recent research has been moving toward learning-based methods, which use deep learning to provide a more flexible and generalizable approach to object pose estimation. One such approach is the render-and-compare method, which renders the object from multiple views and compares the renderings against the given 2D image; this often requires an object representation in the form of a CAD model. We reason that the synthetic texture of the CAD model may not be ideal for rendering and comparing operations. We show that if the object is represented implicitly as a Neural Radiance Field (NeRF), the renderings are more realistic and retain the crucial spatial features, which makes the comparison more versatile. We evaluated our NeRF implementation of the render-and-compare method on datasets of transparent objects and found that it surpassed the current state-of-the-art results.

3.6 CV Nov 16, 2025

OPFormer: Object Pose Estimation leveraging foundation model with geometric encoding

Artem Moroz, Vít Zeman, Martin Mikšík et al.

We introduce a unified, end-to-end framework that seamlessly integrates object detection and pose estimation with a versatile onboarding process. Our pipeline begins with an onboarding stage that generates object representations from either traditional 3D CAD models or, in their absence, by rapidly reconstructing a high-fidelity neural representation (NeRF) from multi-view images. Given a test image, our system first employs the CNOS detector to localize target objects. For each detection, our novel pose estimation module, OPFormer, infers the precise 6D pose. The core of OPFormer is a transformer-based architecture that leverages a foundation model for robust feature extraction. It uniquely learns a comprehensive object representation by jointly encoding multiple template views and enriches these features with explicit 3D geometric priors using Normalized Object Coordinate Space (NOCS). A decoder then establishes robust 2D-3D correspondences to determine the final pose. Evaluated on the challenging BOP benchmarks, our integrated system demonstrates a strong balance between accuracy and efficiency, showcasing its practical applicability in both model-based and model-free scenarios.
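The abstract above mentions geometric priors expressed in Normalized Object Coordinate Space (NOCS). As a rough sketch of what such a normalization can look like, the following maps object-frame vertices into the unit cube, assuming the common NOCS convention in which the tight bounding box is centered at 0.5 and its diagonal is scaled to length 1. The abstract does not state which convention OPFormer uses, and the function name `to_nocs` is hypothetical.

```python
import math


def to_nocs(vertices):
    """Map 3D object-frame vertices into the unit cube, NOCS-style.

    Assumed convention: the tight axis-aligned bounding box is centered
    at (0.5, 0.5, 0.5) and uniformly scaled so its diagonal has length 1.
    """
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    diagonal = math.sqrt(sum((hi - lo) ** 2 for lo, hi in zip(mins, maxs)))
    if diagonal == 0.0:
        raise ValueError("degenerate object: all vertices coincide")
    return [
        tuple((v[i] - center[i]) / diagonal + 0.5 for i in range(3))
        for v in vertices
    ]
```

Because every normalized coordinate lies in [0, 1], it can be stored per pixel like an RGB value, which is what makes dense 2D-3D correspondences convenient to predict.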

2.9 RO Feb 16, 2018 Code

Energy Optimization of Robotic Cells

Libor Bukata, Přemysl Šůcha, Zdeněk Hanzálek et al.

This study focuses on the energy optimization of industrial robotic cells, which is essential for sustainable production in the long term. A holistic approach that considers a robotic cell as a whole toward minimizing energy consumption is proposed. The mathematical model, which takes into account various robot speeds, positions, power-saving modes, and alternative orders of operations, can be transformed into a mixed-integer linear programming formulation that is, however, suitable only for small instances. To optimize complex robotic cells, a hybrid heuristic accelerated by using multicore processors and the Gurobi simplex method for piecewise linear convex functions is implemented. The experimental results showed that the heuristic solved 93 % of instances with a solution quality close to a proven lower bound. Moreover, compared with the existing works, which typically address problems with three to four robots, this study solved real-size problem instances with up to 12 robots and considered more optimization aspects. The proposed algorithms were also applied on an existing robotic cell in Škoda Auto. The outcomes, based on simulations and measurements, indicate that, compared with the previous state (at maximal robot speeds and without deeper power-saving modes), the energy consumption can be reduced by about 20 % merely by optimizing the robot speeds and applying power-saving modes. All the software and generated datasets used in this research are publicly available.