Anders Glent Buch

CV
10papers
1,006citations
Novelty54%
AI Score29

10 Papers

CVSep 20, 2022
Ki-Pode: Keypoint-based Implicit Pose Distribution Estimation of Rigid Objects

Thorbjørn Mosekjær Iversen, Rasmus Laurvig Haugaard, Anders Glent Buch

The estimation of 6D poses of rigid objects is a fundamental problem in computer vision. Traditionally pose estimation is concerned with the determination of a single best estimate. However, a single estimate is unable to express visual ambiguity, which in many cases is unavoidable due to object symmetries or occlusion of identifying features. Inability to account for ambiguities in pose can lead to failure in subsequent methods, which is unacceptable when the cost of failure is high. Estimates of full pose distributions are, contrary to single estimates, well suited for expressing uncertainty on pose. Motivated by this, we propose a novel pose distribution estimation method. An implicit formulation of the probability distribution over object pose is derived from an intermediary representation of an object as a set of keypoints. This ensures that the pose distribution estimates have a high level of interpretability. Furthermore, our method is based on conservative approximations, which leads to reliable estimates. The method has been evaluated on the task of rotation distribution estimation on the YCB-V and T-LESS datasets and performs reliably on all objects.

CVMar 2, 2022
ParaPose: Parameter and Domain Randomization Optimization for Pose Estimation using Synthetic Data

Frederik Hagelskjaer, Anders Glent Buch

Pose estimation is the task of determining the 6D position of an object in a scene. Pose estimation aid the abilities and flexibility of robotic set-ups. However, the system must be configured towards the use case to perform adequately. This configuration is time-consuming and limits the usability of pose estimation and, thereby, robotic systems. Deep learning is a method to overcome this configuration procedure by learning parameters directly from the dataset. However, obtaining this training data can also be very time-consuming. The use of synthetic training data avoids this data collection problem, but a configuration of the training procedure is necessary to overcome the domain gap problem. Additionally, the pose estimation parameters also need to be configured. This configuration is jokingly known as grad student descent as parameters are manually adjusted until satisfactory results are obtained. This paper presents a method for automatic configuration using only synthetic data. This is accomplished by learning the domain randomization during network training, and then using the domain randomization to optimize the pose estimation parameters. The developed approach shows state-of-the-art performance of 82.0 % recall on the challenging OCCLUSION dataset, outperforming all previous methods with a large margin. These results prove the validity of automatic set-up of pose estimation using purely synthetic data.

CVNov 26, 2021
SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings

Rasmus Laurvig Haugaard, Anders Glent Buch

We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data with no prior knowledge of visual ambiguities like symmetry. We also present a new method for 6D pose estimation of rigid objects using the learnt distributions to sample, score and refine pose hypotheses. The correspondence distributions are learnt with a contrastive loss, represented in object-specific latent spaces by an encoder-decoder query model and a small fully connected key model. Our method is unsupervised with respect to visual ambiguities, yet we show that the query- and key models learn to represent accurate multi-modal surface distributions. Our pose estimation method improves the state-of-the-art significantly on the comprehensive BOP Challenge, trained purely on synthetic data, even compared with methods trained on real data. The project site is at https://surfemb.github.io/ .

CVNov 17, 2020
Bridging the Reality Gap for Pose Estimation Networks using Sensor-Based Domain Randomization

Frederik Hagelskjaer, Anders Glent Buch

Since the introduction of modern deep learning methods for object pose estimation, test accuracy and efficiency has increased significantly. For training, however, large amounts of annotated training data are required for good performance. While the use of synthetic training data prevents the need for manual annotation, there is currently a large performance gap between methods trained on real and synthetic data. This paper introduces a new method, which bridges this gap. Most methods trained on synthetic data use 2D images, as domain randomization in 2D is more developed. To obtain precise poses, many of these methods perform a final refinement using 3D data. Our method integrates the 3D data into the network to increase the accuracy of the pose estimation. To allow for domain randomization in 3D, a sensor-based data augmentation has been developed. Additionally, we introduce the SparseEdge feature, which uses a wider search space during point cloud propagation to avoid relying on specific features without increasing run-time. Experiments on three large pose estimation benchmarks show that the presented method outperforms previous methods trained on synthetic data and achieves comparable results to existing methods trained on real data.

RONov 12, 2020
Fast robust peg-in-hole insertion with continuous visual servoing

Rasmus Laurvig Haugaard, Jeppe Langaa, Christoffer Sloth et al.

This paper demonstrates a visual servoing method which is robust towards uncertainties related to system calibration and grasping, while significantly reducing the peg-in-hole time compared to classical methods and recent attempts based on deep learning. The proposed visual servoing method is based on peg and hole point estimates from a deep neural network in a multi-cam setup, where the model is trained on purely synthetic data. Empirical results show that the learnt model generalizes to the real world, allowing for higher success rates and lower cycle times than existing approaches.

CVDec 19, 2019
PointVoteNet: Accurate Object Detection and 6 DoF Pose Estimation in Point Clouds

Frederik Hagelskjær, Anders Glent Buch

We present a learning-based method for 6 DoF pose estimation of rigid objects in point cloud data. Many recent learning-based approaches use primarily RGB information for detecting objects, in some cases with an added refinement step using depth data. Our method consumes unordered point sets with/without RGB information, from initial detection to the final transformation estimation stage. This allows us to achieve accurate pose estimates, in some cases surpassing state of the art methods trained on the same data.

CVAug 24, 2018
BOP: Benchmark for 6D Object Pose Estimation

Tomas Hodan, Frank Michel, Eric Brachmann et al.

We propose a benchmark for 6D pose estimation of a rigid object from a single RGB-D input image. The training data consists of a texture-mapped 3D object model or images of the object in known 6D poses. The benchmark comprises of: i) eight datasets in a unified format that cover different practical scenarios, including two new datasets focusing on varying lighting conditions, ii) an evaluation methodology with a pose-error function that deals with pose ambiguities, iii) a comprehensive evaluation of 15 diverse recent methods that captures the status quo of the field, and iv) an online evaluation system that is open for continuous submission of new results. The evaluation shows that methods based on point-pair features currently perform best, outperforming template matching methods, learning-based methods and methods based on 3D local features. The project website is available at bop.felk.cvut.cz.

CVSep 7, 2017
Rotational Subgroup Voting and Pose Clustering for Robust 3D Object Recognition

Anders Glent Buch, Lilita Kiforenko, Dirk Kraft

It is possible to associate a highly constrained subset of relative 6 DoF poses between two 3D shapes, as long as the local surface orientation, the normal vector, is available at every surface point. Local shape features can be used to find putative point correspondences between the models due to their ability to handle noisy and incomplete data. However, this correspondence set is usually contaminated by outliers in practical scenarios, which has led to many past contributions based on robust detectors such as the Hough transform or RANSAC. The key insight of our work is that a single correspondence between oriented points on the two models is constrained to cast votes in a 1 DoF rotational subgroup of the full group of poses, SE(3). Kernel density estimation allows combining the set of votes efficiently to determine a full 6 DoF candidate pose between the models. This modal pose with the highest density is stable under challenging conditions, such as noise, clutter, and occlusions, and provides the output estimate of our method. We first analyze the robustness of our method in relation to noise and show that it handles high outlier rates much better than RANSAC for the task of 6 DoF pose estimation. We then apply our method to four state of the art data sets for 3D object recognition that contain occluded and cluttered scenes. Our method achieves perfect recall on two LIDAR data sets and outperforms competing methods on two RGB-D data sets, thus setting a new standard for general 3D object recognition using point cloud data.

CVAug 23, 2017
In search of inliers: 3d correspondence by local and global voting

Anders Glent Buch, Yang Yang, Norbert Krüger et al.

We present a method for finding correspondence between 3D models. From an initial set of feature correspondences, our method uses a fast voting scheme to separate the inliers from the outliers. The novelty of our method lies in the use of a combination of local and global constraints to determine if a vote should be cast. On a local scale, we use simple, low-level geometric invariants. On a global scale, we apply covariant constraints for finding compatible correspondences. We guide the sampling for collecting voters by downward dependencies on previous voting stages. All of this together results in an accurate matching procedure. We evaluate our algorithm by controlled and comparative testing on different datasets, giving superior performance compared to state of the art methods. In a final experiment, we apply our method for 3D object detection, showing potential use of our method within higher-level vision.

CVAug 23, 2017
Pose Estimation using Local Structure-Specific Shape and Appearance Context

Anders Glent Buch, Dirk Kraft, Joni-Kristian Kamarainen et al.

We address the problem of estimating the alignment pose between two models using structure-specific local descriptors. Our descriptors are generated using a combination of 2D image data and 3D contextual shape data, resulting in a set of semi-local descriptors containing rich appearance and shape information for both edge and texture structures. This is achieved by defining feature space relations which describe the neighborhood of a descriptor. By quantitative evaluations, we show that our descriptors provide high discriminative power compared to state of the art approaches. In addition, we show how to utilize this for the estimation of the alignment pose between two point sets. We present experiments both in controlled and real-life scenarios to validate our approach.