Rasmus Laurvig Haugaard

CV
7 papers
244 citations
Novelty: 61%
AI Score: 32

7 Papers

CV · Sep 20, 2022
Ki-Pode: Keypoint-based Implicit Pose Distribution Estimation of Rigid Objects

Thorbjørn Mosekjær Iversen, Rasmus Laurvig Haugaard, Anders Glent Buch

The estimation of 6D poses of rigid objects is a fundamental problem in computer vision. Traditionally, pose estimation is concerned with the determination of a single best estimate. However, a single estimate is unable to express visual ambiguity, which in many cases is unavoidable due to object symmetries or occlusion of identifying features. Inability to account for ambiguities in pose can lead to failure in subsequent methods, which is unacceptable when the cost of failure is high. Estimates of full pose distributions are, contrary to single estimates, well suited for expressing uncertainty in pose. Motivated by this, we propose a novel pose distribution estimation method. An implicit formulation of the probability distribution over object pose is derived from an intermediary representation of an object as a set of keypoints. This ensures that the pose distribution estimates have a high level of interpretability. Furthermore, our method is based on conservative approximations, which leads to reliable estimates. The method has been evaluated on the task of rotation distribution estimation on the YCB-V and T-LESS datasets and performs reliably on all objects.

CV · Oct 3, 2022
Multi-view object pose estimation from correspondence distributions and epipolar geometry

Rasmus Laurvig Haugaard, Thorbjørn Mosekjær Iversen

In many automation tasks involving manipulation of rigid objects, the poses of the objects must be acquired. Vision-based pose estimation using a single RGB or RGB-D sensor is especially popular due to its broad applicability. However, single-view pose estimation is inherently limited by depth ambiguity and ambiguities imposed by various phenomena like occlusion, self-occlusion, reflections, etc. Aggregation of information from multiple views can potentially resolve these ambiguities, but the current state-of-the-art multi-view pose estimation method only uses multiple views to aggregate single-view pose estimates, and thus relies on obtaining good single-view estimates. We present a multi-view pose estimation method which aggregates learned 2D-3D distributions from multiple views for both the initial estimate and optional refinement. Our method performs probabilistic sampling of 3D-3D correspondences under epipolar constraints using learned 2D-3D correspondence distributions which are implicitly trained to respect visual ambiguities such as symmetry. Evaluation on the T-LESS dataset shows that our method reduces pose estimation errors by 80-91% compared to the best single-view method, and we present state-of-the-art results on T-LESS with four views, even compared with methods using five and eight views.
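
As a rough illustration of the epipolar constraint mentioned in this abstract, the sketch below checks the textbook two-view relation x2^T E x1 = 0 with E = [t]_x R. It is not the paper's sampling procedure; the relative pose, the 3D point, and the function names are hypothetical and chosen only for the example.

```python
import numpy as np

def skew(t):
    """Cross-product matrix, so that skew(t) @ x equals np.cross(t, x)."""
    return np.array([
        [0.0, -t[2], t[1]],
        [t[2], 0.0, -t[0]],
        [-t[1], t[0], 0.0],
    ])

def essential_matrix(R, t):
    """Essential matrix for the relative pose X2 = R @ X1 + t."""
    return skew(t) @ R

def epipolar_residual(x1, x2, R, t):
    """x2^T E x1 for homogeneous normalized image points; zero for a true match."""
    return float(x2 @ essential_matrix(R, t) @ x1)

# Hypothetical relative pose between two views: pure translation of 0.5 along x.
R = np.eye(3)
t = np.array([0.5, 0.0, 0.0])

# Hypothetical 3D point, expressed in the coordinates of each camera.
X1 = np.array([0.2, -0.1, 2.0])
X2 = R @ X1 + t

# Normalized image coordinates (divide by depth).
x1 = X1 / X1[2]
x2 = X2 / X2[2]

residual = epipolar_residual(x1, x2, R, t)
# Moving the second observation off its epipolar line breaks the constraint.
mismatch = epipolar_residual(x1, x2 + np.array([0.0, 0.1, 0.0]), R, t)
```

A residual near zero indicates that two image points can be projections of the same 3D point, which is the kind of geometric consistency a multi-view method can use to restrict which cross-view correspondences are plausible.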

CV · Sep 10, 2024
Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences

Shishir Reddy Vutukur, Rasmus Laurvig Haugaard, Junwen Huang et al.

Object pose distribution estimation is crucial in robotics for better path planning and handling of symmetric objects. Recent distribution estimation approaches employ contrastive learning, maximizing the likelihood of a single pose estimate in the absence of a CAD model. We propose a pose distribution estimation method leveraging symmetry-respecting correspondence distributions and shape information obtained using a CAD model. Contrastive learning-based approaches require an exhaustive amount of training images from different viewpoints to learn the distribution properly, which is not possible in realistic scenarios. Instead, we propose a pipeline that can leverage correspondence distributions and shape information from the CAD model, which are later used to learn pose distributions. In addition, having access to a pose distribution based on correspondences before learning pose distributions conditioned on images can help formulate the loss between distributions. Prior knowledge of the distribution also helps the network focus on obtaining sharper modes. With the CAD prior, our approach converges much faster and learns the distribution better by focusing on learning a sharper distribution near all the valid modes, unlike contrastive approaches, which focus on a single mode at a time. We achieve benchmark results on the SYMSOL-I and T-LESS datasets.

CV · Mar 9, 2023
SpyroPose: SE(3) Pyramids for Object Pose Distribution Estimation

Rasmus Laurvig Haugaard, Frederik Hagelskjær, Thorbjørn Mosekjær Iversen

Object pose estimation is a core computer vision problem and often an essential component in robotics. Pose estimation is usually approached by seeking the single best estimate of an object's pose, but this approach is ill-suited for tasks involving visual ambiguity. In such cases it is desirable to estimate the uncertainty as a pose distribution to allow downstream tasks to make informed decisions. Pose distributions can have arbitrary complexity, which motivates estimating unparameterized distributions. However, until now they have only been used for orientation estimation on SO(3) due to the difficulty of training on and normalizing over SE(3). We propose a novel method for pose distribution estimation on SE(3). We use a hierarchical grid, a pyramid, which enables efficient importance sampling during training and sparse evaluation of the pyramid at inference, allowing real-time 6D pose distribution estimation. Our method outperforms state-of-the-art methods on SO(3), and, to the best of our knowledge, we provide the first quantitative results on pose distribution estimation on SE(3). Code will be available at spyropose.github.io

CV · Mar 28, 2023
KeyMatchNet: Zero-Shot Pose Estimation in 3D Point Clouds by Generalized Keypoint Matching

Frederik Hagelskjær, Rasmus Laurvig Haugaard

In this paper, we present KeyMatchNet, a novel network for zero-shot pose estimation in 3D point clouds. Our method uses only depth information, making it more applicable for many industrial use cases, as color information is seldom available. The network is composed of two parallel components for computing object and scene features. The features are then combined to create matches used for pose estimation. The parallel structure allows for pre-processing of the individual parts, which decreases the run-time. Using a zero-shot network allows for a very short set-up time, as it is not necessary to train models for new objects. However, as the network is not trained for the specific object, zero-shot pose estimation methods generally have lower accuracy compared with conventional methods. To address this, we reduce the complexity of the task by including the scenario information during training. This is typically not feasible, as collecting real data for new tasks drastically increases the cost. However, for zero-shot pose estimation, training for new objects is not necessary and the expensive data collection can thus be performed only once. Our method is trained on 1,500 objects and is only tested on unseen objects. We demonstrate that the trained network can not only accurately estimate poses for novel objects, but can also handle objects outside of the trained class. Test results are also shown on real data. We believe that the presented method is valuable for many real-world scenarios. Project page available at keymatchnet.github.io

CV · Nov 26, 2021
SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings

Rasmus Laurvig Haugaard, Anders Glent Buch

We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data with no prior knowledge of visual ambiguities like symmetry. We also present a new method for 6D pose estimation of rigid objects using the learnt distributions to sample, score and refine pose hypotheses. The correspondence distributions are learnt with a contrastive loss, represented in object-specific latent spaces by an encoder-decoder query model and a small fully connected key model. Our method is unsupervised with respect to visual ambiguities, yet we show that the query and key models learn to represent accurate multi-modal surface distributions. Our pose estimation method improves the state-of-the-art significantly on the comprehensive BOP Challenge, trained purely on synthetic data, even compared with methods trained on real data. The project site is at https://surfemb.github.io/ .
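
The query/key formulation in this abstract can be pictured with a minimal sketch, assuming the distribution over sampled surface points is a softmax of query-key dot products and the contrastive loss is the negative log-likelihood of the true point (an InfoNCE-style loss). The embeddings below are hand-picked toy values, not network outputs, and the function names are hypothetical; the abstract does not specify these details.

```python
import numpy as np

def correspondence_distribution(query, keys):
    """Softmax over query-key dot products: one probability per surface point."""
    logits = keys @ query              # shape (N,)
    logits = logits - logits.max()     # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

def contrastive_loss(query, keys, positive_index):
    """Negative log-likelihood of the true surface point among all keys."""
    probs = correspondence_distribution(query, keys)
    return float(-np.log(probs[positive_index]))

# Toy embeddings (assumed): surface points 0 and 1 are symmetric counterparts
# and share a key embedding, while point 2 lies elsewhere on the object.
keys = np.array([
    [2.0, 0.0],
    [2.0, 0.0],
    [-2.0, 0.0],
])
query = np.array([1.0, 0.0])           # embedding of one image pixel

probs = correspondence_distribution(query, keys)
loss = contrastive_loss(query, keys, positive_index=0)
```

In this toy setup the two symmetric points receive equal probability, which shows how such a distribution can be multi-modal without symmetry labels: the loss cannot tell the two points apart, so the mass is shared between them.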

RO · Nov 12, 2020
Fast robust peg-in-hole insertion with continuous visual servoing

Rasmus Laurvig Haugaard, Jeppe Langaa, Christoffer Sloth et al.

This paper demonstrates a visual servoing method which is robust towards uncertainties related to system calibration and grasping, while significantly reducing the peg-in-hole time compared to classical methods and recent attempts based on deep learning. The proposed visual servoing method is based on peg and hole point estimates from a deep neural network in a multi-cam setup, where the model is trained on purely synthetic data. Empirical results show that the learnt model generalizes to the real world, allowing for higher success rates and lower cycle times than existing approaches.