RO | Jan 16, 2023 | Code
Swarm-SLAM: Sparse Decentralized Collaborative Simultaneous Localization and Mapping Framework for Multi-Robot Systems
Pierre-Yves Lajoie, Giovanni Beltrame
Collaborative Simultaneous Localization And Mapping (C-SLAM) is a vital component for successful multi-robot operations in environments without an external positioning system, such as indoors, underground or underwater. In this paper, we introduce Swarm-SLAM, an open-source C-SLAM system that is designed to be scalable, flexible, decentralized, and sparse, which are all key properties in swarm robotics. Our system supports inertial, lidar, stereo, and RGB-D sensing, and it includes a novel inter-robot loop closure prioritization technique that reduces communication and accelerates convergence. We evaluated our ROS-2 implementation on five different datasets, and in a real-world experiment with three robots communicating through an ad-hoc network. Our code is publicly available: https://github.com/MISTLab/Swarm-SLAM
CV | Mar 8, 2022 | Code
Self-Supervised Domain Calibration and Uncertainty Estimation for Place Recognition
Pierre-Yves Lajoie, Giovanni Beltrame
Visual place recognition techniques based on deep learning, which have established themselves as the state of the art in recent years, do not generalize well to environments visually different from the training set. Thus, to achieve top performance, it is sometimes necessary to fine-tune the networks to the target environment. To this end, we propose a self-supervised domain calibration procedure that uses robust pose graph optimization from Simultaneous Localization and Mapping (SLAM) as the supervision signal, without requiring GPS or manual labeling. Moreover, we leverage the procedure to improve uncertainty estimation for place recognition matches, which is important in safety-critical applications. We show that our approach can improve the performance of a state-of-the-art technique on a target environment dissimilar from its training set and that we can obtain uncertainty estimates. We believe that this approach will help practitioners to deploy robust place recognition solutions in real-world applications. Our code is publicly available: https://github.com/MISTLab/vpr-calibration-and-uncertainty
CV | Sep 24, 2024
Frequency-based View Selection in Gaussian Splatting Reconstruction
Monica M. Q. Li, Pierre-Yves Lajoie, Giovanni Beltrame
Three-dimensional reconstruction is a fundamental problem in robotics perception. We examine the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. Although 3D Gaussian Splatting has made significant progress in image rendering and 3D reconstruction, the quality of the reconstruction is strongly impacted by the selection of 2D images and the estimation of camera poses through Structure-from-Motion (SfM) algorithms. Current view selection methods that rely directly on uncertainties from occlusions, depth ambiguities, or neural network predictions are insufficient to handle the issue and struggle to generalize to new scenes. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints without ground truth data. By overcoming current constraints on model architecture and efficacy, our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.
RO | Apr 1 | Code
Compact Keyframe-Optimized Multi-Agent Gaussian Splatting SLAM
Monica M. Q. Li, Pierre-Yves Lajoie, Jialiang Liu et al.
Efficient multi-agent 3D mapping is essential for robotic teams operating in unknown environments, but dense representations hinder real-time exchange over constrained communication links. In multi-agent Simultaneous Localization and Mapping (SLAM), systems typically rely on a centralized server to merge and optimize the local maps produced by individual agents. However, sharing these large map representations, particularly those generated by recent methods such as Gaussian Splatting, becomes a bottleneck in real-world scenarios with limited bandwidth. We present an improved multi-agent RGB-D Gaussian Splatting SLAM framework that reduces communication load while preserving map fidelity. First, we incorporate a compaction step into our SLAM system to remove redundant 3D Gaussians without degrading the rendering quality. Second, our approach performs centralized loop closure computation without an initial guess, operating in two modes: a pure rendered-depth mode that requires no data beyond the 3D Gaussians, and a camera-depth mode that includes lightweight depth images for improved registration accuracy and additional Gaussian pruning. Evaluation on both synthetic and real-world datasets shows up to 85-95% reduction in transmitted data compared to state-of-the-art approaches in both modes, bringing 3D Gaussian multi-agent SLAM closer to practical deployment in real-world scenarios. Code: https://github.com/lemonci/coko-slam
LG | Dec 1, 2025
Fantastic Features and Where to Find Them: A Probing Method to combine Features from Multiple Foundation Models
Benjamin Ramtoula, Pierre-Yves Lajoie, Paul Newman et al.
Foundation models (FMs) trained with different objectives and data learn diverse representations, making some more effective than others for specific downstream tasks. Existing adaptation strategies, such as parameter-efficient fine-tuning, focus on individual models and do not exploit the complementary strengths across models. Probing methods offer a promising alternative by extracting information from frozen models, but current techniques do not scale well with large feature sets and often rely on dataset-specific hyperparameter tuning. We propose Combined backBones (ComBo), a simple and scalable probing-based adapter that effectively integrates features from multiple models and layers. ComBo compresses activations from layers of one or more FMs into compact token-wise representations and processes them with a lightweight transformer for task-specific prediction. Crucially, ComBo does not require dataset-specific tuning or backpropagation through the backbone models. However, not all models are equally relevant for all tasks. To address this, we introduce a mechanism that leverages ComBo's joint multi-backbone probing to efficiently evaluate each backbone's task-relevance, enabling both practical model comparison and improved performance through selective adaptation. On the 19 tasks of the VTAB-1k benchmark, ComBo outperforms previous probing methods, matches or surpasses more expensive alternatives, such as distillation-based model merging, and enables efficient probing of tuned models. Our results demonstrate that ComBo offers a practical and general-purpose framework for combining diverse representations from multiple FMs.
RO | Sep 26, 2019 | Code
DOOR-SLAM: Distributed, Online, and Outlier Resilient SLAM for Robotic Teams
Pierre-Yves Lajoie, Benjamin Ramtoula, Yun Chang et al.
To achieve collaborative tasks, robots in a team need to have a shared understanding of the environment and their location within it. Distributed Simultaneous Localization and Mapping (SLAM) offers a practical solution to localize the robots without relying on an external positioning system (e.g. GPS) and with minimal information exchange. Unfortunately, current distributed SLAM systems are vulnerable to perception outliers and therefore tend to use very conservative parameters for inter-robot place recognition. However, being too conservative comes at the cost of rejecting many valid loop closure candidates, which results in less accurate trajectory estimates. This paper introduces DOOR-SLAM, a fully distributed SLAM system with an outlier rejection mechanism that can work with less conservative parameters. DOOR-SLAM is based on peer-to-peer communication and does not require full connectivity among the robots. DOOR-SLAM includes two key modules: a pose graph optimizer combined with a distributed pairwise consistent measurement set maximization algorithm to reject spurious inter-robot loop closures; and a distributed SLAM front-end that detects inter-robot loop closures without exchanging raw sensor data. The system has been evaluated in simulations, benchmarking datasets, and field experiments, including tests in GPS-denied subterranean environments. DOOR-SLAM produces more inter-robot loop closures, successfully rejects outliers, and results in accurate trajectory estimates, while requiring low communication bandwidth. Full source code is available at https://github.com/MISTLab/DOOR-SLAM.git.
SP | Dec 26, 2023
Device-Free Human State Estimation using UWB Multi-Static Radios
Saria Al Laham, Bobak H. Baghi, Pierre-Yves Lajoie et al.
We present a human state estimation framework that allows us to estimate the location, and even the activities, of people in an indoor environment without requiring them to carry a specific device. To achieve this "device-free" localization, we use a small number of low-cost Ultra-Wide Band (UWB) sensors distributed across the environment of interest. To achieve high-quality estimation from UWB signals that are merely reflected off people in the environment, we exploit a deep network that can learn to make inferences. The hardware setup consists of commercial off-the-shelf (COTS) single-antenna UWB modules for sensing, paired with Raspberry Pi units for computational processing and data transfer. We make use of the channel impulse response (CIR) measurements from the UWB sensors to estimate the human state, comprising location and activity, in a given area. Additionally, we can estimate the number of humans that occupy this region of interest. In our approach, we first pre-process the CIR data, which involves aggregating measurements and extracting key statistics. Afterwards, we leverage a convolutional deep neural network to map the CIRs into precise location estimates with sub-30 cm accuracy. Similarly, we achieve accurate human activity recognition and occupancy counting results. We show that we can quickly fine-tune our model for new out-of-distribution users, a process that requires only a few minutes of data and a few epochs of training. Our results show that UWB is a promising solution for adaptable smart-home localization and activity recognition problems.
RO | Sep 29, 2021
DORA: Distributed Online Risk-Aware Explorer
David Vielfaure, Samuel Arseneault, Pierre-Yves Lajoie et al.
Exploration of unknown environments is an important challenge in the field of robotics. While a single robot can achieve this task alone, evidence suggests it could be accomplished more efficiently by groups of robots, with advantages in terms of terrain coverage as well as robustness to failures. Exploration can be guided through belief maps, which provide probabilistic information about which part of the terrain is interesting to explore (either based on risk management or reward). This process can be centrally coordinated by building a collective belief map on a common server. However, relying on a central processing station creates a communication bottleneck and a single point of failure for the system. In this paper, we present Distributed Online Risk-Aware (DORA) Explorer, an exploration system that leverages decentralized information sharing to update a common risk belief map. DORA Explorer allows a group of robots to explore an unknown environment discretized as a 2D grid with obstacles, with high coverage while minimizing exposure to risk, effectively reducing robot failures.
RO | Aug 18, 2021
Towards Collaborative Simultaneous Localization and Mapping: a Survey of the Current Research Landscape
Pierre-Yves Lajoie, Benjamin Ramtoula, Fang Wu et al.
Motivated by the tremendous progress we witnessed in recent years, this paper presents a survey of the scientific literature on the topic of Collaborative Simultaneous Localization and Mapping (C-SLAM), also known as multi-robot SLAM. With fleets of self-driving cars on the horizon and the rise of multi-robot systems in industrial applications, we believe that Collaborative SLAM will soon become a cornerstone of future robotic applications. In this survey, we introduce the basic concepts of C-SLAM and present a thorough literature review. We also outline the major challenges and limitations of C-SLAM in terms of robustness, communication, and resource management. We conclude by exploring the area's current trends and promising research avenues.
RO | Oct 27, 2018
Modeling Perceptual Aliasing in SLAM via Discrete-Continuous Graphical Models
Pierre-Yves Lajoie, Siyi Hu, Giovanni Beltrame et al.
Perceptual aliasing is one of the main causes of failure for Simultaneous Localization and Mapping (SLAM) systems operating in the wild. Perceptual aliasing is the phenomenon where different places generate a similar visual (or, in general, perceptual) footprint. This causes spurious measurements to be fed to the SLAM estimator, which typically results in incorrect localization and mapping results. The problem is exacerbated by the fact that those outliers are highly correlated, in the sense that perceptual aliasing creates a large number of mutually-consistent outliers. Another issue stems from the fact that most state-of-the-art techniques rely on a given trajectory guess (e.g., from odometry) to discern between inliers and outliers and this makes the resulting pipeline brittle, since the accumulation of error may result in incorrect choices and recovery from failures is far from trivial. This work provides a unified framework to model perceptual aliasing in SLAM and provides practical algorithms that can cope with outliers without relying on any initial guess. We present two main contributions. The first is a Discrete-Continuous Graphical Model (DC-GM) for SLAM: the continuous portion of the DC-GM captures the standard SLAM problem, while the discrete portion describes the selection of the outliers and models their correlation. The second contribution is a semidefinite relaxation to perform inference in the DC-GM that returns estimates with provable sub-optimality guarantees. Experimental results on standard benchmarking datasets show that the proposed technique compares favorably with state-of-the-art methods while not relying on an initial guess for optimization.