CVDec 7, 2022Code
Domain generalization of 3D semantic segmentation in autonomous drivingJules Sanchez, Jean-Emmanuel Deschaud, Francois Goulette
Using deep learning, 3D autonomous driving semantic segmentation has become a well-studied subject, with methods that can reach very high performance. Nonetheless, because of the limited size of the training datasets, these models cannot see every type of object and scene found in real-world applications. The ability to be reliable in these various unknown environments is called \textup{domain generalization}. Despite its importance, domain generalization is relatively unexplored in the case of 3D autonomous driving semantic segmentation. To fill this gap, this paper presents the first benchmark for this application by testing state-of-the-art methods and discussing the difficulty of tackling Laser Imaging Detection and Ranging (LiDAR) domain shifts. We also propose the first method designed to address this domain generalization, which we call 3DLabelProp. This method relies on leveraging the geometry and sequentiality of the LiDAR data to enhance its generalization performances by working on partially accumulated point clouds. It reaches a mean Intersection over Union (mIoU) of 50.4% on SemanticPOSS and of 55.2% on PandaSet solid-state LiDAR while being trained only on SemanticKITTI, making it the state-of-the-art method for generalization (+5% and +33% better, respectively, than the second best method). The code for this method is available on GitHub: https://github.com/JulesSanchez/3DLabelProp.
CVAug 2, 2023Code
MDT3D: Multi-Dataset Training for LiDAR 3D Object Detection GeneralizationLouis Soum-Fontez, Jean-Emmanuel Deschaud, François Goulette
Supervised 3D Object Detection models have been displaying increasingly better performance in single-domain cases where the training data comes from the same environment and sensor as the testing data. However, in real-world scenarios data from the target domain may not be available for finetuning or for domain adaptation methods. Indeed, 3D object detection models trained on a source dataset with a specific point distribution have shown difficulties in generalizing to unseen datasets. Therefore, we decided to leverage the information available from several annotated source datasets with our Multi-Dataset Training for 3D Object Detection (MDT3D) method to increase the robustness of 3D object detection models when tested in a new environment with a different sensor configuration. To tackle the labelling gap between datasets, we used a new label mapping based on coarse labels. Furthermore, we show how we managed the mix of datasets during training and finally introduce a new cross-dataset augmentation method: cross-dataset object injection. We demonstrate that this training paradigm shows improvements for different types of 3D object detection models. The source code and additional results for this research project will be publicly available on GitHub for interested parties to access and utilize: https://github.com/LouisSF/MDT3D
CVAug 6, 2024
RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View SynthesisHugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic
Differentiable volumetric rendering-based methods made significant progress in novel view synthesis. On one hand, innovative methods have replaced the Neural Radiance Fields (NeRF) network with locally parameterized structures, enabling high-quality renderings in a reasonable time. On the other hand, approaches have used differentiable splatting instead of NeRF's ray casting to optimize radiance fields rapidly using Gaussian kernels, allowing for fine adaptation to the scene. However, differentiable ray casting of irregularly spaced kernels has been scarcely explored, while splatting, despite enabling fast rendering times, is susceptible to clearly visible artifacts. Our work closes this gap by providing a physically consistent formulation of the emitted radiance c and density σ, decomposed with Gaussian functions associated with Spherical Gaussians/Harmonics for all-frequency colorimetric representation. We also introduce a method enabling differentiable ray casting of irregularly distributed Gaussians using an algorithm that integrates radiance fields slab by slab and leverages a BVH structure. This allows our approach to finely adapt to the scene while avoiding splatting artifacts. As a result, we achieve superior rendering quality compared to the state-of-the-art while maintaining reasonable training times and achieving inference speeds of 25 FPS on the Blender dataset. Project page with videos and code: https://raygauss.github.io/
CVNov 6, 2023
COLA: COarse-LAbel multi-source LiDAR semantic segmentation for autonomous drivingJules Sanchez, Jean-Emmanuel Deschaud, François Goulette
LiDAR semantic segmentation for autonomous driving has been a growing field of interest in recent years. Datasets and methods have appeared and expanded very quickly, but methods have not been updated to exploit this new data availability and rely on the same classical datasets. Different ways of performing LIDAR semantic segmentation training and inference can be divided into several subfields, which include the following: domain generalization, source-to-source segmentation, and pre-training. In this work, we aim to improve results in all of these subfields with the novel approach of multi-source training. Multi-source training relies on the availability of various datasets at training time. To overcome the common obstacles in multi-source training, we introduce the coarse labels and call the newly created multi-source dataset COLA. We propose three applications of this new dataset that display systematic improvement over single-source strategies: COLA-DG for domain generalization (+10%), COLA-S2S for source-to-source segmentation (+5.3%), and COLA-PT for pre-training (+12%). We demonstrate that multi-source approaches bring systematic improvement over single-source approaches.
CVOct 25, 2023
ParisLuco3D: A high-quality target dataset for domain generalization of LiDAR perceptionJules Sanchez, Louis Soum-Fontez, Jean-Emmanuel Deschaud et al.
LiDAR is an essential sensor for autonomous driving by collecting precise geometric information regarding a scene. %Exploiting this information for perception is interesting as the amount of available data increases. As the performance of various LiDAR perception tasks has improved, generalizations to new environments and sensors has emerged to test these optimized models in real-world conditions. This paper provides a novel dataset, ParisLuco3D, specifically designed for cross-domain evaluation to make it easier to evaluate the performance utilizing various source datasets. Alongside the dataset, online benchmarks for LiDAR semantic segmentation, LiDAR object detection, and LiDAR tracking are provided to ensure a fair comparison across methods. The ParisLuco3D dataset, evaluation scripts, and links to benchmarks can be found at the following website:https://npm3d.fr/parisluco3d
CVMar 17, 2022
Unsigned Distance Field as an Accurate 3D Scene Representation for Neural Scene CompletionJean Pierre Richa, Jean-Emmanuel Deschaud, François Goulette et al.
Scene Completion is the task of completing missing geometry from a partial scan of a scene. Most previous methods compute an implicit representation from range data using a Truncated Signed Distance Function (T-SDF) computed on a 3D grid as input to neural networks. The truncation decreases but does not remove the border errors introduced by the sign of SDF for open surfaces. As an alternative, we present an Unsigned Distance Function (UDF) as an input representation to scene completion neural networks. The proposed UDF is simple, and efficient as a geometry representation, and can be computed on any point cloud. In contrast to usual Signed Distance Functions, our UDF does not require normal computation. To obtain the explicit geometry, we present a method for extracting a point cloud from discretized UDF values on a sparse grid. We compare different SDFs and UDFs for the scene completion task on indoor and outdoor point clouds collected using RGB-D and LiDAR sensors and show improved completion using the proposed UDF function.
CVOct 8, 2025Code
HARP-NeXt: High-Speed and Accurate Range-Point Fusion Network for 3D LiDAR Semantic SegmentationSamir Abou Haidar, Alexandre Chariot, Mehdi Darouich et al.
LiDAR semantic segmentation is crucial for autonomous vehicles and mobile robots, requiring high accuracy and real-time processing, especially on resource-constrained embedded systems. Previous state-of-the-art methods often face a trade-off between accuracy and speed. Point-based and sparse convolution-based methods are accurate but slow due to the complexity of neighbor searching and 3D convolutions. Projection-based methods are faster but lose critical geometric information during the 2D projection. Additionally, many recent methods rely on test-time augmentation (TTA) to improve performance, which further slows the inference. Moreover, the pre-processing phase across all methods increases execution time and is demanding on embedded platforms. Therefore, we introduce HARP-NeXt, a high-speed and accurate LiDAR semantic segmentation network. We first propose a novel pre-processing methodology that significantly reduces computational overhead. Then, we design the Conv-SE-NeXt feature extraction block to efficiently capture representations without deep layer stacking per network stage. We also employ a multi-scale range-point fusion backbone that leverages information at multiple abstraction levels to preserve essential geometric details, thereby enhancing accuracy. Experiments on the nuScenes and SemanticKITTI benchmarks show that HARP-NeXt achieves a superior speed-accuracy trade-off compared to all state-of-the-art methods, and, without relying on ensemble models or TTA, is comparable to the top-ranked PTv3, while running 24$\times$ faster. The code is available at https://github.com/SamirAbouHaidar/HARP-NeXt
CVNov 22, 2021Code
Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D MappingJean-Emmanuel Deschaud, David Duque, Jean Pierre Richa et al.
Paris-CARLA-3D is a dataset of several dense colored point clouds of outdoor environments built by a mobile LiDAR and camera system. The data are composed of two sets with synthetic data from the open source CARLA simulator (700 million points) and real data acquired in the city of Paris (60 million points), hence the name Paris-CARLA-3D. One of the advantages of this dataset is to have simulated the same LiDAR and camera platform in the open source CARLA simulator as the one used to produce the real data. In addition, manual annotation of the classes using the semantic tags of CARLA was performed on the real data, allowing the testing of transfer methods from the synthetic to the real data. The objective of this dataset is to provide a challenging dataset to evaluate and improve methods on difficult vision tasks for the 3D mapping of outdoor environments: semantic segmentation, instance segmentation, and scene completion. For each task, we describe the evaluation protocol as well as the experiments carried out to establish a baseline.
CVMar 26, 2021Code
3D Point Cloud Registration with Multi-Scale Architecture and Unsupervised Transfer LearningSofiane Horache, Jean-Emmanuel Deschaud, François Goulette
We propose a method for generalizing deep learning for 3D point cloud registration on new, totally different datasets. It is based on two components, MS-SVConv and UDGE. Using Multi-Scale Sparse Voxel Convolution, MS-SVConv is a fast deep neural network that outputs the descriptors from point clouds for 3D registration between two scenes. UDGE is an algorithm for transferring deep networks on unknown datasets in a unsupervised way. The interest of the proposed method appears while using the two components, MS-SVConv and UDGE, together as a whole, which leads to state-of-the-art results on real world registration datasets such as 3DMatch, ETH and TUM. The code is publicly available at https://github.com/humanpose1/MS-SVConv .
ROMar 17, 2021Code
What's in My LiDAR Odometry Toolbox?Pierre Dellenbach, Jean-Emmanuel Deschaud, Bastien Jacquet et al.
With the democratization of 3D LiDAR sensors, precise LiDAR odometries and SLAM are in high demand. New methods regularly appear, proposing solutions ranging from small variations in classical algorithms to radically new paradigms based on deep learning. Yet it is often difficult to compare these methods, notably due to the few datasets on which the methods can be evaluated and compared. Furthermore, their weaknesses are rarely examined, often letting the user discover the hard way whether a method would be appropriate for a use case. In this paper, we review and organize the main 3D LiDAR odometries into distinct categories. We implemented several approaches (geometric based, deep learning based, and hybrid methods) to conduct an in-depth analysis of their strengths and weaknesses on multiple datasets, guiding the reader through the different LiDAR odometries available. Implementation of the methods has been made publicly available at https://github.com/Kitware/pyLiDAR-SLAM.
CVSep 9, 2025
RayGaussX: Accelerating Gaussian-Based Ray Marching for Real-Time and High-Quality Novel View SynthesisHugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic
RayGauss has achieved state-of-the-art rendering quality for novel-view synthesis on synthetic and indoor scenes by representing radiance and density fields with irregularly distributed elliptical basis functions, rendered via volume ray casting using a Bounding Volume Hierarchy (BVH). However, its computational cost prevents real-time rendering on real-world scenes. Our approach, RayGaussX, builds on RayGauss by introducing key contributions that accelerate both training and inference. Specifically, we incorporate volumetric rendering acceleration strategies such as empty-space skipping and adaptive sampling, enhance ray coherence, and introduce scale regularization to reduce false-positive intersections. Additionally, we propose a new densification criterion that improves density distribution in distant regions, leading to enhanced graphical quality on larger scenes. As a result, RayGaussX achieves 5x to 12x faster training and 50x to 80x higher rendering speeds (FPS) on real-world datasets while improving visual quality by up to +0.56 dB in PSNR. Project page with videos and code: https://raygaussx.github.io/.
CVJan 24, 2025
3DLabelProp: Geometric-Driven Domain Generalization for LiDAR Semantic Segmentation in Autonomous DrivingJules Sanchez, Jean-Emmanuel Deschaud, François Goulette
Domain generalization aims to find ways for deep learning models to maintain their performance despite significant domain shifts between training and inference datasets. This is particularly important for models that need to be robust or are costly to train. LiDAR perception in autonomous driving is impacted by both of these concerns, leading to the emergence of various approaches. This work addresses the challenge by proposing a geometry-based approach, leveraging the sequential structure of LiDAR sensors, which sets it apart from the learning-based methods commonly found in the literature. The proposed method, called 3DLabelProp, is applied on the task of LiDAR Semantic Segmentation (LSS). Through extensive experimentation on seven datasets, it is demonstrated to be a state-of-the-art approach, outperforming both naive and other domain generalization methods.
CVOct 31, 2024
HD-OOD3D: Supervised and Unsupervised Out-of-Distribution object detection in LiDAR dataLouis Soum-Fontez, Jean-Emmanuel Deschaud, François Goulette
Autonomous systems rely on accurate 3D object detection from LiDAR data, yet most detectors are limited to a predefined set of known classes, making them vulnerable to unexpected out-of-distribution (OOD) objects. In this work, we present HD-OOD3D, a novel two-stage method for detecting unknown objects. We demonstrate the superiority of two-stage approaches over single-stage methods, achieving more robust detection of unknown objects while addressing key challenges in the evaluation protocol. Furthermore, we conduct an in-depth analysis of the standard evaluation protocol for OOD detection, revealing the critical impact of hyperparameter choices. To address the challenge of scaling the learning of unknown objects, we explore unsupervised training strategies to generate pseudo-labels for unknowns. Among the different approaches evaluated, our experiments show that top-5 auto-labelling offers more promising performance compared to simple resizing techniques.
CVFeb 14, 2022
COLA: COarse LAbel pre-training for 3D semantic segmentation of sparse LiDAR datasetsJules Sanchez, Jean-Emmanuel Deschaud, François Goulette
Transfer learning is a proven technique in 2D computer vision to leverage the large amount of data available and achieve high performance with datasets limited in size due to the cost of acquisition or annotation. In 3D, annotation is known to be a costly task; nevertheless, pre-training methods have only recently been investigated. Due to this cost, unsupervised pre-training has been heavily favored. In this work, we tackle the case of real-time 3D semantic segmentation of sparse autonomous driving LiDAR scans. Such datasets have been increasingly released, but each has a unique label set. We propose here an intermediate-level label set called coarse labels, which can easily be used on any existing and future autonomous driving datasets, thus allowing all the data available to be leveraged at once without any additional manual labeling. This way, we have access to a larger dataset, alongside a simple task of semantic segmentation. With it, we introduce a new pre-training task: coarse label pre-training, also called COLA. We thoroughly analyze the impact of COLA on various datasets and architectures and show that it yields a noticeable performance improvement, especially when only a small dataset is available for the finetuning task.
CVSep 30, 2021
Riedones3D: a celtic coin dataset for registration and fine-grained clusteringSofiane Horache, Jean-Emmanuel Deschaud, François Goulette et al.
Clustering coins with respect to their die is an important component of numismatic research and crucial for understanding the economic history of tribes (especially when literary production does not exist, in celtic culture). It is a very hard task that requires a lot of times and expertise. To cluster thousands of coins, automatic methods are becoming necessary. Nevertheless, public datasets for coin die clustering evaluation are too rare, though they are very important for the development of new methods. Therefore, we propose a new 3D dataset of 2 070 scans of coins. With this dataset, we propose two benchmarks, one for point cloud registration, essential for coin die recognition, and a benchmark of coin die clustering. We show how we automatically cluster coins to help experts, and perform a preliminary evaluation for these two tasks. The code of the baseline and the dataset will be publicly available at https://www.npm3d.fr/coins-riedones3d and https://www.chronocarto.eu/spip.php?article84&lang=fr
ROSep 27, 2021
CT-ICP: Real-time Elastic LiDAR Odometry with Loop ClosurePierre Dellenbach, Jean-Emmanuel Deschaud, Bastien Jacquet et al.
Multi-beam LiDAR sensors are increasingly used in robotics, particularly with autonomous cars for localization and perception tasks, both relying on the ability to build a precise map of the environment. For this, we propose a new real-time LiDAR-only odometry method called CT-ICP (for Continuous-Time ICP), completed into a full SLAM with a novel loop detection procedure. The core of this method, is the introduction of the combined continuity in the scan matching, and discontinuity between scans. It allows both the elastic distortion of the scan during the registration for increased precision, and the increased robustness to high frequency motions from the discontinuity. We build a complete SLAM on top of this odometry, using a fast pure LiDAR loop detection based on elevation image 2D matching, providing a pose graph with loop constraints. To show the robustness of the method, we tested it on seven datasets: KITTI, KITTI-raw, KITTI-360, KITTI-CARLA, ParisLuco, Newer College, and NCLT in driving and high-frequency motion scenarios. Both the CT-ICP odometry and the loop detection are made available online. CT-ICP is currently first, among those giving access to a public code, on the KITTI odometry leaderboard, with an average Relative Translation Error (RTE) of 0.59% and an average time per scan of 60ms on a CPU with a single thread.
CVAug 17, 2021
KITTI-CARLA: a KITTI-like dataset generated by CARLA SimulatorJean-Emmanuel Deschaud
KITTI-CARLA is a dataset built from the CARLA v0.9.10 simulator using a vehicle with sensors identical to the KITTI dataset. The vehicle thus has a Velodyne HDL64 LiDAR positioned in the middle of the roof and two color cameras similar to Point Grey Flea 2. The positions of the LiDAR and cameras are the same as the setup used in KITTI. The objective of this dataset is to test approaches of semantic segmentation LiDAR and/or images, odometry LiDAR and/or image in synthetic data and to compare with the results obtained on real data like KITTI. This dataset thus makes it possible to improve transfer learning methods from a synthetic dataset to a real dataset. We created 7 sequences with 5000 frames in each sequence in the 7 maps of CARLA providing different environments (city, suburban area, mountain, rural area, highway...). The dataset is available at: http://npm3d.fr
CVMay 12, 2020
Automatic clustering of Celtic coins based on 3D point cloud pattern analysisSofiane Horache, Jean-Emmanuel Deschaud, François Goulette et al.
The recognition and clustering of coins which have been struck by the same die is of interest for archeological studies. Nowadays, this work can only be performed by experts and is very tedious. In this paper, we propose a method to automatically cluster dies, based on 3D scans of coins. It is based on three steps: registration, comparison and graph-based clustering. Experimental results on 90 coins coming from a Celtic treasury from the II-Ith century BC show a clustering quality equivalent to expert's work.
CVApr 18, 2019
KPConv: Flexible and Deformable Convolution for Point CloudsHugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud et al.
We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPconv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv.
CVAug 1, 2018
Semantic Classification of 3D Point Clouds with Multiscale Spherical NeighborhoodsHugues Thomas, Jean-Emmanuel Deschaud, Beatriz Marcotegui et al.
This paper introduces a new definition of multiscale neighborhoods in 3D point clouds. This definition, based on spherical neighborhoods and proportional subsampling, allows the computation of features with a consistent geometrical meaning, which is not the case when using k-nearest neighbors. With an appropriate learning strategy, the proposed features can be used in a random forest to classify 3D points. In this semantic classification task, we show that our multiscale features outperform state-of-the-art features using the same experimental conditions. Furthermore, their classification power competes with more elaborate classification approaches including Deep Learning methods.
CVApr 10, 2018
Classification of Point Cloud Scenes with Multiscale Voxel Deep NetworkXavier Roynard, Jean-Emmanuel Deschaud, François Goulette
In this article we describe a new convolutional neural network (CNN) to classify 3D point clouds of urban or indoor scenes. Solutions are given to the problems encountered working on scene point clouds, and a network is described that allows for point classification using only the position of points in a multi-scale neighborhood. On the reduced-8 Semantic3D benchmark [Hackel et al., 2017], this network, ranked second, beats the state of the art of point classification methods (those not using a regularization step).
ROFeb 23, 2018
IMLS-SLAM: scan-to-model matching based on 3D dataJean-Emmanuel Deschaud
The Simultaneous Localization And Mapping (SLAM) problem has been well studied in the robotics community, especially using mono, stereo cameras or depth sensors. 3D depth sensors, such as Velodyne LiDAR, have proved in the last 10 years to be very useful to perceive the environment in autonomous driving, but few methods exist that directly use these 3D data for odometry. We present a new low-drift SLAM algorithm based only on 3D LiDAR data. Our method relies on a scan-to-model matching framework. We first have a specific sampling strategy based on the LiDAR scans. We then define our model as the previous localized LiDAR sweeps and use the Implicit Moving Least Squares (IMLS) surface representation. We show experiments with the Velodyne HDL32 with only 0.40% drift over a 4 km acquisition without any loop closure (i.e., 16 m drift after 4 km). We tested our solution on the KITTI benchmark with a Velodyne HDL64 and ranked among the best methods (against mono, stereo and LiDAR methods) with a global drift of only 0.69%.
LGNov 30, 2017
Paris-Lille-3D: a large and high-quality ground truth urban point cloud dataset for automatic segmentation and classificationXavier Roynard, Jean-Emmanuel Deschaud, François Goulette
This paper introduces a new Urban Point Cloud Dataset for Automatic Segmentation and Classification acquired by Mobile Laser Scanning (MLS). We describe how the dataset is obtained from acquisition to post-processing and labeling. This dataset can be used to learn classification algorithm, however, given that a great attention has been paid to the split between the different objects, this dataset can also be used to learn the segmentation. The dataset consists of around 2km of MLS point cloud acquired in two cities. The number of points and range of classes make us consider that it can be used to train Deep-Learning methods. Besides we show some results of automatic segmentation and classification. The dataset is available at: http://caor-mines-paristech.fr/fr/paris-lille-3d-dataset/
SYMar 4, 2015
Invariant EKF Design for Scan Matching-aided LocalizationMartin Barczyk, Silvère Bonnabel, Jean-Emmanuel Deschaud et al.
Localization in indoor environments is a technique which estimates the robot's pose by fusing data from onboard motion sensors with readings of the environment, in our case obtained by scan matching point clouds captured by a low-cost Kinect depth camera. We develop both an Invariant Extended Kalman Filter (IEKF)-based and a Multiplicative Extended Kalman Filter (MEKF)-based solution to this problem. The two designs are successfully validated in experiments and demonstrate the advantage of the IEKF design.
RODec 3, 2014
Colorisation et texturation temps réel d'environnements urbains par système mobile avec scanner laser et caméra fish-eyeJean-Emmanuel Deschaud, Xavier Brun, François Goulette
We present here a real time mobile mapping system mounted on a vehicle. The terrestrial acquisition system is based on a geolocation system and two sensors, namely, a laser scanner and a camera with a fish-eye lens. We produce 3D colored points cloud and textured models of the environment. Once the system has been calibrated, the data acquisition and processing are done "on the way". This article mainly presents our methods of colorization of point cloud, triangulation and texture mapping.
SYMar 20, 2014
Experimental Implementation of an Invariant Extended Kalman Filter-based Scan Matching SLAMMartin Barczyk, Silvère Bonnabel, Jean-Emmanuel Deschaud et al.
We describe an application of the Invariant Extended Kalman Filter (IEKF) design methodology to the scan matching SLAM problem. We review the theoretical foundations of the IEKF and its practical interest of guaranteeing robustness to poor state estimates, then implement the filter on a wheeled robot hardware platform. The proposed design is successfully validated in experimental testing.