Amine Kacete

4 papers

76 citations

Novelty: 54%

AI Score: 26

Ranked #169,456 of 205,806 authors (top 82%); #51,368 in CV (top 87%)

4 Papers

CV · Sep 16, 2022

TwistSLAM++: Fusing multiple modalities for accurate dynamic semantic SLAM

Mathieu Gonzalez, Eric Marchand, Amine Kacete et al.

Most classical SLAM systems rely on the static scene assumption, which limits their applicability in real-world scenarios. Recent SLAM frameworks have been proposed to simultaneously track the camera and moving objects. However, they are often unable to estimate the canonical pose of the objects and exhibit low object tracking accuracy. To solve this problem, we propose TwistSLAM++, a semantic, dynamic SLAM system that fuses stereo images and LiDAR information. Using semantic information, we track potentially moving objects and associate them with 3D object detections in LiDAR scans to obtain their pose and size. Then, we perform registration on consecutive object scans to refine object pose estimation. Finally, object scans are used to estimate the shape of the object and constrain map points to lie on the estimated surface within the bundle adjustment (BA). We show on classical benchmarks that this fusion approach based on multimodal information improves the accuracy of object tracking.

RO · Feb 24, 2022

TwistSLAM: Constrained SLAM in Dynamic Environment

Mathieu Gonzalez, Eric Marchand, Amine Kacete et al.

Classical visual simultaneous localization and mapping (SLAM) algorithms usually assume the environment to be rigid. This assumption limits the applicability of those algorithms, as they are unable to accurately estimate the camera poses and world structure in real-life scenes containing moving objects (e.g., cars, bikes, pedestrians). To tackle this issue, we propose TwistSLAM, a semantic, dynamic, stereo SLAM system that can track dynamic objects in the environment. Our algorithm creates clusters of points according to their semantic class. Thanks to inter-cluster constraints modeled by mechanical joints (which depend on the semantic class), a novel constrained bundle adjustment jointly estimates both the poses and velocities of moving objects along with the classical world structure and camera trajectory. We evaluate our approach on several sequences from the public KITTI dataset and demonstrate quantitatively that it improves camera and object tracking compared to state-of-the-art approaches.
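
As a rough illustration of what a joint-based constraint can look like, the sketch below restricts a 6-D velocity twist to the motions allowed by a planar joint, such as a car moving on a road plane. The planar joint, the twist ordering (linear part first, then angular), and the function name `project_twist_to_planar_joint` are assumptions made for this example; the abstract does not specify which joints are used or how they enter the bundle adjustment.

```python
def project_twist_to_planar_joint(twist, normal):
    """Project a twist onto the motions permitted by a planar joint.

    twist  : (vx, vy, vz, wx, wy, wz), linear then angular velocity
             (this ordering is an assumption of the sketch).
    normal : unit normal of the supporting plane.

    A planar joint allows translation inside the plane and rotation
    about the plane normal, so the out-of-plane linear velocity and
    the in-plane angular velocity are removed.
    """
    v, w = list(twist[:3]), list(twist[3:])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    v_n = dot(v, normal)  # linear speed along the normal (forbidden)
    w_n = dot(w, normal)  # angular speed about the normal (allowed)

    v_in_plane = [vi - v_n * ni for vi, ni in zip(v, normal)]
    w_about_normal = [w_n * ni for ni in normal]
    return v_in_plane + w_about_normal


# Hypothetical example: a car on a ground plane whose normal is +z.
raw_twist = (1.0, 2.0, 3.0, 0.1, 0.2, 0.3)
constrained = project_twist_to_planar_joint(raw_twist, (0.0, 0.0, 1.0))
print(constrained)
```

In this toy setting the vertical velocity and the roll and pitch rates are discarded, leaving three degrees of freedom. A different semantic class would call for a different joint and therefore a different projection.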

RO · Sep 15, 2021

S3LAM: Structured Scene SLAM

Mathieu Gonzalez, Eric Marchand, Amine Kacete et al.

We propose a new SLAM system that uses the semantic segmentation of objects and structures in the scene. Semantic information is relevant because it carries high-level information that may make SLAM more accurate and robust. Our contribution is twofold: (i) a new SLAM system based on ORB-SLAM2 that creates a semantic map made of clusters of points corresponding to object instances and structures in the scene; (ii) a modification of the classical bundle adjustment formulation that constrains each cluster using geometric priors, which improves both camera localization and reconstruction and enables a better understanding of the scene. We evaluate our approach on sequences from several public datasets and show that it improves camera pose estimation with respect to the state of the art.

CV · Feb 3, 2020

L6DNet: Light 6 DoF Network for Robust and Precise Object Pose Estimation with Small Datasets

Mathieu Gonzalez, Amine Kacete, Albert Murienne et al.

Estimating the 3D pose of an object is a challenging task that arises in augmented reality and robotic applications. In this paper, we propose a novel approach to perform 6 DoF object pose estimation from a single RGB-D image. We adopt a hybrid two-stage pipeline: a data-driven stage followed by a geometric stage. The data-driven stage consists of a classification CNN that estimates the object's 2D location in the image from local patches, followed by a regression CNN trained to predict the 3D location of a set of keypoints in the camera coordinate system. To extract the pose, the geometric stage aligns the 3D points in the camera coordinate system with the corresponding 3D points in the world coordinate system by minimizing a registration error. Our experiments on the standard LineMod dataset show that our approach is more robust and accurate than state-of-the-art methods. We also validate the approach on a 6 DoF positioning task performed by visual servoing.
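
The geometric stage described above, aligning two sets of corresponding 3D points by minimizing a registration error, can be sketched with the classical SVD-based (Kabsch) least-squares solution. This is a minimal illustration under assumptions: the abstract does not say which solver the authors use, and the function name `rigid_registration` and the sample points are hypothetical.

```python
import numpy as np


def rigid_registration(world_pts, cam_pts):
    """Least-squares rigid transform (R, t) with cam ≈ R @ world + t.

    Uses the SVD-based Kabsch solution on corresponding 3D points.
    Both inputs are (N, 3) arrays with N >= 3 non-collinear points.
    """
    world_pts = np.asarray(world_pts, dtype=float)
    cam_pts = np.asarray(cam_pts, dtype=float)

    # Center both point sets on their centroids.
    c_world = world_pts.mean(axis=0)
    c_cam = cam_pts.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (world_pts - c_world).T @ (cam_pts - c_cam)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_cam - R @ c_world
    return R, t


# Hypothetical keypoints: model points and their camera-frame positions.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -1.0, 2.0])
world = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
cam = world @ R_true.T + t_true

R_est, t_est = rigid_registration(world, cam)
residual = np.linalg.norm(cam - (world @ R_est.T + t_est))
print("registration residual:", residual)
```

With noisy keypoint predictions the same closed-form solution returns the best fit in the least-squares sense, and it is often wrapped in an outlier-rejection loop in practice.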