Baosheng Zhang

3 papers

21 citations

Novelty 48%

AI Score 23

Ranked #181,486 of 205,806 authors (top 88%); #54,296 in CV (top 92%)

3 Papers

CV · May 17, 2022

DynPL-SVO: A Robust Stereo Visual Odometry for Dynamic Scenes

Baosheng Zhang, Xiaoguang Ma, Hongjun Ma et al.

Most feature-based stereo visual odometry (SVO) approaches estimate the motion of mobile robots by matching and tracking point features along a sequence of stereo images. However, in dynamic scenes dominated by moving pedestrians, vehicles, and similar objects, there are too few robust static point features for accurate motion estimation, which causes failures when reconstructing robot motion. In this paper, we proposed DynPL-SVO, a complete dynamic SVO method that integrated united cost functions containing information between matched point features and re-projection errors perpendicular and parallel to the direction of the line features. Additionally, we introduced a dynamic grid algorithm to enhance its performance in dynamic scenes. The stereo camera motion was estimated through Levenberg-Marquardt minimization of the re-projection errors of both point and line features. Comprehensive experimental results on the KITTI and EuRoC MAV datasets showed that the accuracy of DynPL-SVO was improved by over 20% on average compared to other state-of-the-art SVO systems, especially in dynamic scenes.
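As a rough illustration of what perpendicular and parallel line re-projection errors can look like, the Python sketch below measures how far a re-projected endpoint lies from an observed 2D line segment. The function name `line_residuals`, the endpoint-based segment representation, and the definition of the parallel error as the overshoot beyond the segment's extent are assumptions made for illustration; the abstract does not give DynPL-SVO's actual formulation.

```python
import math


def line_residuals(p, a, b):
    """Split the error of projected point p against observed segment a-b.

    Hypothetical helper: returns (perpendicular, parallel), where
    perpendicular is the signed distance from p to the infinite line
    through a and b, and parallel is how far p falls outside the
    segment along its direction (0.0 if p projects inside the segment).
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("segment endpoints must be distinct")

    # Unit direction of the observed line feature.
    ux, uy = dx / length, dy / length

    # Vector from the segment start to the projected point.
    vx, vy = p[0] - a[0], p[1] - a[1]

    along = vx * ux + vy * uy          # coordinate along the line
    perpendicular = ux * vy - uy * vx  # signed offset across the line

    if along < 0.0:
        parallel = along               # before the start endpoint
    elif along > length:
        parallel = along - length      # past the end endpoint
    else:
        parallel = 0.0                 # projects inside the segment
    return perpendicular, parallel


# Observed horizontal segment from (0, 0) to (10, 0).
print(line_residuals((12.0, 3.0), (0.0, 0.0), (10.0, 0.0)))
print(line_residuals((5.0, -2.0), (0.0, 0.0), (10.0, 0.0)))
```

In an optimizer such as Levenberg-Marquardt, residuals of this kind would be stacked with the point re-projection errors and minimized jointly over the camera pose; how the terms are weighted is not stated in the abstract.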

CV · Apr 8, 2023

LSGDDN-LCD: An Appearance-based Loop Closure Detection using Local Superpixel Grid Descriptors and Incremental Dynamic Nodes

Baosheng Zhang

Loop Closure Detection (LCD) is an essential component of visual simultaneous localization and mapping (SLAM) systems. It enables the recognition of previously visited scenes to eliminate pose and map estimate drifts arising from long-term exploration. However, current appearance-based LCD methods face significant challenges, including high computational costs, viewpoint variance, and dynamic objects in scenes. This paper introduced an online appearance-based LCD using local superpixel grid descriptors (LSGD) and dynamic nodes, i.e., LSGDDN-LCD, to find similarities between scenes via hand-crafted features extracted from LSGD. Unlike traditional Bag-of-Words (BoW) based LCD, which requires pre-training, we proposed an adaptive mechanism for grouping similar images, called the dynamic node, which incrementally adjusted the database in an online manner, allowing efficient online retrieval of previously viewed images without the need for pre-training. Experimental results confirmed that LSGDDN-LCD significantly improved LCD precision-recall and efficiency, and outperformed several state-of-the-art (SOTA) approaches on multiple typical datasets, indicating its great potential as a generic LCD framework.
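The idea of grouping similar images into incrementally updated nodes, with no pre-trained vocabulary, can be sketched in Python as follows. This is not the paper's algorithm: the class name `DynamicNodeIndex`, the use of cosine similarity, the running-mean node representative, and the fixed threshold are all assumptions chosen to keep the example small, and plain lists of floats stand in for the LSGD descriptors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


class DynamicNodeIndex:
    """Hypothetical online index that groups similar images into nodes."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.representatives = []  # running mean descriptor per node
        self.members = []          # image ids per node

    def query(self, descriptor):
        """Return (best_node, similarity), or (None, 0.0) if empty."""
        best, best_sim = None, 0.0
        for node, rep in enumerate(self.representatives):
            sim = cosine(descriptor, rep)
            if best is None or sim > best_sim:
                best, best_sim = node, sim
        return best, best_sim

    def add(self, image_id, descriptor):
        """Insert an image; join the best node or open a new one."""
        node, sim = self.query(descriptor)
        if node is not None and sim >= self.threshold:
            n = len(self.members[node])
            rep = self.representatives[node]
            # Update the running mean with the new descriptor.
            self.representatives[node] = [
                (r * n + d) / (n + 1) for r, d in zip(rep, descriptor)
            ]
            self.members[node].append(image_id)
            return node
        self.representatives.append(list(descriptor))
        self.members.append([image_id])
        return len(self.members) - 1


index = DynamicNodeIndex(threshold=0.9)
index.add("img0", [1.0, 0.0])
index.add("img1", [0.9, 0.1])
index.add("img2", [0.0, 1.0])
print(index.members)
print(index.query([1.0, 0.05]))
```

Because the database is only a list of node representatives that grows as images arrive, a query compares against nodes rather than every stored image, which is one plausible reading of how such a scheme could reduce retrieval cost.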

CV · Jul 22, 2022

PLD-SLAM: A Real-Time Visual SLAM Using Points and Line Segments in Dynamic Scenes

BaoSheng Zhang

In this paper, we consider the problems that arise in the practical application of visual simultaneous localization and mapping (SLAM). As the technology is applied more widely, the practicality of SLAM systems has become a new focus alongside accuracy and robustness, e.g., how to keep the system stable and achieve accurate pose estimation in low-texture and dynamic environments, and how to improve the generality and real-time performance of the system in real scenes. This paper proposes a real-time stereo indirect visual SLAM system, PLD-SLAM, which combines point and line features and avoids the impact of dynamic objects in highly dynamic environments. We also present a novel global gray similarity (GGS) algorithm to achieve reasonable keyframe selection and efficient loop closure detection (LCD). Benefiting from the GGS, PLD-SLAM can realize real-time, accurate pose estimation in most real scenes without pre-training or loading a huge feature dictionary model. To verify the performance of the proposed system, we compare it with existing state-of-the-art (SOTA) methods on the public KITTI and EuRoC MAV datasets and on indoor stereo datasets that we provide. The experiments show that PLD-SLAM has better real-time performance while ensuring stability and accuracy in most scenarios. In addition, analysis of the experimental results for the GGS shows that it performs well in keyframe selection and LCD.
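The abstract does not define the GGS algorithm, so the Python sketch below only illustrates the general pattern of similarity-driven keyframe selection with a stand-in measure: normalized gray-level histogram intersection. The function names, the 16-bin histogram, and the threshold value are all hypothetical, and images are represented as nested lists of 0-255 intensities to keep the example dependency-free.

```python
def gray_histogram(image, bins=16):
    """Normalized gray-level histogram of an image given as rows of 0-255 ints."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def select_keyframes(frames, threshold):
    """Return indices of frames chosen as keyframes.

    Stand-in rule (an assumption, not the paper's GGS): the first frame
    is a keyframe, and a later frame becomes one when its similarity to
    the most recent keyframe drops below the threshold.
    """
    if not frames:
        return []
    keyframes = [0]
    reference = gray_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = gray_histogram(frames[i])
        if histogram_intersection(reference, hist) < threshold:
            keyframes.append(i)
            reference = hist
    return keyframes


dark = [[0, 0], [0, 0]]
dark_again = [[0, 10], [0, 5]]
bright = [[255, 255], [250, 240]]
print(select_keyframes([dark, dark_again, bright], threshold=0.5))
```

A global, dictionary-free measure like this needs no offline training, which matches the abstract's stated motivation; the same similarity score could in principle be reused to shortlist loop closure candidates.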