Haiming Gao

CV
6 papers
45 citations
Novelty 53%
AI Score 43

6 Papers

CV · Sep 15, 2022 · Code
PriorLane: A Prior Knowledge Enhanced Lane Detection Approach Based on Transformer

Qibo Qiu, Haiming Gao, Wei Hua et al.

Lane detection is one of the fundamental modules in self-driving. In this paper we employ a transformer-only method for lane detection, so it can benefit from the rapid development of fully vision transformers and achieves state-of-the-art (SOTA) performance on both the CULane and TuSimple benchmarks by fine-tuning weights fully pre-trained on large datasets. More importantly, this paper proposes a novel and general framework called PriorLane, which enhances the segmentation performance of the fully vision transformer by introducing low-cost local prior knowledge. Specifically, PriorLane uses an encoder-only transformer to fuse the features extracted by a pre-trained segmentation model with prior knowledge embeddings. A Knowledge Embedding Alignment (KEA) module is adapted to enhance the fusion performance by aligning the knowledge embedding. Extensive experiments on our Zjlab dataset show that PriorLane outperforms SOTA lane detection methods by 2.82% mIoU when prior knowledge is employed. The code will be released at: https://github.com/vincentqqb/PriorLane.
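The abstract reports its gain in mIoU (mean intersection-over-union). As a reminder of what that metric measures, here is a minimal sketch in plain Python. The function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and target are illustrative assumptions, not taken from the PriorLane code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are flat sequences of integer class labels of equal length.
    Classes that appear in neither sequence are skipped (an assumption here).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy labels: 0 = background, 1 = lane.
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case each class has two correctly labelled pixels out of a union of four, so both per-class IoUs and their mean are 0.5.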

61.0 · RO · May 28
EXACT-MPPI: Exact Signed-Distance Navigation for Arbitrary-Footprint Robots from Point Clouds via Path Integral Control

Chen Peng, Zhikang Ge, Wenwu Lu et al.

Ground robots often carry payloads, implements, or other attachments that turn their effective footprint into complex, non-convex shapes. Navigating safely through clutter then requires reasoning about this true geometry, yet most local planners simplify it with convex or inflated proxies and rasterize sensor data into occupancy grids or distance fields. Both choices eliminate feasible motions when clearance is comparable to the footprint geometry. We present EXACT-MPPI, a training-free local navigation framework that maps local point-cloud observations and sparse guidance directly to motion commands, without any intermediate map representation. The framework embeds an analytic, exact signed-distance evaluator into a Model Predictive Path Integral (MPPI) controller. The footprint is represented as a simple polygon for general convex or concave planar shapes, with a rectangle-cover specialization for faster evaluation of rectilinear footprints, enabling footprint-aware collision costs without convex decomposition, inflation, or learned encoders. During each MPPI rollout, observed obstacle points are transformed into the predicted body frame and evaluated against the footprint. All operations are batched in JAX, leveraging GPU parallelism for real-time receding-horizon control. Experiments show that EXACT-MPPI accelerates batched distance evaluation over a learned point-to-robot baseline, preserves feasible motion where convex-footprint planners fail, and remains robust under dense static and moving obstacles. The same framework deploys on differential-drive, Ackermann, omnidirectional, and hybrid-mode platforms by changing only the footprint description and motion model without per-platform training. Pairing exact footprint geometry with sampling-based predictive control thus offers a practical, training-free path to footprint-aware local navigation across diverse robots.
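The core geometric step described above, transforming obstacle points into the predicted body frame and evaluating an exact signed distance against a simple (possibly concave) polygon, can be sketched as follows. This is a scalar, plain-Python illustration under stated assumptions: the abstract says the actual operations are batched in JAX, and the function names, the even-odd inside test, and the L-shaped footprint are hypothetical choices made for this example.

```python
import math


def to_body_frame(point, pose):
    """Express a world-frame point in the body frame; pose = (x, y, heading)."""
    px, py = point
    x, y, theta = pose
    dx, dy = px - x, py - y
    c, s = math.cos(theta), math.sin(theta)
    return (c * dx + s * dy, -s * dx + c * dy)


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    abx, aby = bx - ax, by - ay
    denom = abx * abx + aby * aby
    t = 0.0 if denom == 0 else ((px - ax) * abx + (py - ay) * aby) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def point_in_polygon(p, poly):
    """Even-odd ray-casting test; valid for simple convex or concave polygons."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def signed_distance(p, poly):
    """Negative inside the footprint, positive outside."""
    n = len(poly)
    d = min(point_segment_distance(p, poly[i], poly[(i + 1) % n]) for i in range(n))
    return -d if point_in_polygon(p, poly) else d


def footprint_clearance(points_world, pose, footprint):
    """Smallest signed distance over all observed points for one predicted pose."""
    return min(signed_distance(to_body_frame(p, pose), footprint) for p in points_world)


# Hypothetical concave, L-shaped footprint in the body frame.
L_SHAPE = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
```

A point at (1.4, 1.4) sits in the notch of the L: it lies inside the convex hull of the footprint, so a convex proxy would flag a collision, while the exact evaluation reports a positive clearance of about 0.4.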

CV · Jun 1, 2023
SelFLoc: Selective Feature Fusion for Large-scale Point Cloud-based Place Recognition

Qibo Qiu, Wenxiao Wang, Haochao Ying et al.

Point cloud-based place recognition is crucial for mobile robots and autonomous vehicles, especially when the global positioning sensor is not accessible. LiDAR points are scattered on the surfaces of objects and buildings, which have strong shape priors along different axes. To enhance message passing along particular axes, the Stacked Asymmetric Convolution Block (SACB) is designed, which is one of the main contributions of this paper. Comprehensive experiments demonstrate that asymmetric convolution and the corresponding strategies employed by SACB contribute to a more effective representation of point cloud features. On this basis, the Selective Feature Fusion Block (SFFB), which is formed by stacking point- and channel-wise gating layers in a predefined sequence, is proposed to selectively boost salient local features in certain key regions, as well as to align the features before the fusion phase. SACBs and SFFBs are combined to construct a robust and accurate architecture for point cloud-based place recognition, which is termed SelFLoc. Comparative experimental results show that SelFLoc achieves state-of-the-art (SOTA) performance on the Oxford benchmark and three other in-house benchmarks, with an improvement of 1.6 percentage points on mean average recall@1.
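To make the idea of point- and channel-wise gating concrete, here is a minimal sketch in plain Python. It only illustrates the general gating mechanism (scaling features by a sigmoid weight per point or per channel). The function names and the fixed gate logits are hypothetical; in a trained network such as the one described, the logits would be produced by learned layers, and the exact SFFB design is not given in the abstract.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def point_gate(features, point_logits):
    """Scale every channel of point i by sigmoid(point_logits[i])."""
    return [[f * sigmoid(z) for f in row] for row, z in zip(features, point_logits)]


def channel_gate(features, channel_logits):
    """Scale channel j of every point by sigmoid(channel_logits[j])."""
    gates = [sigmoid(z) for z in channel_logits]
    return [[f * g for f, g in zip(row, gates)] for row in features]


# Hypothetical features: 2 points x 2 channels.
features = [[2.0, 4.0], [6.0, 8.0]]
gated = channel_gate(point_gate(features, [0.0, 0.0]), [0.0, 0.0])
```

With all logits at zero every gate equals 0.5, so applying both gates in sequence scales each feature by 0.25. Larger logits let a point or channel pass through almost unchanged, which is how gating can emphasise salient regions.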

CV · Sep 23, 2023
M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

Qibo Qiu, Honghui Yang, Wenxiao Wang et al.

Masked point modeling has become a promising scheme for self-supervised pre-training on point clouds. Existing methods reconstruct either the original points or related features as the pre-training objective. However, considering the diversity of downstream tasks, the model needs both low- and high-level representation modeling capabilities to capture geometric details and semantic contexts during pre-training. To this end, M$^3$CS is proposed to equip the model with these abilities. Specifically, with a masked point cloud as input, M$^3$CS introduces two decoders to predict masked representations and the original points simultaneously. Because an extra decoder doubles the parameters of the decoding process and may lead to overfitting, we propose siamese decoders to keep the number of learnable parameters unchanged. Further, we propose an online codebook that projects continuous tokens into discrete ones before reconstructing masked points. In this way, we force the decoder to take effect through combinations of tokens rather than by memorizing each token. Comprehensive experiments show that M$^3$CS achieves superior performance on both classification and segmentation tasks, outperforming existing methods.
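Projecting continuous tokens into discrete ones with a codebook is commonly done by nearest-neighbour lookup, as in vector quantization. The sketch below assumes that mechanism; the abstract does not state how the M$^3$CS codebook is queried or updated online, so the function `quantize`, the squared-L2 metric, and the toy codebook are illustrative assumptions only.

```python
def quantize(token, codebook):
    """Return (index, entry) of the codebook vector closest to token (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    idx = min(range(len(codebook)), key=lambda i: sq_dist(token, codebook[i]))
    return idx, codebook[idx]


# Hypothetical 2-D codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
tokens = [[0.9, 0.2], [0.1, 0.8], [0.1, -0.1]]
indices = [quantize(t, codebook)[0] for t in tokens]
```

Each continuous token is replaced by the index of its nearest entry, here `[1, 2, 0]`, so a downstream decoder only ever sees combinations of a finite set of tokens.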

RO · Dec 16, 2020
E$^3$MoP: Efficient Motion Planning Based on Heuristic-Guided Motion Primitives Pruning and Path Optimization With Sparse-Banded Structure

Jian Wen, Xuebo Zhang, Haiming Gao et al.

To solve the autonomous navigation problem in complex environments, an efficient motion planning approach is presented in this paper. Considering the challenges of large-scale, partially unknown complex environments, a three-layer motion planning framework is designed, comprising global path planning, local path optimization, and time-optimal velocity planning. Compared with existing approaches, the novelty of this work is twofold: 1) a novel heuristic-guided pruning strategy for motion primitives is proposed and fully integrated into the state lattice-based global path planner to further improve the computational efficiency of graph search, and 2) a new soft-constrained local path optimization approach is proposed, in which the sparse-banded structure of the underlying optimization problem is fully exploited to solve the problem efficiently. We validate the safety, smoothness, flexibility, and efficiency of our approach in various complex simulation scenarios and challenging real-world tasks. The computational efficiency is improved by 66.21% in the global planning stage, and the motion efficiency of the robot is improved by 22.87%, compared with the recent quintic Bézier curve-based state space sampling approach. We name the proposed motion planning framework E$^3$MoP, where the number 3 indicates both that our approach is a three-layer framework and that it is efficient in all three stages.
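Why a sparse-banded structure helps can be seen on a toy problem. The sketch below smooths one coordinate of a path by minimising a data term plus a first-difference smoothness term; the normal equations are tridiagonal and can be solved in linear time with the Thomas algorithm instead of a dense solve. This cost function, the weights, and the function names are assumptions chosen for illustration; the paper's actual soft-constrained formulation and its band width are not given in the abstract.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system, O(n) time.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def smooth_path_1d(ref, w_data=1.0, w_smooth=4.0):
    """Minimise w_data * sum (x_i - ref_i)^2 + w_smooth * sum (x_{i+1} - x_i)^2."""
    n = len(ref)
    if n < 2:
        return list(ref)
    lower = [0.0] + [-w_smooth] * (n - 1)
    upper = [-w_smooth] * (n - 1) + [0.0]
    diag = [w_data + (1 if i in (0, n - 1) else 2) * w_smooth for i in range(n)]
    rhs = [w_data * r for r in ref]
    return solve_tridiagonal(lower, diag, upper, rhs)


# Hypothetical zigzag reference for one coordinate of a path.
smoothed = smooth_path_1d([0.0, 1.0, 0.0, 1.0, 0.0])
```

Each unknown couples only to its immediate neighbours, so the system matrix has three non-zero diagonals. Richer cost terms (for example curvature) widen the band but keep the same linear-time flavour.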

RO · Jan 7, 2019
CAE-RLSM: Consistent and Efficient Redundant Line Segment Merging for Online Feature Map Building

Jian Wen, Xuebo Zhang, Haiming Gao et al.

To obtain a compact line segment-based map representation for localization and planning of mobile robots, it is necessary to merge redundant line segments that physically represent the same part of the environment in different scans. In this paper, a consistent and efficient redundant line segment merging approach (CAE-RLSM) is proposed for online feature map building. The proposed CAE-RLSM is composed of two newly proposed modules: one-to-many incremental line segment merging (OTM-ILSM) and multi-processing global map adjustment (MP-GMA). Unlike state-of-the-art offline merging approaches, the proposed CAE-RLSM achieves real-time mapping performance: it not only reduces redundancy through efficient incremental merging, but also solves the problem of global map adjustment after loop closing to guarantee global consistency. Furthermore, a new correlation-based evaluation metric is proposed for the quality evaluation of line segment maps. This evaluation metric does not require manual measurement of the environmental metric information. Instead, it makes full use of globally consistent laser scans obtained by simultaneous localization and mapping (SLAM) systems to compare the performance of different line segment-based mapping approaches in an objective and fair manner. Comparative experimental results with respect to a mean shift-based offline redundant line segment merging approach (MS-RLSM) and an offline version of the one-to-one incremental line segment merging approach (O$^2$TO-ILSM), on both public data sets and a self-recorded data set, are presented to show the superior performance of CAE-RLSM in terms of efficiency and map quality in different scenarios.
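As a rough illustration of what merging redundant line segments involves, the sketch below tests whether two segments are nearly collinear and close, then merges them by projecting all four endpoints onto the longer segment's line. It is a generic one-to-one merge under assumed thresholds, not the OTM-ILSM or MP-GMA modules from the paper; the function names and the default values of `max_angle` and `max_dist` are hypothetical.

```python
import math


def _length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def is_redundant(seg_a, seg_b, max_angle=math.radians(5), max_dist=0.1):
    """True if seg_b is nearly parallel to seg_a and lies close to its line."""
    (ax, ay), (bx, by) = seg_a
    (cx, cy), (dx, dy) = seg_b
    ang_a = math.atan2(by - ay, bx - ax)
    ang_b = math.atan2(dy - cy, dx - cx)
    diff = abs(ang_a - ang_b) % math.pi
    diff = min(diff, math.pi - diff)  # segment direction is unsigned
    if diff > max_angle:
        return False
    ux, uy = (bx - ax) / _length(seg_a), (by - ay) / _length(seg_a)
    # Perpendicular distance of each endpoint of seg_b to the line through seg_a.
    dists = [abs((px - ax) * uy - (py - ay) * ux) for px, py in seg_b]
    return max(dists) <= max_dist


def merge_segments(seg_a, seg_b):
    """Merge by projecting all endpoints onto the longer segment's line."""
    base = seg_a if _length(seg_a) >= _length(seg_b) else seg_b
    (ox, oy), (ex, ey) = base
    norm = _length(base)
    ux, uy = (ex - ox) / norm, (ey - oy) / norm
    ts = [(px - ox) * ux + (py - oy) * uy for px, py in (*seg_a, *seg_b)]
    t_min, t_max = min(ts), max(ts)
    return ((ox + t_min * ux, oy + t_min * uy), (ox + t_max * ux, oy + t_max * uy))


# Hypothetical segments from two scans of the same wall.
wall_1 = ((0.0, 0.0), (4.0, 0.0))
wall_2 = ((3.0, 0.05), (6.0, 0.05))
merged = merge_segments(wall_1, wall_2) if is_redundant(wall_1, wall_2) else None
```

The two overlapping observations collapse into a single segment spanning from (0, 0) to (6, 0). A real mapper would also check that the segments overlap or are adjacent along their direction, which this sketch omits.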