CV · May 8, 2025

PillarMamba: Learning Local-Global Context for Roadside Point Cloud via Hybrid State Space Model

arXiv:2505.05397v1 · 2 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses roadside perception for Intelligent Transport Systems to improve traffic safety, representing an incremental advance in applying Mamba to a specific domain.

The paper tackles 3D object detection in roadside point clouds by introducing PillarMamba, a framework that uses a hybrid state space model to capture local-global context, and it outperforms state-of-the-art methods on the DAIR-V2X-I benchmark.

Roadside perception serves Intelligent Transport System (ITS) and Vehicle-to-Everything (V2X) tasks and has received increasing attention in recent years, because it can extend the perception range of connected vehicles and improve traffic safety. However, 3D object detection on roadside point clouds has not been effectively explored. To some extent, the performance of a point cloud detector depends on the receptive field of the network and on its ability to use scene context effectively. The recent emergence of Mamba, which is based on the State Space Model (SSM), has challenged the convolutions and transformers that have long served as foundational building blocks, owing to its efficient global receptive field. In this work, we introduce Mamba to pillar-based roadside point cloud perception and propose PillarMamba, a framework built on the Cross-stage State-space Group (CSG). The CSG enhances the expressiveness of the network and achieves efficient computation through cross-stage feature fusion. However, because of the limitations of scan directions, the state space model suffers from disrupted local connections and forgotten historical relationships. To address this, we propose the Hybrid State-space Block (HSB) to obtain the local-global context of the roadside point cloud. Specifically, it enhances neighborhood connections through local convolution and preserves historical memory through residual attention. The proposed method outperforms state-of-the-art methods on the popular large-scale roadside benchmark DAIR-V2X-I. The code will be released soon.
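The two scan-related limitations named in the abstract can be illustrated with a small sketch. It is not the authors' code: it assumes a row-by-row scan over a bird's-eye-view pillar grid and replaces Mamba's selective, multi-channel state space layer with a fixed scalar recurrence h_t = a * h_{t-1} + b * x_t. The function names, the grid width, and the values of `a` and `b` are all hypothetical choices made for illustration.

```python
def row_major_index(row, col, width):
    """Position of pillar (row, col) in a row-by-row scan of a BEV grid."""
    return row * width + col


def scan_distance(p, q, width):
    """How far apart two pillars end up in the flattened 1D sequence."""
    return abs(row_major_index(p[0], p[1], width) - row_major_index(q[0], q[1], width))


def linear_ssm(xs, a=0.9, b=1.0):
    """Fixed scalar recurrence h_t = a * h_{t-1} + b * x_t (assumed simplification).

    Returns the hidden state after every input element.
    """
    h = 0.0
    states = []
    for x in xs:
        h = a * h + b * x
        states.append(h)
    return states


# Local connection disrupted: two pillars that touch in the grid are
# adjacent in the sequence only when they share a row.
width = 8  # hypothetical grid width
horizontal = scan_distance((3, 3), (3, 4), width)  # same row
vertical = scan_distance((3, 3), (4, 3), width)    # same column, next row

# Historical relationship forgotten: the contribution of the first input
# decays geometrically as the scan moves on.
impulse = [1.0] + [0.0] * 49
states = linear_ssm(impulse, a=0.9)

print("sequence distance, horizontal neighbours:", horizontal)
print("sequence distance, vertical neighbours:", vertical)
print("share of first input left after 50 steps:", states[-1])
```

In this toy setting, vertically adjacent pillars sit a full grid width apart in the scanned sequence, and the first input's influence shrinks by a factor of `a` at every step. These are the kinds of effects that, according to the abstract, the HSB counters with local convolution and residual attention.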
