CV · RO · Nov 17, 2024

Person Segmentation and Action Classification for Multi-Channel Hemisphere Field of View LiDAR Sensors

arXiv:2411.11151v1 · 1 citation · h-index: 12 · SII
Originality: Synthesis-oriented
AI Analysis

This work addresses robot perception for safety and interaction, but its contribution is incremental: it adapts existing methods to a new sensor and dataset.

The paper tackles person segmentation and action classification from 3D LiDAR scans using a hemisphere field of view sensor, achieving good performance on tasks like detecting walking, waving, and sitting actions.

Robots need to perceive persons in their surroundings for safety and to interact with them. In this paper, we present a person segmentation and action classification approach that operates on 3D scans of hemisphere field of view LiDAR sensors. We recorded a data set with an Ouster OSDome-64 sensor consisting of scenes where persons perform three different actions and annotated it. We propose a method based on a MaskDINO model to detect and segment persons and to recognize their actions from combined spherical projected multi-channel representations of the LiDAR data with an additional positional encoding. Our approach demonstrates good performance for the person segmentation task and further performs well for the estimation of the person action states walking, waving, and sitting. An ablation study provides insights about the individual channel contributions for the person segmentation task. The trained models, code and dataset are made publicly available.
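The spherical projection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The channel set (range and intensity), the image size, and the form of the positional encoding (normalized row and column coordinates) are all assumptions chosen for illustration, and `spherical_projection` is a hypothetical helper name.

```python
import numpy as np

def spherical_projection(points, intensity, height=64, width=512):
    """Project hemisphere LiDAR points onto a (4, H, W) image.

    Assumed channels (illustrative, not taken from the paper):
      0 = range, 1 = intensity,
      2 = normalized row position, 3 = normalized column position.
    Rows index the polar angle (0 at zenith, pi/2 at the horizon),
    columns index the azimuth over the full 360 degrees.
    """
    points = np.asarray(points, dtype=np.float64)
    intensity = np.asarray(intensity, dtype=np.float64)
    rng = np.linalg.norm(points, axis=1)

    # Keep only returns with non-zero range that lie in the upper hemisphere.
    valid = (rng > 0) & (points[:, 2] >= 0)
    points, intensity, rng = points[valid], intensity[valid], rng[valid]

    azimuth = np.arctan2(points[:, 1], points[:, 0])          # [-pi, pi]
    polar = np.arccos(np.clip(points[:, 2] / rng, -1.0, 1.0))  # [0, pi/2]

    col = ((azimuth + np.pi) / (2 * np.pi) * width).astype(int) % width
    row = np.clip((polar / (np.pi / 2) * height).astype(int), 0, height - 1)

    # When several points fall into one pixel, keep the nearest one.
    # Sort by pixel index first, then by range, and take the first per pixel.
    flat = row * width + col
    order = np.lexsort((rng, flat))
    _, first = np.unique(flat[order], return_index=True)
    nearest = order[first]

    image = np.zeros((4, height * width), dtype=np.float32)
    image[0, flat[nearest]] = rng[nearest]
    image[1, flat[nearest]] = intensity[nearest]
    image = image.reshape(4, height, width)

    # Assumed positional encoding: per-pixel coordinates scaled to [0, 1].
    rr, cc = np.meshgrid(np.linspace(0, 1, height),
                         np.linspace(0, 1, width), indexing="ij")
    image[2], image[3] = rr, cc
    return image
```

A multi-channel image of this kind could then be passed to an image segmentation model. How the paper actually combines its channels and encodes position is described in the full text, not in the abstract.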

Code Implementations: 1 repo
