Embedding Semantic Risk into Distance Fields and CBFs for Online Monocular Safe Control
This work addresses how autonomous systems that rely on monocular vision can integrate semantic risk into safety-critical control, so that obstacle-avoidance margins depend on object class.
This paper proposes a monocular perception-to-control framework that embeds semantic risk into the Euclidean Signed Distance Field (ESDF) used by Control Barrier Functions (CBFs) for safe navigation. The method achieves online operation at 10-20 Hz and demonstrates semantic-aware safe behavior in both teleoperation and autonomous navigation.
We propose an online monocular perception-to-control framework that embeds semantic risk into the distance field used by Control Barrier Function (CBF)-based safe navigation and teleoperation. Many perception-based safety filters assign the same distance-based safety margin to all mapped obstacles or use semantics only as a downstream controller adjustment, rather than encoding semantic risk in the spatial representation. Our framework instead reasons online about obstacle geometry and class-dependent risk by embedding semantic information directly into the Euclidean Signed Distance Field (ESDF). This design encodes semantic risk before control optimization, so high-risk objects exert a larger spatial influence in the safety field while retaining efficient ESDF queries at runtime. Specifically, a foundation-model-based SLAM front end reconstructs dense 3-D geometry from monocular RGB video, while per-frame semantic segmentation provides pixel-level class labels that are fused into the reconstructed geometry. The resulting geometric-semantic representation is then converted into an ESDF, where semantic labels identify safety-relevant regions and impose class-dependent inflation before field computation. The semantic-aware ESDF provides the local distance values and spatial derivatives required by the CBF controller, while class-dependent gains further regulate the controller response. Extensive simulation and hardware experiments demonstrate online operation at 10-20 Hz and semantic-aware safe behavior in both teleoperation and autonomous navigation.
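The idea of class-dependent inflation followed by a CBF safety filter can be illustrated with a minimal 2-D sketch. This is not the paper's implementation. It assumes labeled point obstacles instead of a voxel ESDF, single-integrator dynamics (velocity commands), and a closed-form solution of the one-constraint CBF quadratic program. The names `INFLATION_M`, `CBF_GAIN`, `semantic_distance`, and `cbf_filter`, as well as all numeric values, are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical class-dependent parameters (illustrative values, not from the paper).
INFLATION_M = {"chair": 0.1, "person": 0.5}  # extra margin per class, in metres
CBF_GAIN = {"chair": 2.0, "person": 0.5}     # smaller gain -> earlier, gentler slowing


def semantic_distance(x, obstacles):
    """Return (h, grad, label) of the inflated distance field at point x.

    obstacles is a list of ((px, py), label) pairs. The field is
    h(x) = min_i(||x - p_i|| - INFLATION_M[label_i]), i.e. each obstacle is
    inflated by its class margin before the distance is taken.
    Returns None if there are no obstacles.
    """
    best = None
    for (px, py), label in obstacles:
        dx, dy = x[0] - px, x[1] - py
        dist = math.hypot(dx, dy)
        h = dist - INFLATION_M[label]
        if best is None or h < best[0]:
            # The gradient of a Euclidean distance is the unit vector
            # pointing away from the closest obstacle point.
            grad = (dx / dist, dy / dist) if dist > 0 else (0.0, 0.0)
            best = (h, grad, label)
    return best


def cbf_filter(x, u_nom, obstacles):
    """Minimally modify u_nom so that grad_h . u + alpha * h >= 0."""
    query = semantic_distance(x, obstacles)
    if query is None:
        return u_nom
    h, grad, label = query
    alpha = CBF_GAIN[label]  # class-dependent gain of the active obstacle
    slack = grad[0] * u_nom[0] + grad[1] * u_nom[1] + alpha * h
    norm_sq = grad[0] ** 2 + grad[1] ** 2
    if slack >= 0 or norm_sq == 0:
        return u_nom  # nominal command already satisfies the constraint
    # Closed-form QP solution: project u_nom onto the safe half-space.
    scale = -slack / norm_sq
    return (u_nom[0] + scale * grad[0], u_nom[1] + scale * grad[1])


# Usage: the same geometry yields different commands depending on the class.
u_person = cbf_filter((0.0, 0.0), (1.0, 0.0), [((2.0, 0.0), "person")])
u_chair = cbf_filter((0.0, 0.0), (1.0, 0.0), [((2.0, 0.0), "chair")])
print(u_person, u_chair)
```

With these illustrative values, a robot commanded straight toward an obstacle 2 m away is slowed when the obstacle is labeled `person` but keeps its nominal velocity when it is labeled `chair`. The larger inflation and the smaller gain both tighten the constraint for the higher-risk class. A grid-based ESDF would replace the brute-force minimum with a precomputed lookup and finite-difference gradients.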