CV · Sep 17, 2025

CETUS: Causal Event-Driven Temporal Modeling With Unified Variable-Rate Scheduling

arXiv:2509.13784v1 · 1 citation · h-index: 10
Originality: Highly original
AI Analysis

The paper addresses the computational cost of high-speed vision with event cameras, proposing a novel method for a known bottleneck.

The paper tackles the problem of window latency and computational inefficiency in event camera processing by proposing a novel architecture that directly processes raw event streams without intermediate representations, achieving an optimal balance between window latency and inference latency.

Event cameras capture asynchronous pixel-level brightness changes with microsecond temporal resolution, offering unique advantages for high-speed vision tasks. Existing methods often convert event streams into intermediate representations such as frames, voxel grids, or point clouds, which inevitably require predefined time windows and thus introduce window latency. Meanwhile, pointwise detection methods are too computationally expensive to run in real time. To overcome these limitations, we propose the Variable-Rate Spatial Event Mamba, a novel architecture that directly processes raw event streams without intermediate representations. Our method introduces a lightweight causal spatial neighborhood encoder to efficiently capture local geometric relations, followed by Mamba-based state space models for scalable temporal modeling with linear complexity. During inference, a controller adaptively adjusts the processing speed according to the event rate, achieving an optimal balance between window latency and inference latency.
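
To make the trade-off between window latency and inference latency concrete, here is a minimal sketch of a rate-aware controller. It is not the paper's controller: the linear cost model (a fixed cost per inference step plus a cost per event), the function names, and the `max_batch` cap are all assumptions chosen for illustration. The idea is to pick the smallest number of events per step for which inference finishes no later than the time needed to collect that many events, so that waiting time stays small without falling behind the stream.

```python
import math


def choose_batch_size(event_rate_hz, fixed_cost_s, per_event_cost_s, max_batch=4096):
    """Return the smallest batch size whose inference time fits in its collection time.

    Hypothetical cost model (an assumption, not taken from the paper):
        inference_time(n) = fixed_cost_s + n * per_event_cost_s
        window_time(n)    = n / event_rate_hz
    """
    if event_rate_hz <= 0:
        return 1  # no events arriving: process each event as it comes
    arrival_interval = 1.0 / event_rate_hz
    slack = arrival_interval - per_event_cost_s
    if slack <= 0:
        # Events arrive faster than they can be processed one by one;
        # fall back to the largest allowed batch to amortize the fixed cost.
        return max_batch
    n = math.ceil(fixed_cost_s / slack)
    return max(1, min(n, max_batch))


def latencies(n, event_rate_hz, fixed_cost_s, per_event_cost_s):
    """Return (window_latency_s, inference_latency_s) under the same cost model."""
    window = n / event_rate_hz
    inference = fixed_cost_s + n * per_event_cost_s
    return window, inference


if __name__ == "__main__":
    # Example rates in events per second; the cost constants are made up.
    for rate in (1_000, 4_000, 20_000):
        n = choose_batch_size(rate, fixed_cost_s=0.002, per_event_cost_s=0.0001)
        window, inference = latencies(n, rate, 0.002, 0.0001)
        print(f"rate={rate} batch={n} window={window:.4f}s inference={inference:.4f}s")
```

Under this model, a higher event rate pushes the controller toward larger batches, because the fixed cost per step must be amortized over more events to keep up. A real controller would likely measure inference time online rather than rely on fixed constants.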

