CV · LG · NE · Feb 2, 2024

ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data

arXiv:2402.01393v3 · 9 citations · h-index: 3 · ICML
Originality: Incremental advance
AI Analysis

This work addresses real-time processing challenges for event-based sensor applications such as robotics, though it appears incremental because it combines existing methods.

The paper tackles the processing of continuous, ultra-sparse spatiotemporal data from event-based sensors by proposing a hybrid pipeline that pairs asynchronous sensing with synchronous processing. It reports state-of-the-art performance in object and gesture recognition with lower latency than competitors.

We seek to enable classic processing of continuous ultra-sparse spatiotemporal data generated by event-based sensors with dense machine learning models. We propose a novel hybrid pipeline composed of asynchronous sensing and synchronous processing that combines several ideas: (1) an embedding based on PointNet models -- the ALERT module -- that can continuously integrate new events and dismiss old ones thanks to a leakage mechanism, (2) a flexible readout of the embedded data that allows any downstream model to be fed with always up-to-date features at any sampling rate, and (3) a patch-based approach, inspired by Vision Transformer, that exploits the input sparsity to optimize the efficiency of the method. These embeddings are then processed by a transformer model trained for object and gesture recognition. Using this approach, we achieve state-of-the-art performance with lower latency than competitors. We also demonstrate that our asynchronous model can operate at any desired sampling rate.
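To make points (1) and (2) of the abstract more concrete, here is a minimal sketch of one way a leaky, PointNet-style embedding with an on-demand readout could work. It is an illustration under stated assumptions, not the authors' implementation. The class name `LeakyMaxEmbedding`, the exponential decay with time constant `tau`, the single random linear layer standing in for a trained PointNet MLP, and the `(x, y, polarity)` event layout are all hypothetical. Events are assumed to arrive in time order.

```python
import math
import random


class LeakyMaxEmbedding:
    """Hypothetical leaky max-pooled embedding for the events of one patch."""

    def __init__(self, dim=8, tau=0.05, seed=0):
        rng = random.Random(seed)
        # Assumption: one random linear layer + ReLU stands in for a trained
        # PointNet-style per-event MLP over (x, y, polarity).
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(3)]
                        for _ in range(dim)]
        self.tau = tau              # leak time constant in seconds (assumed)
        self.state = [0.0] * dim    # running embedding
        self.last_t = 0.0           # timestamp of the last integrated event

    def _leak(self, t):
        # Exponential decay factor for the time elapsed since the last update.
        return math.exp(-(t - self.last_t) / self.tau)

    def _features(self, x, y, p):
        # Per-event features; ReLU keeps them non-negative so decay tends to 0.
        return [max(0.0, w[0] * x + w[1] * y + w[2] * p) for w in self.weights]

    def update(self, x, y, p, t):
        """Asynchronous path: integrate a single event at time t >= last_t."""
        leak = self._leak(t)
        feats = self._features(x, y, p)
        # Old information fades (leak); new event features enter through a max.
        self.state = [max(s * leak, f) for s, f in zip(self.state, feats)]
        self.last_t = t

    def read(self, t):
        """Synchronous path: read features at any t >= last_t, state unchanged."""
        leak = self._leak(t)
        return [s * leak for s in self.state]
```

In this sketch, `update` is called once per incoming event, while `read` can be called at whatever rate the downstream model needs. Because `read` applies the decay up to the query time without modifying the state, the sampling rate is decoupled from the event rate.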

