49.5LGMay 27
CLANE: Continual Learning of Actions on Neuromorphic Hardware from Event CamerasElvin Hajizada, Michael Neumeier, Edward Paxon Frady et al.
Recognizing and continuously learning novel human actions without forgetting prior classes is a requirement for emerging AR/VR and robotics applications. For these applications, both on-device processing and learning are essential for privacy and low-latency adaptation. Event cameras address the efficiency of visual sensing with sparse, asynchronous output that is naturally compatible with neuromorphic processing. Yet no prior system has deployed a continual on-device learning pipeline for event-based action recognition using neuromorphic hardware. We present CLANE, Continual Learning of Actions on Neuromorphic Hardware from Event Cameras, deployed end-to-end on Intel Loihi 2. CLANE combines a spiking 2D CNN for spatiotemporal feature extraction with CLP-SNN as its on-chip learning head, extended to action clips via a Temporal Aggregation Layer and a fixed-point Normalization Layer, both novel Loihi 2 modules. On THU E-ACT-50, a 50-class dataset captured under real-world conditions, CLANE achieves 70.4% accuracy in a continual learning task while delivering more than 100x energy reduction and 16x lower latency over a sequential CNN+GRU+CLP edge GPU baseline, validated through iso-algorithm cross-platform benchmarking across three evaluation levels.
NEAug 28, 2024
Emulating Brain-like Rapid Learning in Neuromorphic Edge ComputingKenneth Stewart, Michael Neumeier, Sumit Bam Shrestha et al.
Achieving personalized intelligence at the edge with real-time learning capabilities holds enormous promise in enhancing our daily experiences and helping decision making, planning, and sensing. However, efficient and reliable edge learning remains difficult with current technology due to the lack of personalized data, insufficient hardware capabilities, and inherent challenges posed by online learning. Over time and across multiple developmental stages, the brain has evolved to efficiently incorporate new knowledge by gradually building on previous knowledge. In this work, we emulate the multiple stages of learning with digital neuromorphic technology that simulates the neural and synaptic processes of the brain using two stages of learning. First, a meta-training stage trains the hyperparameters of synaptic plasticity for one-shot learning using a differentiable simulation of the neuromorphic hardware. This meta-training process refines a hardware local three-factor synaptic plasticity rule and its associated hyperparameters to align with the trained task domain. In a subsequent deployment stage, these optimized hyperparameters enable fast, data-efficient, and accurate learning of new classes. We demonstrate our approach using event-driven vision sensor data and the Intel Loihi neuromorphic processor with its plasticity dynamics, achieving real-time one-shot learning of new classes that is vastly improved over transfer learning. Our methodology can be deployed with arbitrary plasticity models and can be applied to situations demanding quick learning and adaptation at the edge, such as navigating unfamiliar environments or learning unexpected categories of data through user engagement.
CVJul 10, 2025
EEvAct: Early Event-Based Action Recognition with High-Rate Two-Stream Spiking Neural NetworksMichael Neumeier, Jules Lecomte, Nils Kazinski et al.
Recognizing human activities early is crucial for the safety and responsiveness of human-robot and human-machine interfaces. Due to their high temporal resolution and low latency, event-based vision sensors are a perfect match for this early recognition demand. However, most existing processing approaches accumulate events to low-rate frames or space-time voxels which limits the early prediction capabilities. In contrast, spiking neural networks (SNNs) can process the events at a high-rate for early predictions, but most works still fall short on final accuracy. In this work, we introduce a high-rate two-stream SNN which closes this gap by outperforming previous work by 2% in final accuracy on the large-scale THU EACT-50 dataset. We benchmark the SNNs within a novel early event-based recognition framework by reporting Top-1 and Top-5 recognition scores for growing observation time. Finally, we exemplify the impact of these methods on a real-world task of early action triggering for human motion capture in sports.