LG · Dec 1, 2025

New Spiking Architecture for Multi-Modal Decision-Making in Autonomous Vehicles

arXiv:2512.01882v1 · 1 citation · h-index: 4
Originality: Incremental advance
AI Analysis

This work targets the computational efficiency of autonomous-vehicle decision-making, but the advance is incremental: it adapts existing transformer-based methods to a spiking formulation.

The paper tackles the high computational cost of transformers in multi-modal autonomous vehicle decision-making by proposing a spiking, temporal-aware, transformer-like architecture built on ternary spiking neurons. The authors report that it is effective and efficient for real-time decision-making across multiple tasks in the Highway Environment.

This work proposes an end-to-end multi-modal reinforcement learning framework for high-level decision-making in autonomous vehicles. The framework integrates heterogeneous sensory input, including camera images, LiDAR point clouds, and vehicle heading information, through a cross-attention transformer-based perception module. Although transformers have become the backbone of modern multi-modal architectures, their high computational cost limits their deployment in resource-constrained edge environments. To overcome this challenge, we propose a spiking temporal-aware transformer-like architecture that uses ternary spiking neurons for computationally efficient multi-modal fusion. Comprehensive evaluations across multiple tasks in the Highway Environment demonstrate the effectiveness and efficiency of the proposed approach for real-time autonomous decision-making.
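The abstract does not spell out the neuron dynamics, so the following is only a generic sketch of what a ternary spiking neuron can look like: a leaky integrate-and-fire unit that emits +1, 0, or -1 at each time step. The function names, the `decay` and `threshold` parameters, and the soft-reset rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def ternary_lif_step(v: float, x: float,
                     decay: float = 0.5,
                     threshold: float = 1.0) -> Tuple[float, int]:
    """One time step of a hypothetical ternary leaky integrate-and-fire neuron.

    Returns the updated membrane potential and a spike in {-1, 0, +1}.
    """
    v = decay * v + x                 # leaky integration of the input current
    if v >= threshold:
        return v - threshold, 1       # positive spike, soft reset (assumed)
    if v <= -threshold:
        return v + threshold, -1      # negative spike, soft reset (assumed)
    return v, 0                       # no spike: stay silent


def ternary_spike_train(inputs: List[float],
                        decay: float = 0.5,
                        threshold: float = 1.0) -> List[int]:
    """Run the neuron over a sequence of inputs and collect its spikes."""
    v = 0.0
    spikes = []
    for x in inputs:
        v, s = ternary_lif_step(v, x, decay, threshold)
        spikes.append(s)
    return spikes


if __name__ == "__main__":
    print(ternary_spike_train([1.5, 0.0, -2.0, 0.2]))
```

Because every output is -1, 0, or +1, multiplying by a spike reduces to an addition, a subtraction, or a skip, which is the usual argument for why spiking layers can be cheaper than dense floating-point attention on edge hardware.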

