CV | AI | Jan 14

Hybrid guided variational autoencoder for visual place recognition

Ni Wang, Zihan You, Emre Neftci, Thorben Schoepe

arXiv:2601.09248v1 | 2.81 citations | h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the need for compact and robust localization models for mobile robots and drones, offering an incremental improvement by integrating event-based sensors and neuromorphic hardware compatibility.

The paper tackles visual place recognition for autonomous agents in GPS-denied indoor environments. It develops a hybrid guided variational autoencoder that combines event-based vision sensors with a spiking neural network encoder, and reports classification performance comparable to state-of-the-art approaches on a new dataset of 16 distinct places, along with robust generalization to unknown scenes.

Autonomous agents such as cars, robots and drones need to precisely localize themselves in diverse environments, including GPS-denied indoor environments. One approach to precise localization is visual place recognition (VPR), which estimates the place of an image based on previously seen places. State-of-the-art VPR models require large amounts of memory, making them unwieldy for mobile deployment, while more compact models lack robustness and generalization capabilities. This work overcomes these limitations for robotics using a combination of event-based vision sensors and a novel event-based guided variational autoencoder (VAE). The encoder part of our model is based on a spiking neural network that is compatible with power-efficient, low-latency neuromorphic hardware. The VAE successfully disentangles the visual features of 16 distinct places in our new indoor VPR dataset, with a classification performance comparable to other state-of-the-art approaches, while also showing robust performance under various illumination conditions. When tested with novel visual inputs from unknown scenes, our model can distinguish between these places, which demonstrates a high generalization capability achieved by learning the essential features of a location. Our compact and robust guided VAE with generalization capabilities is a promising model for visual place recognition that can significantly enhance mobile robot navigation in known and unknown indoor environments.
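
The abstract describes VPR as estimating the place of an input from previously seen places. As a rough illustration of that general idea only, the sketch below matches a query latent vector against per-place centroids using cosine similarity. This is not the authors' model: the nearest-centroid rule, the function names (cosine, centroids, recognize), the place labels and the 3-D latent codes are all hypothetical stand-ins for whatever a trained encoder would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroids(latents_by_place):
    """Average the latent vectors recorded for each known place."""
    out = {}
    for place, vecs in latents_by_place.items():
        dim = len(vecs[0])
        out[place] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return out


def recognize(query, place_centroids):
    """Return (best_place, similarity) for a query latent vector."""
    scored = ((p, cosine(query, c)) for p, c in place_centroids.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical 3-D latent codes; real ones would come from a trained encoder.
known = {
    "kitchen": [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]],
    "hallway": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}
place, score = recognize([0.8, 0.2, 0.1], centroids(known))
print(place, round(score, 3))
```

Matching in a low-dimensional latent space keeps the stored reference data small, which is in the spirit of the compactness argument made above, though the paper's actual classification procedure may differ.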
