CV · Jan 7, 2024
Amirkabir campus dataset: Real-world challenges and scenarios of Visual Inertial Odometry (VIO) for visually impaired people
Ali Samadzadeh, Mohammad Hassan Mojab, Heydar Soudani et al.
Visual Inertial Odometry (VIO) algorithms accurately estimate the camera trajectory using camera and Inertial Measurement Unit (IMU) sensors. The applications of VIO span a diverse range, including augmented reality and indoor navigation. VIO algorithms have the potential to facilitate navigation for visually impaired individuals in both indoor and outdoor settings. Nevertheless, state-of-the-art VIO algorithms encounter substantial challenges in dynamic environments, particularly in densely populated corridors. Existing VIO datasets, e.g., ADVIO, typically fail to capture these challenges effectively. In this paper, we introduce the Amirkabir campus dataset (AUT-VI) to address this problem and improve navigation systems. AUT-VI is a novel and highly challenging dataset with 126 diverse sequences in 17 different locations. The dataset contains dynamic objects, challenging loop-closure/map-reuse cases, different lighting conditions, reflections, and sudden camera movements, covering a wide range of extreme navigation scenarios. Moreover, in support of ongoing development efforts, we have publicly released the Android application used for data capture, which allows fellow researchers to easily capture their own customized VIO datasets. In addition, we evaluate state-of-the-art VIO and Visual Odometry (VO) methods on our dataset, underscoring the need for this challenging dataset.
CV · Jan 14, 2022
SRVIO: Super Robust Visual Inertial Odometry for dynamic environments and challenging Loop-closure conditions
Ali Samadzadeh, Ahmad Nickabadi
There has been extensive research on visual localization and odometry for autonomous robots and virtual reality during the past decades. Traditionally, this problem has been solved with the help of expensive sensors, such as lidars. Nowadays, leading research in this field focuses on robust localization using more economical sensors, such as cameras and IMUs. Consequently, geometric visual localization methods have become more accurate over time. However, these methods still suffer from significant loss and divergence in challenging environments, such as a room full of moving people. Researchers have started using deep neural networks (DNNs) to mitigate this problem. The main idea behind using DNNs is to better understand challenging aspects of the data and to overcome complex conditions, such as a dynamic object moving in front of the camera and covering its full view, extreme lighting conditions, and high camera speed. Prior end-to-end DNN methods have overcome some of these challenges. However, no general and robust framework is available to overcome all of them together. In this paper, we combine geometric and DNN-based methods to retain the generality and speed of geometric SLAM frameworks while using DNNs to overcome most of these challenging conditions, delivering the most robust framework so far. To do so, we have designed a framework based on VINS-Mono and show that it achieves state-of-the-art results on the TUM-Dynamic, TUM-VI, ADVIO, and EuRoC datasets compared to geometric and end-to-end DNN-based SLAM methods. Our proposed framework also achieves outstanding results on extreme simulated cases resembling the aforementioned challenges.
CV · Mar 27, 2020
Convolutional Spiking Neural Networks for Spatio-Temporal Feature Extraction
Ali Samadzadeh, Fatemeh Sadat Tabatabaei Far, Ali Javadi et al.
Spiking neural networks (SNNs) can be used in low-power and embedded systems (such as emerging neuromorphic chips) due to their event-based nature. They also have a low computation cost compared with conventional artificial neural networks (ANNs), while preserving the properties of ANNs. However, temporal coding in the layers of convolutional spiking neural networks and other types of SNNs has yet to be studied. In this paper, we provide insight into the spatio-temporal feature extraction of convolutional SNNs through experiments designed to exploit this property. The shallow convolutional SNN outperforms state-of-the-art spatio-temporal feature extractors such as C3D, ConvLSTM, and similar networks. Furthermore, we present a new deep spiking architecture to tackle real-world problems (in particular, classification tasks), which achieves superior performance compared to other SNN methods on the NMNIST (99.6%), DVS-CIFAR10 (69.2%), and DVS-Gesture (96.7%) datasets, and compared to ANN methods on the UCF-101 (42.1%) and HMDB-51 (21.5%) datasets. It is also worth noting that the training process is implemented using a variation of spatio-temporal backpropagation, which is explained in the paper.