CV, Mar 22, 2023
Multiscale Attention via Wavelet Neural Operators for Vision Transformers
Anahita Nekoozadeh, Mohammad Reza Ahmadzadeh, Zahra Mardani
Transformers have achieved widespread success in computer vision. At their heart is the Self-Attention (SA) mechanism, an inductive bias that associates each token in the input with every other token through a weighted basis. The standard SA mechanism has quadratic complexity in the sequence length, which limits its usefulness for the long sequences that arise in high-resolution vision. Recently, inspired by operator learning for PDEs, Adaptive Fourier Neural Operators (AFNO) were introduced for high-resolution attention based on global convolution that is efficiently implemented via the FFT. However, AFNO's global filtering does not represent well the small- and moderate-scale structures that commonly appear in natural images. To leverage these coarse-to-fine scale structures, we introduce Multiscale Wavelet Attention (MWA), which builds on wavelet neural operators and incurs linear complexity in the sequence size. We replace the attention in ViT with MWA, and our experiments on CIFAR and Tiny-ImageNet classification demonstrate significant improvement over alternative Fourier-based attentions such as AFNO and the Global Filter Network (GFN).
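To make the coarse-to-fine idea concrete, the following is a minimal sketch of multiscale mixing with a Haar wavelet transform. It is not the MWA layer from the paper: the Haar basis, the function names (`haar_dwt2`, `haar_idwt2`, `multiscale_mix`), and the scalar per-level gains that stand in for learned weights are all assumptions made for illustration. Each level processes a quarter of the previous level's coefficients, so the total work stays linear in the number of pixels.

```python
import numpy as np

def haar_dwt2(x):
    """One-level orthonormal 2D Haar transform of an (H, W) array (H, W even)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # coarse approximation
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the (2h, 2w) array from four subbands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def multiscale_mix(x, detail_gains, coarse_gain=1.0):
    """Scale the detail subbands at each level, then reconstruct.

    detail_gains holds one hypothetical scalar per decomposition level
    (finest first); a learned operator would replace these scalars.
    """
    if not detail_gains:
        return coarse_gain * x
    ll, lh, hl, hh = haar_dwt2(x)
    g = detail_gains[0]
    ll = multiscale_mix(ll, detail_gains[1:], coarse_gain)
    return haar_idwt2(ll, g * lh, g * hl, g * hh)

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8))
# Unit gains leave the input unchanged; zeroing the finest details smooths it.
identity = multiscale_mix(feature_map, [1.0, 1.0])
smoothed = multiscale_mix(feature_map, [0.0])
```

With all gains set to 1 the transform is a perfect reconstruction, and setting the finest-level gain to 0 replaces each 2x2 block with its mean, which shows how individual scales can be adjusted independently.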
CV, Jan 2, 2023
An Event-based Algorithm for Simultaneous 6-DOF Camera Pose Tracking and Mapping
Masoud Dayani Najafabadi, Mohammad Reza Ahmadzadeh
Compared to regular cameras, Dynamic Vision Sensors, or Event Cameras, asynchronously output compact visual data based on intensity changes at each pixel location. In this paper, we study the application of current image-based SLAM techniques to these novel sensors. To this end, the information in adaptively selected event windows is processed to form motion-compensated images. These images are then used to reconstruct the scene and estimate the 6-DOF pose of the camera. We also propose an inertial version of the event-only pipeline to assess its capabilities. We compare the results of different configurations of the proposed algorithm against the ground truth for sequences from two publicly available event datasets. We also compare the results of the proposed event-inertial pipeline with the state of the art and show that it can produce comparable or more accurate results, provided the map estimate is reliable.
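As a rough illustration of what forming a motion-compensated image from an event window can look like, the sketch below warps each event to a common reference time and accumulates polarities per pixel. The abstract does not specify the warp used in the paper; the constant image-plane velocity `flow`, the event layout `(x, y, t, polarity)`, and the function name `motion_compensated_image` are assumptions made here for illustration only.

```python
import numpy as np

def motion_compensated_image(events, flow, t_ref, shape):
    """Accumulate events into an image after warping them to time t_ref.

    events: (N, 4) float array with columns x, y, t, polarity (+1 or -1).
    flow:   assumed constant image velocity (vx, vy) in pixels per second.
    shape:  (height, width) of the output image.
    """
    x, y, t, p = events.T
    dt = t_ref - t
    # Move each event along the assumed flow to where it would be at t_ref.
    xw = np.rint(x + flow[0] * dt).astype(int)
    yw = np.rint(y + flow[1] * dt).astype(int)
    h, w = shape
    keep = (xw >= 0) & (xw < w) & (yw >= 0) & (yw < h)
    img = np.zeros(shape, dtype=float)
    # np.add.at handles repeated indices, so coincident events add up.
    np.add.at(img, (yw[keep], xw[keep]), p[keep])
    return img

# A point moving at 10 px/s along x produces three events in 0.2 s.
events = np.array([
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.1, 1.0],
    [2.0, 1.0, 0.2, 1.0],
])
sharp = motion_compensated_image(events, (10.0, 0.0), 0.2, (4, 4))
blurred = motion_compensated_image(events, (0.0, 0.0), 0.2, (4, 4))
```

With the correct velocity, all three events land on the same pixel, whereas without compensation they smear across three pixels. This difference in sharpness is what makes compensated images usable by image-based pipelines.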
RO, Aug 23, 2018
Qualitative vision-based navigation based on sloped funnel lane concept
Mohamad Mahdi Kassir, Maziar Palhang, Mohammad Reza Ahmadzadeh
The funnel lane concept is a qualitative visual navigation method that helps robots navigate autonomously using a recorded video. A visual path is built by extracting keyframes from the video, and the robot uses this visual path for navigation. Unlike some other methods, funnel lane does not rely on traditional calculations of Jacobians, homographies, fundamental matrices, or the focus of expansion, and it does not require any camera calibration. However, funnel lane has some shortcomings. One problem is that it gives no information about the radius of rotation, so the robot takes every turn along the path with a constant radius of rotation. This reduces maneuverability and prevents the robot from handling all turning conditions. It also makes it hard for the robot to correct its path when it deviates from the desired one. Another flaw is that in some situations the robot cannot tell whether the visual path calls for a translation or a rotation, which causes it to deviate and fail to follow the desired path. This paper introduces the sloped funnel lane technique, which does not have these shortcomings. The roll and pitch angles are added to the funnel lane, which helps the robot set its radius of rotation according to the turning conditions it faces. They also help reduce the ambiguity between translation and rotation. As a result, the robot can deal with different turning conditions, and the navigation method becomes more robust and accurate. Experimental results on challenging scenarios with a real ground robot demonstrate the effectiveness of the sloped funnel lane technique.