Yahya Zweiri

CV
h-index: 18
16 papers
355 citations
Novelty: 50%
AI Score: 37

16 Papers

CV · Mar 20, 2023 · Code
Bimodal SegNet: Instance Segmentation Fusing Events and RGB Frames for Robotic Grasping

Sanket Kachole, Xiaoqian Huang, Fariborz Baghaei Naeini et al.

Object segmentation for robotic grasping under dynamic conditions often faces challenges such as occlusion, low light conditions, motion blur and object size variance. To address these challenges, we propose a Deep Learning network that fuses two types of visual signals, event-based data and RGB frame data. The proposed Bimodal SegNet network has two distinct encoders, one for each signal input and a spatial pyramidal pooling with atrous convolutions. Encoders capture rich contextual information by pooling the concatenated features at different resolutions while the decoder obtains sharp object boundaries. The proposed method is evaluated on five image degradation challenges (occlusion, blur, brightness, trajectory and scale variance) using the Event-based Segmentation (ESD) Dataset. The evaluation results show a 6-10% segmentation accuracy improvement over state-of-the-art methods in terms of mean intersection over union and pixel accuracy. The model code is available at https://github.com/sanket0707/Bimodal-SegNet.git
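
The two metrics named in this abstract can be illustrated with a minimal sketch. It assumes label masks stored as nested lists of integer class ids; the function names and the toy masks are hypothetical, and the paper's exact evaluation protocol is not given here.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of integer class labels (assumed layout)


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    total = 0
    correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def mean_iou(pred: Mask, truth: Mask, num_classes: int) -> float:
    """Mean of per-class intersection over union, skipping classes absent from both masks."""
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                inter += int(p == c and t == c)
                union += int(p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2x3 masks with two classes (0 = background, 1 = object).
pred = [[0, 0, 1],
        [0, 1, 1]]
truth = [[0, 0, 1],
         [0, 0, 1]]
print(pixel_accuracy(pred, truth), mean_iou(pred, truth, num_classes=2))
```

Mean IoU penalizes the single mislabeled pixel in both classes, which is why it is usually lower than pixel accuracy on the same prediction.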

CV · Jun 15, 2023
E-Calib: A Fast, Robust and Accurate Calibration Toolbox for Event Cameras

Mohammed Salah, Abdulla Ayyad, Muhammad Humais et al.

Event cameras triggered a paradigm shift in the computer vision community delineated by their asynchronous nature, low latency, and high dynamic range. Calibration of event cameras is always essential to account for the sensor intrinsic parameters and for 3D perception. However, conventional image-based calibration techniques are not applicable due to the asynchronous, binary output of the sensor. The current standard for calibrating event cameras relies on either blinking patterns or event-based image reconstruction algorithms. These approaches are difficult to deploy in factory settings and are affected by noise and artifacts degrading the calibration performance. To bridge these limitations, we present E-Calib, a novel, fast, robust, and accurate calibration toolbox for event cameras utilizing the asymmetric circle grid, for its robustness to out-of-focus scenes. The proposed method is tested in a variety of rigorous experiments for different event camera models, on circle grids with different geometric properties, and under challenging illumination conditions. The results show that our approach outperforms the state-of-the-art in detection success rate, reprojection error, and estimation accuracy of extrinsic parameters.

CV · Jun 23, 2022
A Neuromorphic Vision-Based Measurement for Robust Relative Localization in Future Space Exploration Missions

Mohammed Salah, Mohammed Chehadah, Muhammed Humais et al.

Space exploration has witnessed revolutionary changes upon landing of the Perseverance Rover on the Martian surface and demonstrating the first flight beyond Earth by the Mars helicopter, Ingenuity. During their mission on Mars, Perseverance Rover and Ingenuity collaboratively explore the Martian surface, where Ingenuity scouts terrain information for rover's safe traversability. Hence, determining the relative poses between both the platforms is of paramount importance for the success of this mission. Driven by this necessity, this work proposes a robust relative localization system based on a fusion of neuromorphic vision-based measurements (NVBMs) and inertial measurements. The emergence of neuromorphic vision triggered a paradigm shift in the computer vision community, due to its unique working principle delineated with asynchronous events triggered by variations of light intensities occurring in the scene. This implies that observations cannot be acquired in static scenes due to illumination invariance. To circumvent this limitation, high frequency active landmarks are inserted in the scene to guarantee consistent event firing. These landmarks are adopted as salient features to facilitate relative localization. A novel event-based landmark identification algorithm using Gaussian Mixture Models (GMM) is developed for matching the landmarks correspondences formulating our NVBMs. The NVBMs are fused with inertial measurements in proposed state estimators, landmark tracking Kalman filter (LTKF) and translation decoupled Kalman filter (TDKF) for landmark tracking and relative localization, respectively. The proposed system was tested in a variety of experiments and has outperformed state-of-the-art approaches in accuracy and range.

CV · Feb 13, 2023
A Neuromorphic Dataset for Object Segmentation in Indoor Cluttered Environment

Xiaoqian Huang, Kachole Sanket, Abdulla Ayyad et al.

Taking advantage of an event-based camera, the issues of motion blur, low dynamic range and low time sampling of standard cameras can all be addressed. However, there is a lack of event-based datasets dedicated to the benchmarking of segmentation algorithms, especially those that provide depth information which is critical for segmentation in occluded scenes. This paper proposes a new Event-based Segmentation Dataset (ESD), a high-quality 3D spatial and temporal dataset for object segmentation in an indoor cluttered environment. Our proposed dataset ESD comprises 145 sequences with 14,166 RGB frames that are manually annotated with instance masks. Overall 21.88 million and 20.80 million events from two event-based cameras in a stereo-graphic configuration are collected, respectively. To the best of our knowledge, this densely annotated and 3D spatial-temporal event-based segmentation benchmark of tabletop objects is the first of its kind. By releasing ESD, we expect to provide the community with a challenging segmentation benchmark with high quality.

NE · Nov 20, 2023
Asynchronous Bioplausible Neuron for SNN for Event Vision

Sanket Kachole, Hussain Sajwani, Fariborz Baghaei Naeini et al.

Spiking Neural Networks (SNNs) offer a biologically inspired approach to computer vision that can lead to more efficient processing of visual data with reduced energy consumption. However, maintaining homeostasis within these networks is challenging, as it requires continuous adjustment of neural responses to preserve equilibrium and optimal processing efficiency amidst diverse and often unpredictable input signals. In response to these challenges, we propose the Asynchronous Bioplausible Neuron (ABN), a dynamic spike firing mechanism to auto-adjust the variations in the input signal. Comprehensive evaluation across various datasets demonstrates ABN's enhanced performance in image classification and segmentation, maintenance of neural equilibrium, and energy efficiency.
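
The abstract does not give the ABN equations, so the following is not the authors' neuron. As a hedged illustration of what a firing mechanism that auto-adjusts to its input can look like, here is a generic leaky integrate-and-fire neuron with an adaptive threshold; every parameter name and value is an assumption chosen for the example.

```python
from typing import List, Sequence


def run_adaptive_lif(
    inputs: Sequence[float],
    leak: float = 0.9,             # membrane decay per step (assumed value)
    base_threshold: float = 1.0,   # resting firing threshold (assumed value)
    threshold_jump: float = 0.5,   # threshold increase after each spike (assumed value)
    threshold_decay: float = 0.8,  # decay of the adaptation term per step (assumed value)
) -> List[int]:
    """Simulate one neuron and return its binary spike train."""
    v = 0.0      # membrane potential
    adapt = 0.0  # adaptive part of the threshold
    spikes = []
    for x in inputs:
        v = leak * v + x
        if v >= base_threshold + adapt:
            spikes.append(1)
            v = 0.0  # reset after firing
            adapt = adapt * threshold_decay + threshold_jump
        else:
            spikes.append(0)
            adapt *= threshold_decay
    return spikes


strong_input = [1.2] * 20
print(sum(run_adaptive_lif(strong_input, threshold_jump=0.0)))  # fixed threshold
print(sum(run_adaptive_lif(strong_input)))                      # adaptive threshold
```

With a fixed threshold the neuron fires on every step of a strong constant input; with adaptation the threshold rises after each spike, so the firing rate settles at a lower level.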

CV · Jun 9, 2025 · Code
Spatio-Temporal State Space Model For Efficient Event-Based Optical Flow

Muhammad Ahmed Humais, Xiaoqian Huang, Hussain Sajwani et al.

Event cameras unlock new frontiers that were previously unthinkable with standard frame-based cameras. One notable example is low-latency motion estimation (optical flow), which is critical for many real-time applications. In such applications, the computational efficiency of algorithms is paramount. Although recent deep learning paradigms such as CNN, RNN, or ViT have shown remarkable performance, they often lack the desired computational efficiency. Conversely, asynchronous event-based methods including SNNs and GNNs are computationally efficient; however, these approaches fail to capture sufficient spatio-temporal information, a powerful feature required to achieve better performance for optical flow estimation. In this work, we introduce Spatio-Temporal State Space Model (STSSM) module along with a novel network architecture to develop an extremely efficient solution with competitive performance. Our STSSM module leverages state-space models to effectively capture spatio-temporal correlations in event data, offering higher performance with lower complexity compared to ViT, CNN-based architectures in similar settings. Our model achieves 4.5x faster inference and 8x lower computations compared to TMA and 2x lower computations compared to EV-FlowNet with competitive performance on the DSEC benchmark. Our code will be available at https://github.com/AhmedHumais/E-STMFlow

CV · May 5, 2023 · Code
Asynchronous Events-based Panoptic Segmentation using Graph Mixer Neural Network

Sanket Kachole, Yusra Alkendi, Fariborz Baghaei Naeini et al.

In the context of robotic grasping, object segmentation encounters several difficulties when faced with dynamic conditions such as real-time operation, occlusion, low lighting, motion blur, and object size variability. In response to these challenges, we propose the Graph Mixer Neural Network that includes a novel collaborative contextual mixing layer, applied to 3D event graphs formed on asynchronous events. The proposed layer is designed to spread spatiotemporal correlation within an event graph at four nearest neighbor levels parallelly. We evaluate the effectiveness of our proposed method on the Event-based Segmentation (ESD) Dataset, which includes five unique image degradation challenges, including occlusion, blur, brightness, trajectory, scale variance, and segmentation of known and unknown objects. The results show that our proposed approach outperforms state-of-the-art methods in terms of mean intersection over the union and pixel accuracy. Code available at: https://github.com/sanket0707/GNN-Mixer.git
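
A 3D event graph of the kind mentioned above can be sketched as a brute-force k-nearest-neighbor graph over (x, y, t) points. This is an illustration only: the function name, the `time_scale` factor and the choice of k are assumptions, and the paper's actual graph construction is not described in this abstract.

```python
import math
from typing import List, Sequence, Tuple

Event = Tuple[float, float, float]  # (x, y, t); polarity omitted for brevity


def knn_event_graph(events: Sequence[Event], k: int, time_scale: float = 1.0) -> List[List[int]]:
    """Return, for each event, the indices of its k nearest events in (x, y, scaled t) space."""
    points = [(x, y, t * time_scale) for x, y, t in events]
    graph = []
    for i, p in enumerate(points):
        # Brute-force distances to all other events; fine for a small sketch.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append([j for _, j in dists[:k]])
    return graph


events = [(0, 0, 0.0), (1, 0, 0.0), (0, 1, 0.0), (10, 10, 5.0)]
for node, neighbors in enumerate(knn_event_graph(events, k=2)):
    print(node, neighbors)
```

The `time_scale` factor decides how strongly temporal distance counts against spatial distance; a real pipeline would also replace the quadratic search with a spatial index.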

CV · Apr 16, 2024
Neuromorphic Vision-based Motion Segmentation with Graph Transformer Neural Network

Yusra Alkendi, Rana Azzam, Sajid Javed et al.

Moving object segmentation is critical to interpret scene dynamics for robotic navigation systems in challenging environments. Neuromorphic vision sensors are tailored for motion perception due to their asynchronous nature, high temporal resolution, and reduced power consumption. However, their unconventional output requires novel perception paradigms to leverage their spatially sparse and temporally dense nature. In this work, we propose a novel event-based motion segmentation algorithm using a Graph Transformer Neural Network, dubbed GTNN. Our proposed algorithm processes event streams as 3D graphs by a series of nonlinear transformations to unveil local and global spatiotemporal correlations between events. Based on these correlations, events belonging to moving objects are segmented from the background without prior knowledge of the dynamic scene geometry. The algorithm is trained on publicly available datasets including MOD, EV-IMO, and EV-IMO2 using the proposed training scheme to facilitate efficient training on extensive datasets. Moreover, we introduce the Dynamic Object Mask-aware Event Labeling (DOMEL) approach for generating approximate ground-truth labels for event-based motion segmentation datasets. We use DOMEL to label our own recorded Event dataset for Motion Segmentation (EMS-DOMEL), which we release to the public for further research and benchmarking. Rigorous experiments are conducted on several unseen publicly-available datasets where the results revealed that GTNN outperforms state-of-the-art methods in the presence of dynamic background variations, motion patterns, and multiple dynamic objects with varying sizes and velocities. GTNN achieves significant performance gains with an average increase of 9.4% and 4.5% in terms of motion segmentation accuracy (IoU%) and detection rate (DR%), respectively.

RO · Jan 5, 2022
Neuromorphic Vision Based Control for the Precise Positioning of Robotic Drilling Systems

Abdulla Ayyad, Mohamad Halwani, Dewald Swart et al.

The manufacturing industry is currently witnessing a paradigm shift with the unprecedented adoption of industrial robots, and machine vision is a key perception technology that enables these robots to perform precise operations in unstructured environments. However, the sensitivity of conventional vision sensors to lighting conditions and high-speed motion sets a limitation on the reliability and work-rate of production lines. Neuromorphic vision is a recent technology with the potential to address the challenges of conventional vision with its high temporal resolution, low latency, and wide dynamic range. In this paper and for the first time, we propose a novel neuromorphic vision based controller for faster and more reliable machining operations, and present a complete robotic system capable of performing drilling tasks with sub-millimeter accuracy. Our proposed system localizes the target workpiece in 3D using two perception stages that we developed specifically for the asynchronous output of neuromorphic cameras. The first stage performs multi-view reconstruction for an initial estimate of the workpiece's pose, and the second stage refines this estimate for a local region of the workpiece using circular hole detection. The robot then precisely positions the drilling end-effector and drills the target holes on the workpiece using a combined position-based and image-based visual servoing approach. The proposed solution is validated experimentally for drilling nutplate holes on workpieces placed arbitrarily in an unstructured environment with uncontrolled lighting. Experimental results prove the effectiveness of our solution with an average positional error of less than 0.1 mm, and demonstrate that the use of neuromorphic vision overcomes the lighting and speed limitations of conventional cameras.

CV · Dec 17, 2021
Neuromorphic Camera Denoising using Graph Neural Network-driven Transformers

Yusra Alkendi, Rana Azzam, Abdulla Ayyad et al.

Neuromorphic vision is a bio-inspired technology that has triggered a paradigm shift in the computer-vision community and is serving as a key-enabler for a multitude of applications. This technology has offered significant advantages including reduced power consumption, reduced processing needs, and communication speed-ups. However, neuromorphic cameras suffer from significant amounts of measurement noise. This noise deteriorates the performance of neuromorphic event-based perception and navigation algorithms. In this paper, we propose a novel noise filtration algorithm to eliminate events which do not represent real log-intensity variations in the observed scene. We employ a Graph Neural Network (GNN)-driven transformer algorithm, called GNN-Transformer, to classify every active event pixel in the raw stream as either a real log-intensity variation or noise. Within the GNN, a message-passing framework, called EventConv, is carried out to reflect the spatiotemporal correlation among the events, while preserving their asynchronous nature. We also introduce the Known-object Ground-Truth Labeling (KoGTL) approach for generating approximate ground truth labels of event streams under various illumination conditions. KoGTL is used to generate labeled datasets from experiments recorded in challenging lighting conditions. These datasets are used to train and extensively test our proposed algorithm. When tested on unseen datasets, the proposed algorithm outperforms existing methods by 8.8% in terms of filtration accuracy. Additional tests are also conducted on publicly available datasets to demonstrate the generalization capabilities of the proposed algorithm in the presence of illumination variations and different motion dynamics. Compared to existing solutions, qualitative results verified the superior capability of the proposed algorithm to eliminate noise while preserving meaningful scene events.
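
For contrast with the learned classifier described above, the idea of using spatiotemporal correlation to separate real events from noise can be shown with a much simpler hand-written neighbor-support filter. This is explicitly not the GNN-Transformer: it is a classical-style baseline sketch, and the function name, sensor size and time window are assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Event = Tuple[int, int, int]  # (x, y, timestamp in microseconds)


def filter_isolated_events(
    events: Sequence[Event], width: int, height: int, window_us: int
) -> List[Event]:
    """Keep an event only if one of its 8 neighboring pixels fired within window_us before it.

    Events are assumed to be sorted by timestamp.
    """
    last_ts: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    kept = []
    for x, y, t in events:
        supported = False
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue  # the pixel itself does not count as support
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    prev = last_ts[ny][nx]
                    if prev is not None and t - prev <= window_us:
                        supported = True
        if supported:
            kept.append((x, y, t))
        last_ts[y][x] = t  # record the event whether or not it was kept
    return kept


events = [(1, 1, 0), (2, 1, 100), (8, 8, 150), (2, 2, 200000)]
print(filter_isolated_events(events, width=10, height=10, window_us=1000))
```

A fixed rule like this discards the first event of every real edge and cannot adapt to lighting, which is the kind of limitation a learned, per-event classifier is meant to address.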

RO · Jul 15, 2021
Real-Time Grasping Strategies Using Event Camera

Xiaoqian Huang, Mohamad Halwani, Rajkumar Muthusamy et al.

Robotic vision plays a key role for perceiving the environment in grasping applications. However, the conventional frame-based robotic vision, suffering from motion blur and low sampling rate, may not meet the automation needs of evolving industrial requirements. This paper, for the first time, proposes an event-based robotic grasping framework for multiple known and unknown objects in a cluttered scene. Compared with standard frame-based vision, neuromorphic vision has advantages of microsecond-level sampling rate and no motion blur. Building on that, model-based and model-free approaches are developed for grasping known and unknown objects, respectively. For the model-based approach, an event-based multi-view approach is used to localize the objects in the scene, and then point cloud processing allows for the clustering and registering of objects. In contrast, the proposed model-free approach utilizes the developed event-based object segmentation, visual servoing and grasp planning to localize, align to, and grasp the target object. The proposed approaches are experimentally validated with objects of different sizes, using a UR10 robot with an eye-in-hand neuromorphic camera and a Barrett hand gripper. Moreover, the robustness of the two proposed event-based grasping approaches is validated in a low-light environment. This low-light operating ability shows a great advantage over grasping using standard frame-based vision. Furthermore, the developed model-free approach demonstrates the advantage of dealing with unknown objects without prior knowledge compared to the proposed model-based approach.

RO · Jul 4, 2021
Noise Tolerant Identification and Tuning Approach Using Deep Neural Networks For Visual Servoing Applications

Oussama Abdul Hay, Mohamad Chehadeh, Abdulla Ayyad et al.

Vision based control of Unmanned Aerial Vehicles (UAVs) has been adopted by a wide range of applications due to the availability of low-cost on-board sensors and computers. Tuning such systems to work properly requires extensive domain specific experience, which limits the growth of emerging applications. Moreover, obtaining performance limits of UAV based visual servoing is difficult due to the complexity of the models used. In this paper, we propose a novel noise tolerant approach for real-time identification and tuning of visual servoing systems, based on deep neural networks (DNN) classification of system response generated by the modified relay feedback test (MRFT). The proposed method, called DNN with noise protected MRFT (DNN-NP-MRFT), can be used with a multitude of vision sensors and estimation algorithms despite the high levels of sensor's noise. Response of DNN-NP-MRFT to noise perturbations is investigated and its effect on identification and tuning performance is analyzed. The proposed DNN-NP-MRFT is able to detect performance changes due to the use of high latency vision sensors, or due to the integration of inertial measurement unit (IMU) measurements in the UAV states estimation. Experimental identification closely matches simulation results, which can be used to explain system behaviour and predict the closed loop performance limits for a given hardware and software setup. We also demonstrate the ability of DNN-NP-MRFT tuned UAVs to reject external disturbances like wind, or human push and pull. Finally, we discuss the advantages of the proposed DNN-NP-MRFT visual servoing design approach compared with other approaches in literature.

RO · Jun 14, 2021
Dynamic Based Estimator for UAVs with Real-time Identification Using DNN and the Modified Relay Feedback Test

Mohamad Wahbah, Mohamad Chehadeh, Yahya Zweiri

Control performance of Unmanned Aerial Vehicles (UAVs) is directly affected by their ability to estimate their states accurately. With the increasing popularity of autonomous UAV solutions in real world applications, it is imperative to develop robust adaptive estimators that can ameliorate sensor noises in low-cost UAVs. Utilizing the knowledge of UAV dynamics in estimation can provide significant advantages, but remains challenging due to the complex and expensive pre-flight experiments required to obtain UAV dynamic parameters. In this paper, we propose two decoupled dynamic model based Extended Kalman Filters for UAVs, that provide high rate estimates for position, and velocity of rotational and translational states, as well as filtered inertial acceleration. The dynamic model parameters are estimated online using the Deep Neural Network and Modified Relay Feedback Test (DNN-MRFT) framework, without requiring any prior knowledge of the UAV physical parameters. The designed filters with real-time identified process model parameters are tested experimentally and showed two advantages. Firstly, smooth and lag-free estimates of the UAV rotational speed and inertial acceleration are obtained, and used to improve the closed loop system performance, reducing the controller action by over 6%. Secondly, the proposed approach enabled the UAV to track aggressive trajectories with low rate position measurements, a task usually infeasible under those conditions. The experimental data shows that we achieved estimation performance matching other methods that require full knowledge of the UAV parameters.

RO · Jun 7, 2021
Systematic Online Tuning of Multirotor UAVs for Accurate Trajectory Tracking Under Wind Disturbances and In-Flight Dynamics Changes

Abdulaziz Y. Alkayas, Mohamad Chehadeh, Abdulla Ayyad et al.

The demand for accurate and fast trajectory tracking for multirotor Unmanned Aerial Vehicles (UAVs) has grown recently due to advances in UAV avionics technology and application domains. In many applications, the multirotor UAV is required to accurately perform aggressive maneuvers in challenging scenarios like the presence of external wind disturbances or in-flight payload changes. In this paper, we propose a systematic controller tuning approach based on identification results obtained by a recently developed Deep Neural Networks with the Modified Relay Feedback Test (DNN-MRFT) algorithm. We formulate a linear equivalent representation suitable for DNN-MRFT using feedback linearization. This representation enables the analytical investigation of different controller structures and tuning settings, and captures the non-linearity trends of the system. This approach makes the trade-off between performance and robustness explicit at design time, which is convenient for designing controllers for UAVs operating in uncertain environments. We demonstrate that our approach is adaptive and robust through a set of experiments, where accurate trajectory tracking is maintained despite significant changes to the UAV aerodynamic characteristics and the application of wind disturbance. Due to the model-based system design, it was possible to obtain low discrepancy between simulation and experimental results, which is beneficial for potential use of the proposed approach for real-time model-based planning and fault detection tasks. We obtained RMSE of $3.59 \; cm$ when tracking aggressive trajectories in the presence of strong wind, which is on par with state-of-the-art.

RO · Apr 15, 2020
Neuromorphic Eye-in-Hand Visual Servoing

Rajkumar Muthusamy, Abdulla Ayyad, Mohamad Halwani et al.

Robotic vision plays a major role in applications ranging from factory automation to service robots. However, the traditional use of frame-based cameras sets a limitation on continuous visual feedback due to their low sampling rate and redundant data in real-time image processing, especially in the case of high-speed tasks. Event cameras give human-like vision capabilities such as observing the dynamic changes asynchronously at a high temporal resolution ($1μs$) with low latency and wide dynamic range. In this paper, we present a visual servoing method using an event camera and a switching control strategy to explore, reach and grasp to achieve a manipulation task. We devise three surface layers of active events to directly process the stream of events from relative motion. A purely event based approach is adopted to extract corner features, localize them robustly using heat maps and generate virtual features for tracking and alignment. Based on the visual feedback, the motion of the robot is controlled to make the temporal upcoming event features converge to the desired event in spatio-temporal space. The controller switches its strategy based on the sequence of operation to establish a stable grasp. The event based visual servoing (EBVS) method is validated experimentally using a commercial robot manipulator in an eye-in-hand configuration. Experiments prove the effectiveness of the EBVS method to track and grasp objects of different shapes without the need for re-tuning.
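
A surface of active events is commonly implemented as a per-pixel map of the most recent event timestamp. The sketch below shows a single such layer with an exponentially decayed view; the class name, the decay constant and the decayed representation are assumptions, and the three specific layers used in the paper are not described in this abstract.

```python
import math
from typing import List, Optional


class SurfaceOfActiveEvents:
    """Per-pixel map holding the timestamp (microseconds) of the latest event."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.timestamps: List[List[Optional[int]]] = [
            [None] * width for _ in range(height)
        ]

    def update(self, x: int, y: int, t_us: int) -> None:
        """Overwrite the pixel with the newest event time."""
        self.timestamps[y][x] = t_us

    def decayed(self, now_us: int, tau_us: float) -> List[List[float]]:
        """Return values in [0, 1]: 1 for an event at now_us, decaying with age, 0 if never active."""
        return [
            [0.0 if ts is None else math.exp(-(now_us - ts) / tau_us) for ts in row]
            for row in self.timestamps
        ]


sae = SurfaceOfActiveEvents(width=3, height=2)
sae.update(1, 0, 1000)
sae.update(1, 0, 2000)  # a later event at the same pixel replaces the old timestamp
sae.update(0, 1, 1500)
for row in sae.decayed(now_us=2000, tau_us=500.0):
    print(row)
```

Because each event touches only one cell, the surface can be updated asynchronously at event rate, and feature detectors can read local patches of it on demand.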

RO · Apr 15, 2020
Neuromorphic Event-Based Slip Detection and suppression in Robotic Grasping and Manipulation

Rajkumar Muthusamy, Xiaoqian Huang, Yahya Zweiri et al.

Slip detection is essential for robots to achieve robust grasping and fine manipulation. In this paper, a novel dynamic vision-based finger system for slip detection and suppression is proposed. We also present a baseline and feature based approach to detect object slips under illumination and vibration uncertainty. A threshold method is devised to autonomously sample noise in real-time to improve slip detection. Moreover, a fuzzy based suppression strategy using incipient slip feedback is proposed for regulating the grip force. A comprehensive experimental study of our proposed approaches under uncertainty, and of the system for high-performance precision manipulation, is presented. We also propose a slip metric to evaluate such performance quantitatively. Results indicate that the system can effectively detect incipient slip events at a sampling rate of 2kHz ($Δt = 500μs$) and suppress them before a gross slip occurs. The event-based approach holds promise for meeting the high-precision manipulation requirements of industrial manufacturing and household services.
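
As a quick consistency check, the quoted sampling rate and time step agree with each other:

```latex
f = \frac{1}{\Delta t} = \frac{1}{500 \times 10^{-6}\,\mathrm{s}} = 2000\,\mathrm{Hz} = 2\,\mathrm{kHz}
```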