AIOct 28, 2025
Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural NetworksKorneel Van den Berghe, Stein Stroobants, Vijay Janapa Reddi et al.
Neuromorphic computing systems are set to revolutionize energy-constrained robotics by achieving orders-of-magnitude efficiency gains, while enabling native temporal processing. Spiking Neural Networks (SNNs) represent a promising algorithmic approach for these systems, yet their application to complex control tasks faces two critical challenges: (1) the non-differentiable nature of spiking neurons necessitates surrogate gradients with unclear optimization properties, and (2) the stateful dynamics of SNNs require training on sequences, which in reinforcement learning (RL) is hindered by limited sequence lengths during early training, preventing the network from bridging its warm-up period. We address these challenges by systematically analyzing surrogate gradient slope settings, showing that shallower slopes increase gradient magnitude in deeper layers but reduce alignment with true gradients. In supervised learning, we find no clear preference for fixed or scheduled slopes. The effect is much more pronounced in RL settings, where shallower slopes or scheduled slopes lead to a 2.1x improvement in both training and final deployed performance. Next, we propose a novel training approach that leverages a privileged guiding policy to bootstrap the learning process, while still exploiting online environment interactions with the spiking policy. Combining our method with an adaptive slope schedule for a real-world drone position control task, we achieve an average return of 400 points, substantially outperforming prior techniques, including Behavioral Cloning and TD3BC, which achieve at most --200 points under the same conditions. This work advances both the theoretical understanding of surrogate gradient learning in SNNs and practical training methodologies for neuromorphic controllers demonstrated in real-world robotic systems.
NEFeb 24, 2022
Evolving-to-Learn Reinforcement Learning Tasks with Spiking Neural NetworksJ. Lu, J. J. Hagenaars, G. C. H. E. de Croon
Inspired by the natural nervous system, synaptic plasticity rules are applied to train spiking neural networks with local information, making them suitable for online learning on neuromorphic hardware. However, when such rules are implemented to learn different new tasks, they usually require a significant amount of work on task-dependent fine-tuning. This paper aims to make this process easier by employing an evolutionary algorithm that evolves suitable synaptic plasticity rules for the task at hand. More specifically, we provide a set of various local signals, a set of mathematical operators, and a global reward signal, after which a Cartesian genetic programming process finds an optimal learning rule from these components. Using this approach, we find learning rules that successfully solve an XOR and cart-pole task, and discover new learning rules that outperform the baseline rules from literature.
CVSep 17, 2020
Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric ConstancyF. Paredes-Vallés, G. C. H. E. de Croon
Event cameras are novel vision sensors that sample, in an asynchronous fashion, brightness increments with low latency and high temporal resolution. The resulting streams of events are of high value by themselves, especially for high speed motion estimation. However, a growing body of work has also focused on the reconstruction of intensity frames from the events, as this allows bridging the gap with the existing literature on appearance- and frame-based computer vision. Recent work has mostly approached this problem using neural networks trained with synthetic, ground-truth data. In this work we approach, for the first time, the intensity reconstruction problem from a self-supervised learning perspective. Our method, which leverages the knowledge of the inner workings of event cameras, combines estimated optical flow and the event-based photometric constancy to train neural networks without the need for any ground-truth or synthetic data. Results across multiple datasets show that the performance of the proposed self-supervised approach is in line with the state-of-the-art. Additionally, we propose a novel, lightweight neural network for optical flow estimation that achieves high speed inference with only a minor drop in performance.
CVApr 20, 2020
How Do Neural Networks Estimate Optical Flow? A Neuropsychology-Inspired StudyD. B. de Jong, F. Paredes-Vallés, G. C. H. E. de Croon
End-to-end trained convolutional neural networks have led to a breakthrough in optical flow estimation. The most recent advances focus on improving the optical flow estimation by improving the architecture and setting a new benchmark on the publicly available MPI-Sintel dataset. Instead, in this article, we investigate how deep neural networks estimate optical flow. A better understanding of how these networks function is important for (i) assessing their generalization capabilities to unseen inputs, and (ii) suggesting changes to improve their performance. For our investigation, we focus on FlowNetS, as it is the prototype of an encoder-decoder neural network for optical flow estimation. Furthermore, we use a filter identification method that has played a major role in uncovering the motion filters present in animal brains in neuropsychological research. The method shows that the filters in the deepest layer of FlowNetS are sensitive to a variety of motion patterns. Not only do we find translation filters, as demonstrated in animal brains, but thanks to the easier measurements in artificial neural networks, we even unveil dilation, rotation, and occlusion filters. Furthermore, we find similarities in the refinement part of the network and the perceptual filling-in process which occurs in the mammal primary visual cortex.
ROMar 6, 2020
Evolved Neuromorphic Control for High Speed Divergence-based Landings of MAVsJ. J. Hagenaars, F. Paredes-Vallés, S. M. Bohté et al.
Flying insects are capable of vision-based navigation in cluttered environments, reliably avoiding obstacles through fast and agile maneuvers, while being very efficient in the processing of visual stimuli. Meanwhile, autonomous micro air vehicles still lag far behind their biological counterparts, displaying inferior performance at a much higher energy consumption. In light of this, we want to mimic flying insects in terms of their processing capabilities, and consequently show the efficiency of this approach in the real world. This letter does so through evolving spiking neural networks for controlling landings of micro air vehicles using optical flow divergence from a downward-looking camera. We demonstrate that the resulting neuromorphic controllers transfer robustly from a highly abstracted simulation to the real world, performing fast and safe landings while keeping network spike rate minimal. Furthermore, we provide insight into the resources required for successfully solving the problem of divergence-based landing, showing that high-resolution control can be learned with only a single spiking neuron. To the best of our knowledge, this work is the first to integrate spiking neural networks in the control loop of a real-world flying robot. Videos of the experiments can be found at https://bit.ly/neuro-controller .
ROSep 16, 2018
Autonomous drone race: A computationally efficient vision-based navigation and control strategyS. Li, M. M. O. I. Ozo, C. De Wagter et al.
Drone racing is becoming a popular sport where human pilots have to control their drones to fly at high speed through complex environments and pass a number of gates in a pre-defined sequence. In this paper, we develop an autonomous system for drones to race fully autonomously using only onboard resources. Instead of commonly used visual navigation methods, such as simultaneous localization and mapping and visual inertial odometry, which are computationally expensive for micro aerial vehicles (MAVs), we developed the highly efficient snake gate detection algorithm for visual navigation, which can detect the gate at 20HZ on a Parrot Bebop drone. Then, with the gate detection result, we developed a robust pose estimation algorithm which has better tolerance to detection noise than a state-of-the-art perspective-n-point method. During the race, sometimes the gates are not in the drone's field of view. For this case, a state prediction-based feed-forward control strategy is developed to steer the drone to fly to the next gate. Experiments show that the drone can fly a half-circle with 1.5m radius within 2 seconds with only 30cm error at the end of the circle without any position feedback. Finally, the whole system is tested in a complex environment (a showroom in the faculty of Aerospace Engineering, TU Delft). The result shows that the drone can complete the track of 15 gates with a speed of 1.5m/s which is faster than the speeds exhibited at the 2016 and 2017 IROS autonomous drone races.
ROFeb 2, 2018
Incremental Control and Guidance of Hybrid Aircraft Applied to a Tailsitter UAVE. J. J. Smeur, M. Bronz, G. C. H. E. de Croon
Hybrid unmanned aircraft can significantly increase the potential of micro air vehicles, because they combine hovering capability with a wing for fast and efficient forward flight. However, these vehicles are very difficult to control, because their aerodynamics are hard to model and they are susceptible to wind gusts. This often leads to composite and complex controllers, with different modes for hover, transition and forward flight. In this paper, we propose incremental nonlinear dynamic inversion control for the attitude and position control. The result is a single, continuous controller, that is able to track the desired acceleration of the vehicle across the flight envelope. The proposed controller is implemented on the Cyclone hybrid UAV. Multiple outdoor experiments are performed, showing that unmodeled forces and moments are effectively compensated by the incremental control structure. Finally, we provide a comprehensive procedure for the implementation of the controller on other types of hybrid UAVs.
ROSep 23, 2017
Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?G. C. H. E. de Croon
Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, it is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate if fusion in SSL is also beneficial when assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results are encouraging to study SSL fusion also for other robots and sensors.
ROOct 23, 2016
Efficient Global Indoor Localization for Micro Aerial VehiclesV. Strobel, R. Meertens, G. C. H. E. de Croon
Indoor localization for autonomous micro aerial vehicles (MAVs) requires specific localization techniques, since the Global Positioning System (GPS) is usually not available. We present an efficient onboard computer vision approach that estimates 2D positions of an MAV in real-time. This global localization system does not suffer from error accumulation over time and uses a $k$-Nearest Neighbors ($k$-NN) algorithm to predict positions based on textons---small characteristic image patches that capture the texture of an environment. A particle filter aggregates the estimates and resolves positional ambiguities. To predict the performance of the approach in a given setting, we developed an evaluation technique that compares environments and identifies critical areas within them. We conducted flight tests to demonstrate the applicability of our approach. The algorithm has a localization accuracy of approximately 0.6 m on a 5 m$\times$5 m area at a runtime of 32 ms on board of an MAV. Based on random sampling, its computational effort is scalable to different platforms, trading off speed and accuracy.
ROSep 21, 2016
Adaptive Control Strategy for Constant Optical Flow Divergence LandingH. W. Ho, G. C. H. E. de Croon, E. van Kampen et al.
Bio-inspired methods can provide efficient solutions to perform autonomous landing for Micro Air Vehicles (MAVs). Flying insects such as honeybees perform vertical landings by keeping flow divergence constant. This leads to an exponential decay of both height and vertical velocity, and allows for smooth and safe landings. However, the presence of noise and delay in obtaining flow divergence estimates will cause instability of the landing when the control gains are not adapted to the height. In this paper, we propose a strategy that deals with this fundamental problem of optical flow control. The key to the strategy lies in the use of a recent theory that allows the MAV to see distance by means of its control instability. At the start of a landing, the MAV detects the height by means of an oscillating movement and sets the control gains accordingly. Then, during descent, the gains are reduced exponentially, with mechanisms in place to reduce or increase the gains if the actual trajectory deviates too much from an ideal constant divergence landing. Real-world experiments demonstrate stable landings of the MAV in both indoor and windy outdoor environments.
ROSep 4, 2015
Optical-Flow based Self-Supervised Learning of Obstacle Appearance applied to MAV LandingH. W. Ho, C. De Wagter, B. D. W. Remes et al.
Monocular optical flow has been widely used to detect obstacles in Micro Air Vehicles (MAVs) during visual navigation. However, this approach requires significant movement, which reduces the efficiency of navigation and may even introduce risks in narrow spaces. In this paper, we introduce a novel setup of self-supervised learning (SSL), in which optical flow cues serve as a scaffold to learn the visual appearance of obstacles in the environment. We apply it to a landing task, in which initially 'surface roughness' is estimated from the optical flow field in order to detect obstacles. Subsequently, a linear regression function is learned that maps appearance features represented by texton distributions to the roughness estimate. After learning, the MAV can detect obstacles by just analyzing a still image. This allows the MAV to search for a landing spot without moving. We first demonstrate this principle to work with offline tests involving images captured from an on-board camera, and then demonstrate the principle in flight. Although surface roughness is a property of the entire flow field in the global image, the appearance learning even allows for the pixel-wise segmentation of obstacles.
ROJun 3, 2015
Distance estimation with efference copies and optical flow maneuvers: a stability-based strategyG. C. H. E. de Croon
The visual cue of optical flow plays a major role in the navigation of flying insects, and is increasingly studied for use by small flying robots as well. A major problem is that successful optical flow control seems to require distance estimates, while optical flow is known to provide only the ratio of velocity to distance. In this article, a novel, stability-based strategy is proposed to estimate distances with monocular optical flow and knowledge of the control inputs (efference copies). It is shown analytically that given a fixed control gain, the stability of a constant divergence control loop only depends on the distance to the approached surface. At close distances, the control loop first starts to exhibit self-induced oscillations, eventually leading to instability. The proposed stability-based strategy for estimating distances has two major attractive characteristics. First, self-induced oscillations are easy for the robot to detect and are hardly influenced by wind. Second, the distance can be estimated during a zero divergence maneuver, i.e., around hover. The stability-based strategy is implemented and tested both in simulation and with a Parrot AR drone 2.0. It is shown that it can be used to: (1) trigger a final approach response during a constant divergence landing with fixed gain, (2) estimate the distance in hover, and (3) estimate distances during an entire landing if the robot uses adaptive gain control to continuously stay on the 'edge of oscillation'.