Fernando Cladera Ojeda

4papers

148citations

Novelty51%

AI Score26

Ranked #165,721 of 205,806 authors (top 81%)#50,424 in CV (top 86%)

4 Papers

ROApr 14, 2023

EV-Catcher: High-Speed Object Catching Using Low-latency Event-based Neural Networks

Ziyun Wang, Fernando Cladera Ojeda, Anthony Bisulco et al.

Event-based sensors have recently drawn increasing interest in robotic perception due to their lower latency, higher dynamic range, and lower bandwidth requirements compared to standard CMOS-based imagers. These properties make them ideal tools for real-time perception tasks in highly dynamic environments. In this work, we demonstrate an application where event cameras excel: accurately estimating the impact location of fast-moving objects. We introduce a lightweight event representation called Binary Event History Image (BEHI) to encode event data at low latency, as well as a learning-based approach that allows real-time inference of a confidence-enabled control signal to the robot. To validate our approach, we present an experimental catching system in which we catch fast-flying ping-pong balls. We show that the system is capable of achieving a success rate of 81% in catching balls targeted at different locations, with a velocity of up to 13 m/s even on compute-constrained embedded platforms such as the Nvidia Jetson NX.

ROSep 14, 2021

Large-scale Autonomous Flight with Real-time Semantic SLAM under Dense Forest Canopy

Xu Liu, Guilherme V. Nardari, Fernando Cladera Ojeda et al.

Semantic maps represent the environment using a set of semantically meaningful objects. This representation is storage-efficient, less ambiguous, and more informative, thus facilitating large-scale autonomy and the acquisition of actionable information in highly unstructured, GPS-denied environments. In this letter, we propose an integrated system that can perform large-scale autonomous flights and real-time semantic mapping in challenging under-canopy environments. We detect and model tree trunks and ground planes from LiDAR data, which are associated across scans and used to constrain robot poses as well as tree trunk models. The autonomous navigation module utilizes a multi-level planning and mapping framework and computes dynamically feasible trajectories that lead the UAV to build a semantic map of the user-defined region of interest in a computationally and storage efficient manner. A drift-compensation mechanism is designed to minimize the odometry drift using semantic SLAM outputs in real time, while maintaining planner optimality and controller stability. This leads the UAV to execute its mission accurately and safely at scale.

CVNov 18, 2020

Fast Motion Understanding with Spatiotemporal Neural Networks and Dynamic Vision Sensors

Anthony Bisulco, Fernando Cladera Ojeda, Volkan Isler et al.

This paper presents a Dynamic Vision Sensor (DVS) based system for reasoning about high speed motion. As a representative scenario, we consider the case of a robot at rest reacting to a small, fast approaching object at speeds higher than 15m/s. Since conventional image sensors at typical frame rates observe such an object for only a few frames, estimating the underlying motion presents a considerable challenge for standard computer vision systems and algorithms. In this paper we present a method motivated by how animals such as insects solve this problem with their relatively simple vision systems. Our solution takes the event stream from a DVS and first encodes the temporal events with a set of causal exponential filters across multiple time scales. We couple these filters with a Convolutional Neural Network (CNN) to efficiently extract relevant spatiotemporal features. The combined network learns to output both the expected time to collision of the object, as well as the predicted collision point on a discretized polar grid. These critical estimates are computed with minimal delay by the network in order to react appropriately to the incoming object. We highlight the results of our system to a toy dart moving at 23.4m/s with a 24.73° error in $θ$, 18.4mm average discretized radius prediction error, and 25.03% median time to collision prediction error.

CVApr 3, 2020

Near-chip Dynamic Vision Filtering for Low-Bandwidth Pedestrian Detection

Anthony Bisulco, Fernando Cladera Ojeda, Volkan Isler et al.

This paper presents a novel end-to-end system for pedestrian detection using Dynamic Vision Sensors (DVSs). We target applications where multiple sensors transmit data to a local processing unit, which executes a detection algorithm. Our system is composed of (i) a near-chip event filter that compresses and denoises the event stream from the DVS, and (ii) a Binary Neural Network (BNN) detection module that runs on a low-computation edge computing device (in our case a STM32F4 microcontroller). We present the system architecture and provide an end-to-end implementation for pedestrian detection in an office environment. Our implementation reduces transmission size by up to 99.6% compared to transmitting the raw event stream. The average packet size in our system is only 1397 bits, while 307.2 kb are required to send an uncompressed DVS time window. Our detector is able to perform a detection every 450 ms, with an overall testing F1 score of 83%. The low bandwidth and energy properties of our system make it ideal for IoT applications.