SY May 8
Bluetooth Phased-array Aided Inertial Navigation Using Factor Graphs: Experimental Verification
Glen Hjelmerud Mørkbak Sørensen, Torleiv H. Bryne, Kristoffer Gryte et al.
Phased-array Bluetooth systems have emerged as a low-cost alternative for performing aided inertial navigation in GNSS-denied use cases such as warehouse logistics, drone landings, and autonomous docking. Basing a navigation system on commercial off-the-shelf components may lower the barrier to entry for phased-array radio navigation systems, albeit at the cost of significantly noisier measurements and a relatively short feasible range. In this paper, we compare robust estimation strategies for a factor graph optimisation-based estimator using experimental data collected from multirotor drone flight. We evaluate performance in loss-of-GNSS scenarios when aided by Bluetooth angular measurements, as well as range or barometric pressure.
CV Apr 18
Hyperspectral Unmixing Hierarchies
Joseph L. Garrett, P. S. Vishnu, Pauliina Salmi et al.
Unmixing reveals the spatial distribution and spectral details of different constituents, called endmembers, in a hyperspectral image. Because unmixing has limited ground truth requirements, can accommodate mixed pixels, and is closely tied to light propagation, it is a uniquely powerful tool for analyzing hyperspectral images. However, spectral variability inhibits unmixing performance, the proper way to determine the number of endmembers is ambiguous, and the clarity of the endmembers degrades as more are included. Hierarchical structure is a possible solution to all three problems. Here, hierarchical unmixing is defined by imposing a hierarchical abundance sum constraint on Deep Nonnegative Matrix Factorization. Binary Linear Unmixing Tactile Hierarchies (BLUTHs) solve the hierarchical unmixing problem with a simple network architecture. Sparsity modulation unmixing growth tailors the topology of a BLUTH to each scene. The structure imposed by BLUTHs allows endmembers with varying levels of spectral contrast to be revealed, mitigating the challenge of spectral variability. The performance of BLUTHs exceeds state-of-the-art unmixing algorithms on laboratory scenes, particularly with regard to abundance estimation, while their performance remains competitive on remote sensing scenes. In addition, ocean color unmixing by BLUTHs is demonstrated on hyperspectral scenes from the HYPSO and PACE satellites.
CV Oct 24, 2023
Semantic Segmentation in Satellite Hyperspectral Imagery by Deep Learning
Jon Alvarez Justo, Alexandru Ghita, Daniel Kovac et al.
Satellites are increasingly adopting on-board AI to optimize operations and increase autonomy through in-orbit inference. The use of Deep Learning (DL) models for segmentation in hyperspectral imagery offers advantages for remote sensing applications. In this work, we train and test 20 models for multi-class segmentation in hyperspectral imagery, selected for their potential in future space deployment. These models include 1D and 2D Convolutional Neural Networks (CNNs) and the latest vision transformers (ViTs). We propose a lightweight 1D-CNN model, 1D-Justo-LiuNet, which outperforms state-of-the-art models in the hyperspectral domain. 1D-Justo-LiuNet exceeds the performance of 2D-CNN UNets and outperforms Apple's lightweight vision transformers designed for mobile inference. 1D-Justo-LiuNet achieves the highest accuracy (0.93) with the smallest model size (4,563 parameters) among all tested models, while maintaining fast inference. Unlike 2D-CNNs and ViTs, which encode both spectral and spatial information, 1D-Justo-LiuNet focuses solely on the rich spectral features in hyperspectral data, benefitting from the high-dimensional feature space. Our findings are validated across various satellite datasets, with the HYPSO-1 mission serving as the primary case study for sea, land, and cloud segmentation. We further confirm our conclusions through generalization tests on other hyperspectral missions, such as NASA's EO-1. Based on its superior performance and compact size, we conclude that 1D-Justo-LiuNet is highly suitable for in-orbit deployment, providing an effective solution for optimizing and automating satellite operations at the edge.
RO Apr 21
Multi-Step Gaussian Process Propagation for Adaptive Path Planning
Alex Beaudin, Bjørn Andreas Kristiansen, Kristoffer Gryte et al.
Efficient and robust path planning hinges on combining all accessible information sources. In particular, the task of path planning for robotic environmental exploration and monitoring depends highly on the current belief of the world. To capture the uncertainty in the belief, we present a Gaussian process-based path planning method that adapts to multi-modal environmental sensing data and incorporates state and input constraints. To solve the path planning problem, we optimize over future waypoints in a receding horizon fashion, and our cost is thus a function of the Gaussian process posterior over all these waypoints. We demonstrate this method, dubbed OLAhGP, on an autonomous surface vessel using oceanic algal bloom data from both a high-fidelity model and in-situ sensing data in a monitoring scenario. Our simulated and experimental results demonstrate significant improvement over existing methods. With the same number of samples, our method generates more informative paths and achieves greater accuracy in identifying algal blooms in chlorophyll-a-rich waters, measured with respect to total misclassification probability and binary misclassification rate over the domain of interest.
CV Mar 13, 2024
Deep Learning for In-Orbit Cloud Segmentation and Classification in Hyperspectral Satellite Data
Daniel Kovac, Jan Mucha, Jon Alvarez Justo et al.
This article explores the latest Convolutional Neural Networks (CNNs) for cloud detection aboard hyperspectral satellites. The performance of the latest 1D CNN (1D-Justo-LiuNet) and two recent 2D CNNs (nnU-net and 2D-Justo-UNet-Simple) for cloud segmentation and classification is assessed. Evaluation criteria include precision and computational efficiency for in-orbit deployment. Experiments utilize NASA's EO-1 Hyperion data, with varying spectral channel numbers after Principal Component Analysis. Results indicate that 1D-Justo-LiuNet achieves the highest accuracy, outperforming 2D CNNs, while maintaining compactness with larger spectral channel sets, albeit with increased inference times. However, the performance of 1D CNN degrades with significant channel reduction. In this context, the 2D-Justo-UNet-Simple offers the best balance for in-orbit deployment, considering precision, memory, and time costs. While nnU-net is suitable for on-ground processing, deployment of lightweight 1D-Justo-LiuNet is recommended for high-precision applications. Alternatively, lightweight 2D-Justo-UNet-Simple is recommended for balanced costs between timing and precision in orbit.
CV Jan 26, 2024
A Comparative Study of Compressive Sensing Algorithms for Hyperspectral Imaging Reconstruction
Jon Alvarez Justo, Daniela Lupu, Milica Orlandic et al.
Hyperspectral imaging produces large volumes of data, leading to significant challenges for data processing, storage, and transmission. Compressive Sensing has been used in the field of Hyperspectral Imaging as a technique to compress the large amount of data. This work addresses the recovery of hyperspectral images compressed by a factor of 2.5. A comparative study of the accuracy and performance of the convex FISTA and ADMM algorithms and the greedy gOMP, BIHT, and CoSaMP recovery algorithms is presented. The results indicate that the algorithms successfully recover the compressed data, and that gOMP achieves superior accuracy and faster recovery than the other algorithms, at the expense of a strong dependence on the unknown sparsity level of the data to be recovered.
SY Nov 7, 2021
Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments
Eivind Bøhn, Erlend M. Coates, Dirk Reinhardt et al.
Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem in part due to uncertain nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method to automatically discover optimal control laws through interaction with the controlled system, which can handle complex nonlinear dynamics. We show in this paper that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as three minutes of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. Learning with significant actuation delay and diversified simulated dynamics were found to be crucial for successful transfer to control of the real UAV. In addition to a qualitative comparison with the ArduPlane autopilot, we present a quantitative assessment based on linear analysis to better understand the learning controller's behavior.
SY Nov 7, 2021
Optimization of the Model Predictive Control Meta-Parameters Through Reinforcement Learning
Eivind Bøhn, Sebastien Gros, Signe Moe et al.
Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, the MPC has some significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources in battery-powered embedded systems. The MPC parameters must be tuned, which is largely a trial-and-error process that affects the control performance, the robustness and the computational complexity of the controller to a high degree. In this paper, we propose a novel framework in which any parameter of the control algorithm can be jointly tuned using reinforcement learning (RL), with the goal of simultaneously optimizing the control performance and the power usage of the control algorithm. We propose the novel idea of optimizing the meta-parameters of MPC with RL, i.e. parameters affecting the structure of the MPC problem as opposed to the solution to a given problem. Our control algorithm is based on an event-triggered MPC where we learn when the MPC should be re-computed, and a dual mode MPC and linear state feedback control law applied in between MPC computations. We formulate a novel mixture-distribution policy and show that with joint optimization we achieve improvements that do not present themselves when optimizing the same parameters in isolation. We demonstrate our framework on the inverted pendulum control task, reducing the total computation time of the control system by 36% while also improving the control performance by 18.4% over the best-performing MPC baseline.
SY Feb 22, 2021
Reinforcement Learning of the Prediction Horizon in Model Predictive Control
Eivind Bøhn, Sebastien Gros, Signe Moe et al.
Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. The MPC's capabilities come at the cost of a high online computational complexity, the requirement of an accurate model of the system dynamics, and the necessity of tuning its parameters to the specific control application. The main tunable parameter affecting the computational complexity is the prediction horizon length, controlling how far into the future the MPC predicts the system response and thus evaluates the optimality of its computed trajectory. A longer horizon generally increases the control performance, but requires an increasingly powerful computing platform, excluding certain control applications. The performance sensitivity to the prediction horizon length varies over the state space, and this motivated the adaptive horizon model predictive control (AHMPC), which adapts the prediction horizon according to some criteria. In this paper we propose to learn the optimal prediction horizon as a function of the state using reinforcement learning (RL). We show how the RL learning problem can be formulated and test our method on two control tasks, showing clear improvements over the fixed horizon MPC scheme, while requiring only minutes of learning.
SY Nov 26, 2020
Optimization of the Model Predictive Control Update Interval Using Reinforcement Learning
Eivind Bøhn, Sebastien Gros, Signe Moe et al.
In control applications, a compromise often has to be made between the complexity and performance of the controller and the computational resources that are available. For instance, the typical hardware platform in embedded control applications is a microcontroller with limited memory and processing power, and for battery-powered applications the control system can account for a significant portion of the energy consumption. We propose a controller architecture in which the computational cost is explicitly optimized along with the control objective. This is achieved by a three-part architecture where a high-level, computationally expensive controller generates plans, which a computationally simpler controller executes by compensating for prediction errors, while a recomputation policy decides when the plan should be recomputed. In this paper, we employ model predictive control (MPC) as the high-level plan-generating controller, a linear state feedback controller as the simpler compensating controller, and reinforcement learning (RL) to learn the recomputation policy. Simulation results for two examples showcase the architecture's ability to improve upon the MPC approach and find reasonable compromises weighing the performance on the control objective and the computational resources expended.
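The three-part architecture described in this abstract can be illustrated with a minimal sketch on a scalar linear system. Everything in it is an assumption made for illustration and is not taken from the paper: `make_plan` is a trivial nominal rollout standing in for the MPC, `should_recompute` is a fixed threshold rule standing in for the learned RL policy, and the dynamics, gains, and threshold are arbitrary.

```python
def make_plan(x0, horizon, a=1.0, b=1.0, gain=0.5):
    """Stand-in for the expensive planner (MPC in the abstract).

    Rolls out the nominal model x[k+1] = a*x[k] + b*u[k] with u = -gain*x
    and returns the planned states (horizon + 1) and inputs (horizon).
    """
    xs, us = [x0], []
    for _ in range(horizon):
        u = -gain * xs[-1]
        us.append(u)
        xs.append(a * xs[-1] + b * u)
    return xs, us


def should_recompute(step_in_plan, horizon, error, threshold=0.2):
    """Stand-in for the learned recomputation policy (hypothetical rule).

    Recompute when the plan is exhausted or the state has drifted too far.
    """
    return step_in_plan >= horizon or abs(error) > threshold


def run(x0, disturbances, horizon=5, k_fb=0.5, a=1.0, b=1.0):
    """Simulate the loop; returns the final state and the number of plans made."""
    x = x0
    xs_plan, us_plan = make_plan(x, horizon, a, b)
    i, n_plans = 0, 1
    for w in disturbances:
        error = x - xs_plan[i]  # prediction error relative to the plan
        if should_recompute(i, horizon, error):
            xs_plan, us_plan = make_plan(x, horizon, a, b)
            i, n_plans, error = 0, n_plans + 1, 0.0
        # Cheap controller: planned input plus linear feedback on the error.
        u = us_plan[i] - k_fb * error
        x = a * x + b * u + w
        i += 1
    return x, n_plans
```

Calling `run(1.0, [0.0] * 5)` follows a single plan, while a large disturbance on the first step, `run(1.0, [0.5] + [0.0] * 4)`, triggers one extra planning call. Counting planner calls is a crude proxy for the computational cost that the abstract proposes to optimize alongside the control objective.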
LG Nov 21, 2019
Accelerating Reinforcement Learning with Suboptimal Guidance
Eivind Bøhn, Signe Moe, Tor Arne Johansen
Reinforcement Learning in domains with sparse rewards is a difficult problem, and a large part of the training process is often spent searching the state space in a more or less random fashion for any learning signals. For control problems, we often have some controller readily available which might be suboptimal but nevertheless solves the problem to some degree. This controller can be used to guide the initial exploration phase of the learning controller towards reward yielding states, reducing the time before refinement of a viable policy can be initiated. In our work, the agent is guided through an auxiliary behaviour cloning loss which is made conditional on a Q-filter, i.e. it is only applied in situations where the critic deems the guiding controller to be better than the agent. The Q-filter provides a natural way to adjust the guidance throughout the training process, allowing the agent to exceed the guiding controller in a manner that is adaptive to the task at hand and the proficiency of the guiding controller. The contribution of this paper lies in identifying shortcomings in previously proposed implementations of the Q-filter concept, and in suggesting some ways these issues can be mitigated. These modifications are tested on the OpenAI Gym Fetch environments, showing clear improvements in adaptivity and yielding increased performance in all robotic environments tested.
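The basic Q-filter idea described in this abstract can be sketched in a few lines. This is only the generic concept (a behaviour cloning loss applied where the critic rates the guide's action above the agent's), not the modified version the paper proposes. The function name, the squared-error loss, and the averaging over filtered samples are assumptions made for illustration.

```python
def q_filtered_bc_loss(agent_actions, guide_actions, q_agent, q_guide):
    """Behaviour-cloning loss restricted by a Q-filter.

    Each argument holds one entry per sampled state: actions are lists of
    floats, and q_agent / q_guide are the critic's values for the agent's
    and the guiding controller's action in that state.
    """
    total, count = 0.0, 0
    for a_pi, a_g, q_pi, q_g in zip(agent_actions, guide_actions, q_agent, q_guide):
        if q_g > q_pi:  # the critic deems the guide better in this state
            total += sum((x - y) ** 2 for x, y in zip(a_pi, a_g))
            count += 1
    # Assumption: average over the samples that passed the filter.
    return total / count if count else 0.0
```

As the agent improves, fewer samples pass the filter and the auxiliary loss fades out, which is the adaptive behaviour the abstract attributes to the Q-filter.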
RO Nov 13, 2019
Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs Using Proximal Policy Optimization
Eivind Bøhn, Erlend M. Coates, Signe Moe et al.
Contemporary autopilot systems for unmanned aerial vehicles (UAVs) are far more limited in their flight envelope as compared to experienced human pilots, thereby restricting the conditions UAVs can operate in and the types of missions they can accomplish autonomously. This paper proposes a deep reinforcement learning (DRL) controller to handle the nonlinear attitude control problem, enabling extended flight envelopes for fixed-wing UAVs. A proof-of-concept controller using the proximal policy optimization (PPO) algorithm is developed, and is shown to be capable of stabilizing a fixed-wing UAV from a large set of initial conditions to reference roll, pitch and airspeed values. The training process is outlined and key factors for its progression rate are considered, with the most important factor found to be limiting the number of variables in the observation vector, and including values for several previous time steps for these variables. The trained reinforcement learning (RL) controller is compared to a proportional-integral-derivative (PID) controller, and is found to converge in more cases than the PID controller, with comparable performance. Furthermore, the RL controller is shown to generalize well to unseen disturbances in the form of wind and turbulence, even in severe disturbance conditions.
RO Mar 14, 2019
Cooperative decentralized circumnavigation with application to algal bloom tracking
Joana Fonseca, Jieqiang Wei, Karl H. Johansson et al.
Harmful algal blooms occur frequently and deteriorate water quality. A reliable method is proposed in this paper to track algal blooms using a set of autonomous surface robots. A satellite image indicates the existence and initial location of the algal bloom for the deployment of the robot system. The algal bloom area is approximated by a circle with time-varying location and size. This circle is estimated and circumnavigated by the robots, which are able to locally sense its boundary. A multi-agent control algorithm is proposed for the continuous monitoring of the dynamic evolution of the algal bloom. The algorithm comprises a decentralized least squares estimation of the target and a controller for circumnavigation. We prove that the robots converge to the circle and to equally spaced positions around it. Simulation results with data provided by the SINMOD ocean model are used to illustrate the theoretical results.
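To make the circle-estimation step concrete, here is a minimal sketch of a least squares circle fit from boundary points. It is a centralized algebraic fit, chosen for brevity and assumed for illustration only. The paper's estimator is decentralized and handles a time-varying circle, neither of which is reproduced here. The function names are hypothetical.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            f = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= f * aug[col][c]
    sol = [0.0] * 3
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / aug[r][r]
    return sol


def fit_circle(points):
    """Fit centre (a, b) and radius r to boundary points (x, y).

    Uses the linear model x^2 + y^2 = 2*a*x + 2*b*y + c with
    c = r^2 - a^2 - b^2, solved through the normal equations.
    Needs at least three non-collinear points.
    """
    rows = [(2.0 * x, 2.0 * y, 1.0) for x, y in points]
    rhs = [x * x + y * y for x, y in points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, rhs)) for i in range(3)]
    a, b, c = solve3(ata, atz)
    return a, b, math.sqrt(c + a * a + b * b)
```

For example, `fit_circle([(4, 2), (1, 5), (-2, 2), (1, -1)])` recovers a circle centred near (1, 2) with radius close to 3. In a multi-robot setting each point would be a boundary measurement taken by one robot.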
RO Feb 8, 2019
Towards autonomous ocean observing systems using Miniature Underwater Gliders with UAV deployment and recovery capabilities
Erik Sollesnes, Ole Martin Brokstad, Rolf Klæboe et al.
This paper presents preliminary results towards the development of an autonomous ocean observing system using Miniature Underwater Gliders (MUGs) that can operate with the support of Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vessels (USVs) for deployment, recovery, battery charging, and communication relay. The system reduces human intervention to a minimum, revolutionizing the affordability of a broad range of surveillance and data collection operations. The MUGs are equipped with a small Variable Buoyancy System (VBS) composed of a gas-filled piston and a linear actuator powered by a brushless DC motor and a rechargeable lithium-ion battery in an oil-filled flexible enclosure. By using a fully pressure-tolerant electronic design, the aim is to reduce the total complexity, weight, and cost of the overall system. A first prototype of the VBS was built and demonstrated in a small aquarium. The electronic components were tested in a pressure testing facility to a minimum of 20 bar. Preliminary results are promising and future work will focus on system and weight optimization, UAV deployment/recovery strategies, as well as sea trials to an operating depth of 200 m.