LG | Apr 12, 2021 | Code
How Sensitive are Meta-Learners to Dataset Imbalance?
Mateusz Ochal, Massimiliano Patacchiola, Amos Storkey et al.
Meta-Learning (ML) has proven to be a useful tool for training Few-Shot Learning (FSL) algorithms by exposure to batches of tasks sampled from a meta-dataset. However, the standard training procedure overlooks the dynamic nature of the real world, where object classes are likely to occur at different frequencies. While it is generally understood that imbalanced tasks harm the performance of supervised methods, there is no significant research examining the impact of imbalanced meta-datasets on the FSL evaluation task. This study exposes the magnitude and extent of this problem. Our results show that ML methods are more robust against meta-dataset imbalance than against task-level imbalance with a similar imbalance ratio ($\rho<20$), with the effect holding even in long-tail datasets under a larger imbalance ($\rho=65$). Overall, these results highlight an implicit strength of ML algorithms, which can learn generalizable features under dataset imbalance and domain shift. The code to reproduce the experiments is released under an open-source license.
LG | Jan 7, 2021
Few-Shot Learning with Class Imbalance
Mateusz Ochal, Massimiliano Patacchiola, Amos Storkey et al.
Few-Shot Learning (FSL) algorithms are commonly trained through Meta-Learning (ML), which exposes models to batches of tasks sampled from a meta-dataset to mimic tasks seen during evaluation. However, the standard training procedures overlook real-world dynamics, where classes commonly occur at different frequencies. While it is generally understood that class imbalance harms the performance of supervised methods, limited research examines the impact of imbalance on the FSL evaluation task. Our analysis compares 10 state-of-the-art meta-learning and FSL methods on different imbalance distributions and rebalancing techniques. Our results reveal that 1) some FSL methods display a natural robustness to imbalance, while most other approaches suffer a performance drop of up to 17% compared to the balanced task when no appropriate mitigation is applied; 2) contrary to popular belief, many meta-learning algorithms will not automatically learn to balance from exposure to imbalanced training tasks; 3) classical rebalancing strategies, such as random oversampling, can still be very effective, leading to state-of-the-art performance, and should not be overlooked; 4) FSL methods are more robust against meta-dataset imbalance than against task-level imbalance with a similar imbalance ratio ($\rho<20$), with the effect holding even in long-tail datasets under a larger imbalance ($\rho=65$).
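As a rough illustration of the quantities mentioned in this abstract, the sketch below computes an imbalance ratio for a toy label set and applies random oversampling. It assumes that $\rho$ is the size of the largest class divided by the size of the smallest class, which the abstract does not state, and the function names `imbalance_ratio` and `random_oversample` are hypothetical, not taken from the authors' released code.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Assumed definition: largest class count divided by smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    matches the size of the largest class. All originals are kept."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy imbalanced support set: 9, 3, and 1 samples for classes a, b, c.
labels = ["a"] * 9 + ["b"] * 3 + ["c"] * 1
samples = list(range(len(labels)))

print(imbalance_ratio(labels))          # ratio before rebalancing
bal_samples, bal_labels = random_oversample(samples, labels)
print(Counter(bal_labels))              # class counts after rebalancing
```

Oversampling here only repeats existing samples, so it changes class frequencies without adding new information; whether that helps a given FSL method is the empirical question the paper studies.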
CV | Sep 16, 2020
Similarity-based data mining for online domain adaptation of a sonar ATR system
Jean de Bodinat, Thomas Guerneve, Jose Vazquez et al.
Because field data gathering is expensive, the lack of training data often limits the performance of Automatic Target Recognition (ATR) systems. This problem is often addressed with domain adaptation techniques; however, existing methods fail to satisfy the constraints of resource- and time-limited underwater systems. We propose to address this issue via online fine-tuning of the ATR algorithm using a novel data-selection method. Our proposed data-mining approach relies on visual similarity and outperforms the traditionally employed hard-mining methods. We present a comparative performance analysis in a wide range of simulated environments and highlight the benefits of using our method for rapid adaptation to previously unseen environments.
CV | May 10, 2020
A Comparison of Few-Shot Learning Methods for Underwater Optical and Sonar Image Classification
Mateusz Ochal, Jose Vazquez, Yvan Petillot et al.
Deep convolutional neural networks generally perform well in underwater object recognition tasks on both optical and sonar images. Many such methods require hundreds, if not thousands, of images per class to generalize well to unseen examples. However, obtaining and labeling sufficiently large volumes of data can be relatively costly and time-consuming, especially when observing rare objects or performing real-time operations. Few-Shot Learning (FSL) efforts have produced many promising methods to deal with low data availability. However, little attention has been given to the underwater domain, where the style of images poses additional challenges for object recognition algorithms. To the best of our knowledge, this is the first paper to evaluate and compare several supervised and semi-supervised FSL methods using underwater optical and side-scan sonar imagery. Our results show that FSL methods offer a significant advantage over traditional transfer learning methods that fine-tune pre-trained models. We hope that our work will help apply FSL to autonomous underwater systems and expand their learning capabilities.
IV | Mar 2, 2020
Unlimited Resolution Image Generation with R2D2-GANs
Marija Jegorova, Antti Ilari Karjalainen, Jose Vazquez et al.
In this paper we present a novel simulation technique for generating high-quality images of any predefined resolution. This method can be used to synthesize sonar scans of a size equivalent to those collected during a full-length mission, with across-track resolutions of any chosen magnitude. In essence, our model extends a Generative Adversarial Network (GAN) based architecture into a conditional recursive setting that facilitates the continuity of the generated images. The data produced is continuous and realistic-looking, and can be generated at least two times faster than the real speed of acquisition for higher-resolution sonars, such as EdgeTech. The seabed topography can be fully controlled by the user. Visual assessment tests demonstrate that humans cannot distinguish the simulated images from real ones. Moreover, experimental results suggest that, in the absence of real data, autonomous recognition systems can benefit greatly from training with the synthetic data produced by the R2D2-GANs.
LG | Oct 15, 2019
Full-Scale Continuous Synthetic Sonar Data Generation with Markov Conditional Generative Adversarial Networks
Marija Jegorova, Antti Ilari Karjalainen, Jose Vazquez et al.
Deployment and operation of autonomous underwater vehicles are expensive and time-consuming. High-quality, realistic sonar data simulation could benefit multiple applications, including training of human operators for post-mission analysis, as well as tuning and validation of autonomous target recognition (ATR) systems for underwater vehicles. Producing realistic synthetic sonar imagery is a challenging problem, as the model has to account for specific artefacts of real acoustic sensors, vehicle altitude, and a variety of environmental factors. We propose a novel method for generating realistic-looking sonar side-scans of full-length missions, called Markov Conditional pix2pix (MC-pix2pix). Quantitative assessment results confirm that the produced data is almost indistinguishable from real data. Furthermore, we show that bootstrapping ATR systems with MC-pix2pix data can improve their performance. Synthetic data is generated 18 times faster than real acquisition speed, with full user control over the topography of the generated data.