Towards Robust Deep Active Learning for Scientific Computing
This work addresses robustness issues in deep active learning (DAL) for scientific computing regression, offering a more reliable method for data-efficient deep learning in this domain.
The paper tackles the problem of deep active learning (DAL) methods being sensitive to an untunable hyperparameter (pool ratio) in scientific computing regression tasks, showing that this reduces performance and can make them worse than random sampling. It proposes NA-QBC, a query synthesis method that removes this hyperparameter and outperforms other DAL methods on benchmarks while always beating random sampling.
Deep learning (DL) is revolutionizing the scientific computing community. To reduce the data gap, active learning has been identified as a promising solution for DL in the scientific computing community. However, the deep active learning (DAL) literature is dominated by image classification problems and pool-based methods. Here we investigate the robustness of pool-based DAL methods for scientific computing problems (dominated by regression), where DNNs are increasingly used. We show that modern pool-based DAL methods all share an untunable hyperparameter, termed the pool ratio and denoted $γ$, which is often assumed to be known a priori in the literature. We evaluate the performance of five state-of-the-art DAL methods on six benchmark problems when $γ$ is assumed \textit{not} to be known, a more realistic assumption for scientific computing problems. Our results indicate that this reduces the performance of modern DAL methods and that they can sometimes even perform worse than random sampling, creating significant uncertainty when they are used in real-world settings. To overcome this limitation we propose, to our knowledge, the first query synthesis DAL method for regression, termed NA-QBC. NA-QBC removes the sensitive $γ$ hyperparameter, and we find that, on average, it outperforms the other DAL methods on our benchmark problems. Crucially, NA-QBC always outperforms random sampling, providing more robust performance benefits.
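To make the role of the pool ratio concrete, the following is a minimal sketch of one pool-based query-by-committee (QBC) selection step. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not define $γ$ precisely, so it is assumed here to be the ratio of candidate-pool size to the number of points queried per step; the committee is a toy set of polynomials rather than DNNs; and the names `committee_variance` and `pool_based_qbc_query` are hypothetical.

```python
import numpy as np


def committee_variance(models, x):
    """Disagreement score: variance of the committee's predictions at each x."""
    preds = np.stack([m(x) for m in models])  # shape (n_models, n_points)
    return preds.var(axis=0)


def pool_based_qbc_query(models, k, gamma, rng, low=-1.0, high=1.0):
    """Draw a random pool of gamma * k candidates and keep the k most disputed.

    gamma is the (assumed) pool ratio: pool size divided by query batch size.
    """
    pool = rng.uniform(low, high, size=gamma * k)
    scores = committee_variance(models, pool)
    top = np.argsort(scores)[-k:]  # indices of the k highest-variance candidates
    return pool[top]


# Toy committee (an assumption for illustration): a * x^2 with differing a,
# so the members disagree most where |x| is large.
rng = np.random.default_rng(0)
models = [np.poly1d([a, 0.0, 0.0]) for a in (0.5, 1.0, 1.5)]

batch = pool_based_qbc_query(models, k=5, gamma=4, rng=rng)
print(batch)
```

In this sketch, `gamma=1` makes the pool the same size as the batch, so every candidate is selected and the step reduces to random sampling. Larger values push the batch towards the regions of highest committee disagreement. The best value is not known in advance, which is the sensitivity the abstract describes; a query synthesis method avoids the choice by not drawing a fixed pool at all.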