Efficient Parameter Sampling for Neural Network Construction
This work addresses the challenge of expensive or infeasible data collection for deep learning in specialized domains such as high energy density physics, though the contribution is incremental because it builds on existing active learning and ensemble methods.
The paper tackles the problem of redundant information in large training datasets by introducing an algorithm that dynamically selects instances from uncertain parameter regions using multiple CNNs, reducing training dataset sizes by almost 90% while maintaining predictive power in high energy density physics diagnostics.
The customizable nature of deep learning models has allowed them to be successful predictors in various disciplines. For complicated problems, these models are often trained on thousands or millions of instances, but gathering such an immense collection may be expensive or infeasible. In practice, however, many of these instances carry redundant information that pollutes the deep learning models. This paper outlines an algorithm that dynamically selects instances from uncertain regions of the parameter space, identified by differences in predictions from multiple convolutional neural networks (CNNs), and appends them to a training dataset. The CNNs are simultaneously trained on this growing dataset to construct more accurate and knowledgeable models. The methodology presented has reduced training dataset sizes by almost 90% while maintaining predictive power in two diagnostics of high energy density physics.
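The selection step described above can be illustrated with a minimal sketch. It assumes that disagreement among the networks is measured as the variance of their predictions at each candidate point, averaged over output dimensions; the abstract does not specify the exact measure, so this choice, along with the function names `committee_disagreement` and `select_uncertain`, is hypothetical. The predictions are hard-coded toy values standing in for the outputs of trained CNNs.

```python
import numpy as np


def committee_disagreement(predictions):
    """Score each candidate by how much the models disagree on it.

    predictions: array of shape (n_models, n_candidates, n_outputs).
    Returns an array of shape (n_candidates,).
    Assumption: disagreement is the variance across models, averaged
    over output dimensions. The paper may use a different measure.
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.var(axis=0).mean(axis=-1)


def select_uncertain(predictions, k):
    """Return indices of the k candidates with the largest disagreement."""
    scores = committee_disagreement(predictions)
    # argsort is ascending, so reverse it to put the most uncertain first.
    order = np.argsort(scores)[::-1]
    return order[:k]


# Toy stand-in for three models predicting one output at four candidate
# parameter points (values are illustrative, not from the paper).
preds = np.array([
    [[0.0], [1.0], [2.0], [3.0]],
    [[0.0], [1.5], [2.0], [1.0]],
    [[0.0], [0.5], [2.0], [5.0]],
])

chosen = select_uncertain(preds, k=2)
print(chosen)  # candidates 3 and 1 show the largest spread
```

In a full loop, the chosen candidates would be labeled (for example by running the simulation or experiment at those parameter values), appended to the training set, and every network retrained before the next round of selection.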