Measuring the Data Efficiency of Deep Learning Methods
This work addresses data efficiency, a practical concern for machine learning practitioners, but its contribution is incremental: it compares existing methods on standard datasets.
The paper tackles the problem of measuring data efficiency in deep learning by benchmarking convolutional neural networks (CNNs) and hierarchical information-preserving graph-based slow feature analysis (HiGSFA) on the MNIST and Omniglot datasets. It finds that HiGSFA outperforms CNNs on MNIST classification when the models are trained on 50 and 200 samples per class.
In this paper, we propose a new experimental protocol and use it to benchmark the data efficiency (performance as a function of training set size) of two deep learning algorithms, convolutional neural networks (CNNs) and hierarchical information-preserving graph-based slow feature analysis (HiGSFA), on classification and transfer learning tasks. The algorithms are trained on different-sized subsets of the MNIST and Omniglot datasets. HiGSFA outperforms standard CNNs when the models are trained on 50 and 200 samples per class for MNIST classification. In all other cases, the CNNs perform better. The results suggest that there are cases where greedy, locally optimal bottom-up learning is as powerful as, or more powerful than, global gradient-based learning.
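The idea of measuring performance as a function of training set size can be illustrated with a minimal sketch. This is not the authors' protocol, whose details are not given here. It assumes class-balanced random subsampling with a fixed number of samples per class, and the names balanced_subset, data_efficiency_curve, and train_eval are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def balanced_subset(labels, per_class, seed=0):
    """Return sorted indices that select `per_class` examples from each class.

    Assumption: subsets are drawn uniformly at random without replacement,
    independently for each class.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    chosen = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        if per_class > idx.size:
            raise ValueError(f"class {cls} has only {idx.size} samples")
        chosen.append(rng.choice(idx, size=per_class, replace=False))
    return np.sort(np.concatenate(chosen))


def data_efficiency_curve(train_eval, labels, sizes, seed=0):
    """Map each per-class size to the score returned by `train_eval`.

    `train_eval` is a hypothetical callable: it takes an index array, trains
    a model on those examples, and returns a test score.
    """
    return {n: train_eval(balanced_subset(labels, n, seed)) for n in sizes}
```

A caller would pass a train_eval function that wraps either model, together with the per-class sizes of interest, for example sizes=[50, 200] for the two MNIST settings mentioned above. Comparing the resulting dictionaries for the two algorithms gives one data-efficiency curve per method.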