LG AI Sep 12, 2024

A framework for measuring the training efficiency of a neural architecture

arXiv:2409.07925v17.98 citations h-index: 4

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of quantifying training efficiency for machine learning researchers and practitioners. Its contribution is incremental: it builds on existing methods without introducing a new paradigm.

The paper tackles the problem of measuring training efficiency in neural networks by proposing an experimental framework and applying it to CNNs and Bayesian CNNs on MNIST and CIFAR-10. The results show that training efficiency decays as training progresses and varies with the stopping criterion, and that CNNs are more efficient than Bayesian CNNs, with the gap becoming more pronounced on more complex tasks.

Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks (CNNs) and their Bayesian equivalents (BCNNs) on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship between training stopping criteria, model size, and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
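
The abstract does not state the formula behind "training efficiency", so the following is only a minimal sketch under an assumed definition: accuracy reached divided by cumulative training cost at the point where a stopping criterion is met. The `Checkpoint` type, the function names, the cost unit, and all numbers are hypothetical and are not taken from the paper. The sketch shows how such a ratio can shrink as training continues and how the value depends on which stopping criterion is chosen.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Checkpoint:
    epoch: int
    accuracy: float  # test accuracy at this epoch, in [0, 1]
    cost: float      # cumulative training cost so far (assumed unit, e.g. GPU-seconds)


def training_efficiency(cp: Checkpoint) -> float:
    """Assumed metric: accuracy per unit of cumulative training cost."""
    return cp.accuracy / cp.cost


def stop_at_accuracy(history: List[Checkpoint], target: float) -> Optional[Checkpoint]:
    """Return the first checkpoint whose accuracy meets the target, or None."""
    for cp in history:
        if cp.accuracy >= target:
            return cp
    return None


def efficiency_by_criterion(
    history: List[Checkpoint], targets: List[float]
) -> Dict[float, Optional[float]]:
    """Efficiency at each accuracy-based stopping criterion (None if never met)."""
    result: Dict[float, Optional[float]] = {}
    for target in targets:
        cp = stop_at_accuracy(history, target)
        result[target] = training_efficiency(cp) if cp is not None else None
    return result


# Made-up training history, for illustration only (not results from the paper).
history = [
    Checkpoint(epoch=1, accuracy=0.90, cost=10.0),
    Checkpoint(epoch=2, accuracy=0.95, cost=20.0),
    Checkpoint(epoch=3, accuracy=0.97, cost=30.0),
    Checkpoint(epoch=4, accuracy=0.98, cost=40.0),
]

print(efficiency_by_criterion(history, [0.90, 0.95, 0.99]))
```

With these illustrative numbers, accuracy grows more slowly than cost, so the ratio falls at every later checkpoint. A criterion that is never met (here, 0.99) yields no efficiency value at all, which is one reason the choice of stopping criterion matters when comparing architectures.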
