LG, Feb 5, 2021

Evaluating Deep Learning in SystemML using Layer-wise Adaptive Rate Scaling (LARS) Optimizer

arXiv:2102.03018v1
Originality: Incremental advance
AI Analysis

This work addresses the problem of maintaining test accuracy with large batch sizes in distributed deep learning systems for users of frameworks like SystemML.

This paper investigates the performance of the LARS optimizer within the SystemML distributed machine learning framework. It finds that LARS significantly outperforms Stochastic Gradient Descent at large batch sizes, addressing the common loss of test accuracy as batch size increases.

Increasing the batch size of a deep learning model is challenging. Although a larger batch can help use the full available system memory during the training phase, it most often results in a significant loss of test accuracy. LARS addressed this issue by introducing an adaptive learning rate for each layer of a deep learning model. However, it is unclear how popular distributed machine learning systems such as SystemML or MLlib perform with this optimizer. In this work, we apply the LARS optimizer to a deep learning model implemented using SystemML. We perform experiments with various batch sizes and compare the performance of the LARS optimizer with Stochastic Gradient Descent. Our experimental results show that the LARS optimizer performs significantly better than Stochastic Gradient Descent for large batch sizes, even within the distributed machine learning framework SystemML.
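The per-layer adaptive rate mentioned in the abstract can be sketched as follows. This is a minimal Python illustration, not the paper's SystemML code. It assumes the commonly stated LARS rule, in which each layer's local rate is trust_coeff * ||w|| / (||g|| + weight_decay * ||w||). The function names and the default values of trust_coeff, weight_decay, and global_lr are hypothetical choices for illustration, and momentum is omitted for brevity.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_local_lr(weights, grads, trust_coeff=0.001, weight_decay=0.0005):
    """Layer-wise rate: trust_coeff * ||w|| / (||g|| + weight_decay * ||w||).

    trust_coeff and weight_decay defaults are assumed, illustrative values.
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    denom = g_norm + weight_decay * w_norm
    if w_norm == 0.0 or denom == 0.0:
        # Assumption: fall back to a neutral scale when the ratio is undefined.
        return 1.0
    return trust_coeff * w_norm / denom


def lars_step(layers, grads, global_lr=0.1, trust_coeff=0.001, weight_decay=0.0005):
    """Apply one momentum-free LARS update to every layer.

    `layers` and `grads` are lists of flat float lists, one entry per layer.
    """
    updated = []
    for w, g in zip(layers, grads):
        step = global_lr * lars_local_lr(w, g, trust_coeff, weight_decay)
        updated.append(
            [wi - step * (gi + weight_decay * wi) for wi, gi in zip(w, g)]
        )
    return updated
```

With weight_decay set to 0, multiplying a layer's gradient by a constant leaves its update unchanged, because the local rate shrinks by the same factor. That normalization is what keeps per-layer step sizes proportional to the weight norm, whatever the gradient scale.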

