LG AI · Aug 29, 2025

Principled Approximation Methods for Efficient and Scalable Deep Learning

arXiv:2509.00174v24.1 h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses efficiency barriers for deploying deep learning technologies, though it appears incremental as it builds on existing methods like pruning and neural architecture search.

The thesis tackles the computational and energy inefficiency of large deep learning models by developing principled approximation methods for model compression, architecture design, and optimization. The resulting models are highly compact and maintain or improve performance on tasks such as image classification and language modeling.

Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep learning technologies. This thesis investigates principled approximation methods for improving the efficiency of deep learning systems, with a particular focus on settings that involve discrete constraints and non-differentiability. We study three main approaches toward improved efficiency: architecture design, model compression, and optimization. For model compression, we propose novel approximations for pruning and quantization that frame the underlying discrete problem as continuous and differentiable, enabling gradient-based training of compression schemes alongside the model's parameters. These approximations allow for fine-grained sparsity and precision configurations, leading to highly compact models without significant fine-tuning. In the context of architecture design, we design an algorithm for neural architecture search that leverages parameter sharing across layers to efficiently explore implicitly recurrent architectures. Finally, we study adaptive optimization, revisiting theoretical properties of widely used methods and proposing an adaptive optimizer that allows for quick hyperparameter tuning. Our contributions center on tackling computationally hard problems via scalable and principled approximations. Experimental results on image classification, language modeling, and generative modeling tasks show that the proposed methods provide significant improvements in terms of training and inference efficiency while maintaining, or even improving, the model's performance.
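The abstract's idea of framing a discrete pruning decision as continuous and differentiable can be illustrated with a minimal sketch. This is not the thesis's method: it swaps in a generic sigmoid-gate relaxation, and every name in it (`soft_mask`, `hard_mask`, `train_gates`, the toy data, the `penalty` value) is a hypothetical choice made for illustration. Each weight gets a real-valued logit; a sigmoid turns it into a soft keep/drop gate that gradient descent can train, and thresholding the logit recovers a binary mask afterwards.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_mask(logits, temperature=1.0):
    # Continuous relaxation of a binary keep/drop mask: values in (0, 1).
    return [sigmoid(s / temperature) for s in logits]


def hard_mask(logits):
    # Discrete mask recovered after training: keep a weight iff its logit > 0.
    return [1.0 if s > 0 else 0.0 for s in logits]


def train_gates(xs, ys, weights, steps=500, lr=0.5, penalty=0.05):
    """Learn gate logits by gradient descent (weights held fixed for brevity).

    Loss (assumed for this sketch): mean squared error of the gated linear
    model plus penalty * sum(gates), which nudges unneeded gates toward zero.
    """
    logits = [0.0] * len(weights)
    n = len(xs)
    for _ in range(steps):
        gates = soft_mask(logits)
        grads = [0.0] * len(weights)
        for x, y in zip(xs, ys):
            pred = sum(g * w * xi for g, w, xi in zip(gates, weights, x))
            err = pred - y
            for j in range(len(weights)):
                # Chain rule through the sigmoid: d gate / d logit = g * (1 - g).
                grads[j] += err * weights[j] * x[j] * gates[j] * (1 - gates[j]) / n
        for j in range(len(weights)):
            grads[j] += penalty * gates[j] * (1 - gates[j])
            logits[j] -= lr * grads[j]
    return logits


# Toy data: the target depends only on the first feature (y = 2 * x0),
# so the second weight is redundant and is a candidate for pruning.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
ys = [2.0 * x[0] for x in xs]
weights = [2.0, 1.0]

logits = train_gates(xs, ys, weights)
print("soft gates:", soft_mask(logits))
print("hard mask:", hard_mask(logits))
```

On this toy data the gate on the redundant second weight is driven toward zero while the gate on the first weight stays close to one. In a full setup the gates would presumably be trained jointly with the model's parameters, as the abstract describes; the weights are fixed here only to keep the sketch short.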
