LG · OC · Dec 3, 2025

Convergence for Discrete Parameter Updates

arXiv:2512.04051v1
h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the high computational cost of deep learning training, a concern for researchers and practitioners alike. It offers a new perspective on quantised training, though the advance over existing methods is incremental.

The paper tackles the computational cost of deep learning training by proposing discrete update rules for low-precision training. It establishes convergence guarantees for a general class of such rules and reports empirical results for a multinomial update rule.

Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update rule itself is discrete, avoiding the quantisation of continuous updates by design. We establish convergence guarantees for a general class of such discrete schemes, and present a multinomial update rule as a concrete example, supported by empirical evaluation. This perspective opens new avenues for efficient training, particularly for models with inherently discrete structure.
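
To make the idea of a discrete update rule concrete, here is a minimal sketch in Python. It is not the paper's multinomial rule, which this summary does not specify. It is a hypothetical three-outcome categorical update in which each integer-valued weight moves by -1, 0, or +1, with probabilities chosen so that the expected step equals the (clipped) scaled negative gradient. The names `discrete_step` and `train`, the clipping to [-1, 1], and the quadratic toy objective are all assumptions made for illustration.

```python
import random


def discrete_step(grad, lr, rng):
    """Sample an integer step in {-1, 0, +1}.

    Hypothetical rule (an assumption, not taken from the paper): the expected
    step equals -lr * grad, clipped to [-1, 1] so it is a valid probability.
    """
    target = max(-1.0, min(1.0, -lr * grad))
    p_move = abs(target)  # probability of moving at all
    if rng.random() < p_move:
        return 1 if target > 0 else -1
    return 0


def train(w0, grad_fn, lr, steps, seed=0):
    """Run the discrete update on integer weights; no real-valued weights are stored."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(steps):
        g = grad_fn(w)
        w = [wi + discrete_step(gi, lr, rng) for wi, gi in zip(w, g)]
    return w


# Toy objective (assumed for illustration): f(w) = 0.5 * sum((w_i - t_i)^2)
# with integer targets t, so the gradient is w - t.
targets = [3, 4]


def grad_fn(w):
    return [wi - ti for wi, ti in zip(w, targets)]


final_w = train([20, -15], grad_fn, lr=0.1, steps=2000)
print(final_w)
```

In this toy setting the weights stay on the integer grid throughout, and every step is drawn directly from a discrete distribution instead of being computed in real arithmetic and then rounded. Because the gradient vanishes at the integer minimiser, the update there is zero with probability one, so the iterate stays put once it arrives.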

