ML · IT · LG · Jan 5, 2022

Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician's Perspective

arXiv:2201.01741v2 · 16 citations · Has Code
AI Analysis

This is an incremental educational resource aimed at machine learning researchers to help them understand and utilize ANS for data compression.

The paper tackles the challenge of making Asymmetric Numeral Systems (ANS) accessible to machine learning researchers by presenting it from a statistician's perspective, using latent variable models and bits-back coding, and provides a complete Python implementation and an open-source library for entropy coding.

Entropy coding is the backbone of data compression. Novel machine-learning-based compression methods often use a new entropy coder called Asymmetric Numeral Systems (ANS) [Duda et al., 2015], which provides near-optimal bitrates and simplifies [Townsend et al., 2019] advanced compression techniques such as bits-back coding. However, researchers with a background in machine learning often struggle to understand how ANS works, which prevents them from exploiting its full versatility. This paper is meant as an educational resource to make ANS more approachable by presenting it from a new perspective of latent variable models and the so-called bits-back trick. We guide the reader step by step to a complete implementation of ANS in the Python programming language, which we then generalize for more advanced use cases. We also present and empirically evaluate an open-source library of various entropy coders designed for both research and production use. Related teaching videos and problem sets are available online.
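
To make the abstract's description more concrete, here is a minimal sketch of the core ANS push/pop operations in Python. It is not the paper's implementation or its library's API: it assumes a fixed model given as integer frequencies that sum to `2**PRECISION`, keeps the whole compressed message in one unbounded Python integer instead of streaming fixed-size words, and all names (`PRECISION`, `make_cdf`, `push`, `pop`, `encode`, `decode`) are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

# Assumption for this sketch: every probability is freqs[s] / 2**PRECISION.
PRECISION = 4


def make_cdf(freqs):
    """Return cumulative frequencies [0, f0, f0+f1, ..., 2**PRECISION]."""
    assert all(f > 0 for f in freqs) and sum(freqs) == 1 << PRECISION
    return [0] + list(accumulate(freqs))


def push(state, symbol, freqs, cdf):
    """Encode one symbol onto the state (a stack stored in a single integer)."""
    f = freqs[symbol]
    # state % f picks a position inside the symbol's slot range;
    # state // f carries the remaining information to the higher bits.
    return ((state // f) << PRECISION) | (cdf[symbol] + state % f)


def pop(state, freqs, cdf):
    """Decode the most recently pushed symbol; exact inverse of push."""
    slot = state & ((1 << PRECISION) - 1)
    symbol = bisect_right(cdf, slot) - 1  # cdf[symbol] <= slot < cdf[symbol + 1]
    state = freqs[symbol] * (state >> PRECISION) + slot - cdf[symbol]
    return state, symbol


def encode(message, freqs, initial_state=1):
    cdf = make_cdf(freqs)
    state = initial_state
    # ANS is last-in-first-out, so push in reverse to decode in forward order.
    for symbol in reversed(message):
        state = push(state, symbol, freqs, cdf)
    return state


def decode(state, length, freqs):
    cdf = make_cdf(freqs)
    symbols = []
    for _ in range(length):
        state, symbol = pop(state, freqs, cdf)
        symbols.append(symbol)
    return symbols, state


freqs = [8, 4, 2, 2]  # hypothetical model: probabilities 1/2, 1/4, 1/8, 1/8
message = [0, 1, 0, 2, 3, 0, 0, 1]
compressed = encode(message, freqs)
decoded, final_state = decode(compressed, len(message), freqs)
```

Because `pop` exactly inverts `push`, decoding restores both the message and the initial state. A practical coder would bound the state and move fixed-size words to and from a buffer rather than relying on arbitrary-precision integers.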

Code Implementations: 2 repos
