DSDBLGOct 31, 2025

Learned Static Function Data Structures

arXiv:2510.27588v11 citationsh-index: 8
Originality Highly original
AI Analysis

This work addresses the need for more compact data storage in applications handling static key-value sets, representing a novel method for a known bottleneck.

The paper tackled the problem of constructing memory-efficient static function data structures for key-value associations by introducing learned static functions that use machine learning to predict value distributions and encode values with key-specific prefix codes, achieving up to one order of magnitude space savings on real data and three orders on synthetic data.

We consider the task of constructing a data structure for associating a static set of keys with values, while allowing arbitrary output values for queries involving keys outside the set. Compared to hash tables, these so-called static function data structures do not need to store the key set and thus use significantly less memory. Several techniques are known, with compressed static functions approaching the zero-order empirical entropy of the value sequence. In this paper, we introduce learned static functions, which use machine learning to capture correlations between keys and values. For each key, a model predicts a probability distribution over the values, from which we derive a key-specific prefix code to compactly encode the true value. The resulting codeword is stored in a classic static function data structure. This design allows learned static functions to break the zero-order entropy barrier while still supporting point queries. Our experiments show substantial space savings: up to one order of magnitude on real data, and up to three orders of magnitude on synthetic data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes