LG | May 26, 2025

Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

arXiv:2505.19561v1 | 3 citations | h-index: 7 | Has Code | ICML
Originality: Highly original
AI Analysis

The paper addresses scalability issues in probabilistic data structures for streaming applications, offering a flexible solution that adapts to different data domains and space budgets.

The paper tackles the problem of scaling neural sketches for item-frequency estimation in data streams. It introduces a modular memory-augmented neural network architecture that achieves better space-accuracy trade-offs than existing sketches.

Sketches, probabilistic structures for estimating item frequencies in infinite data streams with limited space, are widely used across various domains. Recent studies have shifted the focus from handcrafted sketches to neural sketches, leveraging memory-augmented neural networks (MANNs) to enhance streaming compression capabilities and achieve better space-accuracy trade-offs. However, existing neural sketches struggle to scale across different data domains and space budgets due to inflexible MANN configurations. In this paper, we introduce a scalable MANN architecture that brings to life the Lego sketch, a novel sketch with superior scalability and accuracy. Much like assembling creations with modular Lego bricks, the Lego sketch dynamically coordinates multiple memory bricks to adapt to various space budgets and diverse data domains. Our theoretical analysis guarantees its high scalability and provides the first error bound for neural sketches. Furthermore, extensive experimental evaluations demonstrate that the Lego sketch exhibits superior space-accuracy trade-offs, outperforming existing handcrafted and neural sketches. Our code is available at https://github.com/FFY0/LegoSketch_ICML.
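For readers new to the setting, the following is a minimal sketch of a classic handcrafted baseline, the Count-Min sketch, which estimates item frequencies in a fixed amount of memory. It is not the Lego sketch and is not taken from the authors' repository. The class name, the `width` and `depth` parameters, and the salted BLAKE2b hashing are illustrative assumptions.

```python
import hashlib


class CountMinSketch:
    """Classic handcrafted Count-Min sketch (illustrative; NOT the Lego sketch)."""

    def __init__(self, width: int, depth: int) -> None:
        # depth rows of width counters: total space is width * depth integers.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # Assumption: one hash function per row, obtained by salting BLAKE2b
        # with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item: str, count: int = 1) -> None:
        # Add the count to one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def query(self, item: str) -> int:
        # Collisions can only inflate counters, so the minimum over rows is
        # the tightest available estimate and never undercounts.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


# Usage: feed a small stream and query the estimated frequencies.
sketch = CountMinSketch(width=64, depth=4)
stream = ["a"] * 50 + ["b"] * 20 + ["c"] * 5
for item in stream:
    sketch.update(item)

estimate_a = sketch.query("a")
estimate_unseen = sketch.query("never-seen")
```

With non-negative updates, each estimate is at least the true count and at most the total stream length. Shrinking `width` saves space but increases collisions, which is the space-accuracy trade-off the abstract refers to.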

Code Implementations: 1 repo
