IRDSFeb 17, 2015

Count-Min-Log sketch: Approximately counting with approximate counters

arXiv:1502.04885v137 citations
Originality Incremental advance
AI Analysis

This work addresses a specific deficiency in approximate counting for large-scale processing, particularly beneficial for text-mining tasks, but it is incremental as it builds on existing Count-Min Sketch variants.

The paper tackles the problem of high relative error for low-frequency events in the Count-Min Sketch algorithm, proposing the Count-Min-Log sketch that uses logarithm-based approximate counters to improve average relative error while maintaining constant memory footprint.

Count-Min Sketch is a widely adopted algorithm for approximate event counting in large scale processing. However, the original version of the Count-Min-Sketch (CMS) suffers of some deficiences, especially if one is interested by the low-frequency items, such as in text-mining related tasks. Several variants of CMS have been proposed to compensate for the high relative error for low-frequency events, but the proposed solutions tend to correct the errors instead of preventing them. In this paper, we propose the Count-Min-Log sketch, which uses logarithm-based, approximate counters instead of linear counters to improve the average relative error of CMS at constant memory footprint.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes