Improving Compressed Counting
This work addresses the need for more practical entropy estimation in network monitoring, such as for DDoS detection, but it is incremental as it builds on existing CC methods.
The paper tackles the problem of estimating Shannon entropy for anomaly detection in large networks by improving Compressed Counting (CC) for estimating ath frequency moments, resulting in a 100-fold reduction in estimation variance when a=0.99 and statistical optimality at a=0.5.
Compressed Counting (CC) [22] was recently proposed for estimating the ath frequency moments of data streams, where 0 < a <= 2. CC can be used for estimating Shannon entropy, which can be approximated by certain functions of the ath frequency moments as a -> 1. Monitoring Shannon entropy for anomaly detection (e.g., DDoS attacks) in large networks is an important task. This paper presents a new algorithm for improving CC. The improvement is most substantial when a -> 1--. For example, when a = 0:99, the new algorithm reduces the estimation variance roughly by 100-fold. This new algorithm would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when a = 0.5.