Cardinality is Not Enough: Super Host Detection via Segmented Cardinality Estimation
For network security practitioners, SegSketch offers a more accurate and memory-efficient way to detect malicious or victim super hosts whose peers are concentrated within the same subnet.
SegSketch improves super host detection by estimating flow cardinality within subnets using a lightweight halved-segment hashing strategy, achieving up to 8.04x higher F1-Score than state-of-the-art methods under constrained memory.
Accurately detecting super hosts, i.e., hosts that establish connections to a large number of distinct peers, is important for mitigating web attacks and ensuring high-quality web service. Existing sketch-based approaches estimate flow cardinality, the number of distinct connections of a host, over full IP addresses. They ignore the fact that a malicious or victim super host often communicates with hosts within the same subnet, which leads to high false positive rates and low accuracy. Although hierarchical-structure-based approaches can capture flow cardinality within a subnet, they inherently suffer from high memory usage. To address these limitations, we propose SegSketch, a segmented cardinality estimation approach that employs a lightweight halved-segment hashing strategy to infer the common prefix lengths of IP addresses and estimates cardinality within subnets to improve detection accuracy under constrained memory. Experiments driven by real-world traces demonstrate that SegSketch improves F1-Score by up to 8.04x compared with state-of-the-art solutions, particularly under small memory budgets.
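To make the distinction between full-address cardinality and within-subnet cardinality concrete, the following is a minimal illustrative sketch. It is not SegSketch: it uses exact sets rather than a sketch, and it assumes a fixed /24 prefix instead of inferring common prefix lengths through halved-segment hashing. The function name `cardinalities`, the `prefix_len` parameter, and the sample addresses are hypothetical and chosen only for illustration.

```python
import ipaddress
from collections import defaultdict


def cardinalities(flows, prefix_len=24):
    """Return {src: (total distinct peers, max distinct peers in one subnet)}.

    Exact, set-based counting for illustration only. A real sketch would
    replace these sets with a compact probabilistic estimator.
    `prefix_len` is an assumed fixed subnet size, not something inferred.
    """
    peers = defaultdict(set)
    per_subnet = defaultdict(lambda: defaultdict(set))
    for src, dst in flows:
        # Map the destination to its enclosing subnet, e.g. 192.168.1.0/24.
        subnet = ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
        peers[src].add(dst)
        per_subnet[src][subnet].add(dst)
    return {
        src: (len(peers[src]), max(len(s) for s in per_subnet[src].values()))
        for src in peers
    }


if __name__ == "__main__":
    # Hypothetical traffic: one host sweeps a single /24, another spreads
    # the same number of connections across five unrelated /24 subnets.
    flows = [("10.0.0.1", f"192.168.1.{i}") for i in range(1, 6)]
    flows += [("10.0.0.2", f"172.16.{i}.1") for i in range(5)]
    for src, (total, in_subnet) in sorted(cardinalities(flows).items()):
        print(src, "total:", total, "max within one subnet:", in_subnet)
```

In this toy input both sources have the same full-address cardinality, but only the first concentrates its peers in one subnet. That is the signal a full-address estimator cannot see and that subnet-level estimation is meant to expose.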