CR · CV · LG · Aug 14, 2024

Sonic: Fast and Transferable Data Poisoning on Clustering Algorithms

arXiv:2408.07558v1 · 1 citation · h-index: 48
Originality: Incremental advance
AI Analysis

The paper addresses a scalability problem relevant to researchers and practitioners in machine learning security. The advance is incremental, since it builds on existing poisoning methods.

The paper tackles the scalability of data poisoning attacks on clustering algorithms by introducing Sonic, a genetic attack that uses incremental clustering surrogates such as FISHDBC to accelerate poisoning against methods like HDBSCAN. The abstract reports that the attack is effective and efficient but gives no concrete speed-up figures.

Data poisoning attacks on clustering algorithms have received limited attention, with existing methods struggling to scale efficiently as dataset sizes and feature counts increase. These attacks typically require re-clustering the entire dataset multiple times to generate predictions and assess the attacker's objectives, significantly hindering their scalability. This paper addresses these limitations by proposing Sonic, a novel genetic data poisoning attack that leverages incremental and scalable clustering algorithms, e.g., FISHDBC, as surrogates to accelerate poisoning attacks against graph-based and density-based clustering methods, such as HDBSCAN. We empirically demonstrate the effectiveness and efficiency of Sonic in poisoning the target clustering algorithms. We then conduct a comprehensive analysis of the factors affecting the scalability and transferability of poisoning attacks against clustering algorithms, and we conclude by examining the robustness of hyperparameters in our attack strategy Sonic.
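
The core idea in the abstract, scoring each candidate perturbation on an incrementally updated surrogate instead of re-clustering the whole dataset, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `IncrementalLinkage` is a hypothetical toy stand-in for FISHDBC (a union-find over an eps-neighbourhood graph), and `genetic_poison`, `pair_disagreement`, the per-coordinate budget `delta`, and the fitness (number of clean point pairs whose co-clustering changes) are all names and choices invented for the example.

```python
import random

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class IncrementalLinkage:
    """Toy incremental surrogate (assumption): points within eps share a cluster."""
    def __init__(self, eps):
        self.eps2 = eps * eps
        self.points, self.parent = [], []

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def add(self, p):
        """Insert one point and merge it with every cluster it touches."""
        idx = len(self.points)
        self.points.append(p)
        self.parent.append(idx)
        for j in range(idx):
            if dist2(p, self.points[j]) <= self.eps2:
                self.parent[self.find(j)] = self.find(idx)
        return idx

    def copy(self):
        c = IncrementalLinkage(0.0)
        c.eps2, c.points, c.parent = self.eps2, list(self.points), list(self.parent)
        return c

    def labels(self, indices):
        return [self.find(i) for i in indices]

def pair_disagreement(a, b):
    """Count point pairs co-clustered in one labelling but not in the other."""
    n = len(a)
    return sum((a[i] == a[j]) != (b[i] == b[j])
               for i in range(n) for j in range(i + 1, n))

def genetic_poison(clean, controlled, eps, delta, pop=12, gens=15, seed=0):
    """Search offsets (each coordinate in [-delta, delta]) for the controlled points."""
    rng = random.Random(seed)
    base = IncrementalLinkage(eps)
    for p in clean:                      # built once, reused by every fitness call
        base.add(p)
    clean_idx = range(len(clean))
    dim = len(controlled[0])

    def cluster_with(offsets):
        s = base.copy()                  # reuse the prebuilt structure
        for p, o in zip(controlled, offsets):
            s.add(tuple(x + d for x, d in zip(p, o)))  # only poisoned points are inserted
        return s.labels(clean_idx)

    zero = [(0.0,) * dim for _ in controlled]
    reference = cluster_with(zero)       # clustering with unperturbed attacker points

    def fitness(offsets):
        return pair_disagreement(reference, cluster_with(offsets))

    def clip(v):
        return max(-delta, min(delta, v))

    def random_offsets():
        return [tuple(rng.uniform(-delta, delta) for _ in range(dim)) for _ in controlled]

    population = [zero] + [random_offsets() for _ in range(pop - 1)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]                 # the best half survives
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]          # uniform crossover
            child = [tuple(clip(v + rng.gauss(0, delta / 4)) for v in o)
                     for o in child]                                  # clipped mutation
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)

if __name__ == "__main__":
    data_rng = random.Random(1)
    left = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(15)]
    right = [(data_rng.gauss(4, 0.3), data_rng.gauss(0, 0.3)) for _ in range(15)]
    controlled = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]   # points the attacker may move
    offsets, score = genetic_poison(left + right, controlled, eps=0.9, delta=0.5)
    print("clean pairs whose co-clustering changed:", score)
```

In this sketch each fitness evaluation copies the prebuilt structure and inserts only the few attacker-controlled points, which is the saving the abstract attributes to incremental surrogates. Whether perturbations found this way transfer to a real target such as HDBSCAN is exactly the question the paper studies and is not demonstrated by the toy example.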

