Provably effective detection of effective data poisoning attacks
This paper addresses the security problem of data poisoning in machine learning systems by providing a provable detection method.
The paper tackles the problem of dataset poisoning attacks by establishing a precise mathematical definition and proving that effective poisoning ensures detectability, with experimental evidence showing adequate detection in real-world scenarios.
This paper establishes a mathematically precise definition of a dataset poisoning attack and proves that the very act of effectively poisoning a dataset ensures that the attack can be effectively detected. In addition to a mathematical guarantee that dataset poisoning is identifiable by a new statistical test that we call the Conformal Separability Test, we provide experimental evidence that we can adequately detect poisoning attempts in the real world.