Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios
This work addresses the problem of adversarial robustness in machine learning for security-critical applications, offering a novel approach to enhance model resilience.
The paper tackles the vulnerability of machine learning models to evasion attacks by proposing a model-agnostic strategy that builds robust ensembles through feature partitioning, guaranteeing that the majority of the models in the ensemble cannot be affected by the attacker. Experimental results on public datasets show that the strategy outperforms state-of-the-art adversarial learning algorithms against evasion attacks.
Machine learning algorithms, however effective, are known to be vulnerable in adversarial scenarios where a malicious user may inject manipulated instances. In this work we focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time. The attacker aims to find a minimal perturbation of a test instance that changes the model outcome. We propose a model-agnostic strategy that builds a robust ensemble by training its base models on feature-based partitions of the given dataset. Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker. We evaluate the proposed strategy on decision tree ensembles, and we also propose an approximate certification method for tree ensembles that efficiently assesses the minimal accuracy of a forest on a given dataset, avoiding the costly computation of evasion attacks. Experimental evaluation on publicly available datasets shows that the proposed strategy outperforms state-of-the-art adversarial learning algorithms against evasion attacks.
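The idea of training base models on feature-based partitions can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not taken from the paper: the attacker is assumed to modify at most b features, the features are split round-robin into 2b+1 disjoint subsets, labels are binary (0/1), and the base learner is a toy one-feature threshold rule standing in for a decision tree. The function names (partition_features, train_stump, certified_lower_bound) are hypothetical. Under these assumptions each modified feature belongs to exactly one subset, so at most b of the 2b+1 models can change their output. The last function is a simple vote-counting bound, not the approximate certification method proposed in the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], int]


def partition_features(n_features: int, b: int) -> List[List[int]]:
    """Split feature indices into 2b+1 disjoint subsets (round-robin)."""
    k = 2 * b + 1
    if n_features < k:
        raise ValueError("need at least 2b+1 features")
    return [list(range(i, n_features, k)) for i in range(k)]


def train_stump(X: List[Row], y: List[int], features: List[int]) -> Model:
    """Toy base learner: best single-threshold rule using only `features`."""
    majority = Counter(y).most_common(1)[0][0]
    best_correct = sum(label == majority for label in y)
    best_rule = None  # (feature, threshold, left_label, right_label)
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                correct = sum(
                    (left if row[f] <= t else right) == label
                    for row, label in zip(X, y)
                )
                if correct > best_correct:
                    best_correct, best_rule = correct, (f, t, left, right)
    if best_rule is None:
        return lambda row: majority  # constant fallback
    f, t, left, right = best_rule
    return lambda row: left if row[f] <= t else right


def train_partitioned_ensemble(X: List[Row], y: List[int], b: int) -> List[Model]:
    """Train one base model per feature subset."""
    parts = partition_features(len(X[0]), b)
    return [train_stump(X, y, feats) for feats in parts]


def predict(models: List[Model], row: Row) -> int:
    """Majority vote over the base models."""
    return Counter(m(row) for m in models).most_common(1)[0][0]


def certified_lower_bound(models: List[Model], X: List[Row], y: List[int], b: int) -> float:
    """Fraction of instances that stay correct even if b correct votes flip."""
    n = len(models)
    certified = 0
    for row, label in zip(X, y):
        correct = sum(m(row) == label for m in models)
        if 2 * (correct - b) > n:  # correct votes keep a strict majority
            certified += 1
    return certified / len(X)


# Tiny made-up dataset: three features, each individually predictive.
X_demo = [[0.1, 0.2, 0.0], [0.2, 0.1, 0.3], [0.9, 0.8, 1.0], [0.8, 0.9, 0.7]]
y_demo = [0, 0, 1, 1]
demo_models = train_partitioned_ensemble(X_demo, y_demo, b=1)
```

In this toy setting, changing a single feature of an instance can flip at most one of the three votes, so the majority vote is preserved whenever the remaining models are correct. The bound is conservative: it only certifies an instance when the correct votes would still win after losing b of them.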