LGCRLOMLMay 5, 2023

Verifiable Learning for Robust Tree Ensembles

arXiv:2305.03626v44 citations
Originality Highly original
AI Analysis

This addresses the challenge of ensuring security for machine learning models in adversarial settings, particularly for decision tree ensembles, though it is incremental as it focuses on a specific restricted class rather than solving the general NP-hard problem.

The paper tackles the problem of verifying robustness against evasion attacks for decision tree ensembles, which is NP-hard in general, by introducing a restricted class called large-spread ensembles that can be verified in polynomial time, and experimental results show these ensembles can be verified in seconds and are more robust with acceptable accuracy loss.

Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, hence bound to be intractable for specific inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates the training of such restricted model classes which are amenable for efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds, using standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes