MLCRLGOct 24, 2025

Differentially Private High-dimensional Variable Selection via Integer Programming

arXiv:2510.22062v1h-index: 8
Originality Highly original
AI Analysis

This work addresses the challenge of ensuring privacy in high-dimensional variable selection, which is crucial for applications like healthcare or finance where data sensitivity is a concern, representing a novel extension of existing methods rather than a foundational breakthrough.

The paper tackled the problem of differentially private sparse variable selection in high-dimensional learning by introducing two new pure differentially private estimators that leverage mixed integer programming techniques, achieving state-of-the-art empirical support recovery with up to 10,000 variables.

Sparse variable selection improves interpretability and generalization in high-dimensional learning by selecting a small subset of informative features. Recent advances in Mixed Integer Programming (MIP) have enabled solving large-scale non-private sparse regression - known as Best Subset Selection (BSS) - with millions of variables in minutes. However, extending these algorithmic advances to the setting of Differential Privacy (DP) has remained largely unexplored. In this paper, we introduce two new pure differentially private estimators for sparse variable selection, levering modern MIP techniques. Our framework is general and applies broadly to problems like sparse regression or classification, and we provide theoretical support recovery guarantees in the case of BSS. Inspired by the exponential mechanism, we develop structured sampling procedures that efficiently explore the non-convex objective landscape, avoiding the exhaustive combinatorial search in the exponential mechanism. We complement our theoretical findings with extensive numerical experiments, using both least squares and hinge loss for our objective function, and demonstrate that our methods achieve state-of-the-art empirical support recovery, outperforming competing algorithms in settings with up to $p=10^4$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes