ML LG CO MEOct 11, 2023

On the Computational Complexity of Private High-dimensional Model Selection

arXiv:2310.07852v54.31 citationsh-index: 54Has Code

Originality Incremental advance

AI Analysis

This work addresses privacy-preserving model selection for high-dimensional data, representing an incremental improvement by combining existing methods with computational enhancements.

The authors tackled the problem of high-dimensional sparse linear regression model selection under privacy constraints by proposing a differentially private best subset selection method, achieving polynomial mixing time for computational efficiency and demonstrating effective feature identification in simulations.

We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private (DP) best subset selection method with strong statistical utility properties by adopting the well-known exponential mechanism for selecting the best model. To achieve computational expediency, we propose an efficient Metropolis-Hastings algorithm and under certain regularity conditions, we establish that it enjoys polynomial mixing time to its stationary distribution. As a result, we also establish both approximate differential privacy and statistical utility for the estimates of the mixed Metropolis-Hastings chain. Finally, we perform some illustrative experiments on simulated data showing that our algorithm can quickly identify active features under reasonable privacy budget constraints.

View on arXiv PDF Code

Similar