LG AI STFeb 4, 2025

Sample Complexity of Bias Detection with Subsampled Point-to-Subspace Distances

arXiv:2502.02623v17.11 citationsh-index: 3

Originality Incremental advance

AI Analysis

This addresses computational challenges in bias detection for regulatory compliance, though it appears incremental as it builds on existing PAC frameworks.

The paper tackles the problem of exponential runtime in bias detection for all subgroups by reformulating it as a point-to-subspace problem and showing efficient subsampling for supremum norm, with results supported by tests on known instances.

Sample complexity of bias estimation is a lower bound on the runtime of any bias detection method. Many regulatory frameworks require the bias to be tested for all subgroups, whose number grows exponentially with the number of protected attributes. Unless one wishes to run a bias detection with a doubly-exponential run-time, one should like to have polynomial complexity of bias detection for a single subgroup. At the same time, the reference data may be based on surveys, and thus come with non-trivial uncertainty. Here, we reformulate bias detection as a point-to-subspace problem on the space of measures and show that, for supremum norm, it can be subsampled efficiently. In particular, our probabilistically approximately correct (PAC) results are corroborated by tests on well-known instances.

View on arXiv PDF

Similar