Learning Smooth and Fair Representations
The paper addresses data owners' liability for downstream discriminatory use and offers a practical way to ensure fairness in data representations. The contribution is incremental, since it builds on existing fair representation learning approaches.
The paper tackles the problem of legal liability for discriminatory data use by proposing to preemptively remove correlations between features and sensitive attributes through fair representation learning. It shows that fairness, measured by demographic parity, can be certified from finite samples if and only if the chi-squared mutual information between features and representations is finite. Empirically, smoothing the representation distribution improves the generalization of fairness certificates, with no observed loss in downstream accuracy compared to state-of-the-art methods.
Organizations that own data face increasing legal liability for its discriminatory use against protected demographic groups, extending to contractual transactions involving third parties' access to and use of the data. This is problematic, since the original data owner cannot anticipate ex ante all future uses of the data by downstream users. This paper explores the upstream ability to preemptively remove the correlations between features and sensitive attributes by mapping features to a fair representation space. Our main result shows that fairness, measured by the demographic parity of the representation distribution, can be certified from a finite sample if and only if the chi-squared mutual information between features and representations is finite. Empirically, we find that smoothing the representation distribution provides generalization guarantees for fairness certificates, which improves upon existing fair representation learning approaches. Moreover, we do not observe that smoothing the representation distribution degrades the accuracy of downstream tasks compared to state-of-the-art methods in fair representation learning.
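To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes discrete features and representations given as a joint probability table, uses the standard definition of chi-squared mutual information (the chi-squared divergence between the joint distribution and the product of its marginals), measures demographic parity as the gap in positive-prediction rates between two groups, and treats "smoothing" as adding Gaussian noise to representations. The abstract does not specify the smoothing mechanism or the exact certificate, so the function names `chi2_mutual_information`, `demographic_parity_gap`, and `smooth_representations`, as well as the parameter `sigma`, are hypothetical.

```python
import numpy as np


def chi2_mutual_information(joint):
    """Chi-squared mutual information of a discrete joint table p(x, z).

    Computes sum_{x,z} p(x,z)^2 / (p(x) p(z)) - 1, which is zero when
    X and Z are independent. Assumption: both variables are discrete.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize to a distribution
    px = joint.sum(axis=1, keepdims=True)    # marginal over features
    pz = joint.sum(axis=0, keepdims=True)    # marginal over representations
    prod = px * pz
    mask = prod > 0                          # skip cells with zero marginal mass
    return float(np.sum(joint[mask] ** 2 / prod[mask]) - 1.0)


def demographic_parity_gap(y_pred, s):
    """Absolute gap in positive-prediction rates between groups s=0 and s=1."""
    y_pred = np.asarray(y_pred, dtype=float)
    s = np.asarray(s)
    return float(abs(y_pred[s == 0].mean() - y_pred[s == 1].mean()))


def smooth_representations(z, sigma, seed=0):
    """Add isotropic Gaussian noise to representations.

    Assumption: this is one common way to smooth a distribution; the
    source does not state which mechanism the authors use.
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    return z + sigma * rng.standard_normal(z.shape)


if __name__ == "__main__":
    independent = np.outer([0.5, 0.5], [0.3, 0.7])
    print(chi2_mutual_information(independent))       # close to 0
    print(demographic_parity_gap([1, 1, 0, 0], [0, 0, 1, 1]))
```

For a representation that copies a uniform feature over k symbols, the chi-squared mutual information equals k - 1, so it grows without bound as the representation becomes more fine-grained. This gives some intuition for why a finiteness condition on this quantity could matter for finite-sample certification.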