LG CY · Jun 24, 2024

Learning Interpretable Fair Representations

arXiv:2406.16698v1 · 7.92 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable fair representations, which would make data more useful to third parties in machine learning. It is an incremental advance because it builds on existing fair representation methods.

The paper tackles the problem that current fair representations are not interpretable, which limits their utility beyond prediction tasks. It proposes a framework for learning interpretable fair representations and reports slightly higher accuracy and fairer outcomes in downstream classification than state-of-the-art methods.

Numerous approaches have been recently proposed for learning fair representations that mitigate unfair outcomes in prediction tasks. A key motivation for these methods is that the representations can be used by third parties with unknown objectives. However, because current fair representations are generally not interpretable, the third party cannot use these fair representations for exploration, or to obtain any additional insights, besides the pre-contracted prediction tasks. Thus, to increase data utility beyond prediction tasks, we argue that the representations need to be fair, yet interpretable. We propose a general framework for learning interpretable fair representations by introducing an interpretable "prior knowledge" during the representation learning process. We implement this idea and conduct experiments with ColorMNIST and Dsprite datasets. The results indicate that in addition to being interpretable, our representations attain slightly higher accuracy and fairer outcomes in a downstream classification task compared to state-of-the-art fair representations.
