A Principled Approach for Data Bias Mitigation
This paper addresses bias mitigation in data-driven algorithms, which is crucial for fair decision-making across many applications. The contribution appears incremental, however, since it extends existing methods to handle intersectional cases.
The paper tackles data bias in machine learning decision-making by introducing a mitigation strategy, backed by mathematical guarantees, that handles non-binary labels and multiple sensitive attributes, including intersectional bias. It demonstrates the strategy's effectiveness on publicly available datasets.
The widespread use of machine learning and data-driven algorithms for decision making has been steadily increasing over many years. Bias in the data can adversely affect this decision-making. We present a new mitigation strategy to address data bias. Our methods are explainable and come with mathematical guarantees of correctness. They can take advantage of new work on table discovery to find new tuples that can be added to a dataset to create real datasets that are unbiased or less biased. Our framework covers data with non-binary labels and with multiple sensitive attributes. Hence, we are able to measure and mitigate bias that does not appear over a single attribute (or feature), but only intersectionally, when considering a combination of attributes. We evaluate our techniques on publicly available datasets and provide a theoretical analysis of our results, highlighting novel insights into data bias.
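To make the notion of intersectional bias concrete, the following is a minimal sketch in Python. It is not the paper's bias measure or mitigation method. As a stand-in, it assumes a simple gap: the largest total variation distance between a group's label distribution and the overall label distribution. The attribute names (`gender`, `race`), the three-valued label, the toy rows, and the helper names `label_distribution`, `max_tv_gap`, and `make_rows` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def label_distribution(labels):
    """Return the empirical distribution of labels as {label: probability}."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def max_tv_gap(rows, attrs, label_key="label"):
    """Largest total variation distance between a group's label
    distribution and the overall one, with groups defined by `attrs`.

    Assumption: this is an illustrative measure, not the paper's.
    """
    overall = label_distribution([r[label_key] for r in rows])
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[a] for a in attrs)].append(r[label_key])
    worst = 0.0
    for labels in groups.values():
        dist = label_distribution(labels)
        # Every group label also occurs overall, so iterating over
        # the overall distribution covers all labels.
        tv = 0.5 * sum(abs(dist.get(lab, 0.0) - p) for lab, p in overall.items())
        worst = max(worst, tv)
    return worst


def make_rows(gender, race, labels):
    """Build toy tuples for one (gender, race) group."""
    return [{"gender": gender, "race": race, "label": lab} for lab in labels]


# Hypothetical toy data with a three-valued (non-binary) label.
rows = (
    make_rows("F", "A", ["high", "high", "mid"])
    + make_rows("M", "B", ["high", "high", "mid"])
    + make_rows("F", "B", ["low", "low", "mid"])
    + make_rows("M", "A", ["low", "low", "mid"])
)

gap_gender = max_tv_gap(rows, ["gender"])         # single attribute
gap_race = max_tv_gap(rows, ["race"])             # single attribute
gap_both = max_tv_gap(rows, ["gender", "race"])   # intersection
print(gap_gender, gap_race, gap_both)
```

In this toy data every single-attribute group has the same label distribution as the whole dataset, so both single-attribute gaps are zero, while the gap over the combination of the two attributes is 1/3. A check that inspects one attribute at a time would therefore miss the imbalance entirely.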