Achieving Counterfactual Fairness with Imperfect Structural Causal Model
This work addresses fairness in AI for real-world applications where causal models are imperfect, offering a more robust way to prevent discrimination against sensitive groups.
The paper tackles the problem of achieving counterfactual fairness in machine learning when the underlying structural causal model is unknown or misspecified, which can lead to poor predictions and unfair decisions. It proposes a novel minimax game-theoretic model that relaxes strong assumptions about the causal model, proves a theoretical error bound, and demonstrates superior accuracy and fairness on multiple real-world datasets.
Counterfactual fairness aims to remove the disparity between a model's prediction for an individual in the actual world (observational data) and its prediction in a counterfactual world (i.e., what if the individual belonged to another sensitive group). Existing studies must pre-define the structural causal model that captures the relationships among variables for counterfactual inference; however, the underlying causal model is usually unknown and difficult to validate in real-world scenarios. Moreover, misspecifying the causal model can degrade predictive performance and thus lead to unfair decisions. In this research, we propose a novel minimax game-theoretic model for counterfactual fairness that produces accurate results while achieving counterfactually fair decisions under relaxed assumptions about the structural causal model. In addition, we theoretically prove an error bound for the proposed minimax model. Empirical experiments on multiple real-world datasets demonstrate that our method achieves superior performance in both accuracy and fairness. Source code is available at \url{https://github.com/tridungduong16/counterfactual_fairness_game_theoretic}.
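To make the dependence on the causal model concrete, the following is a minimal sketch, not the paper's minimax method. It assumes a hypothetical one-feature linear structural equation x = w * a + u with a binary sensitive attribute a, computes counterfactuals by the usual abduction, action, prediction steps, and measures the mean absolute gap between factual and counterfactual predictions. All names (simulate, counterfactual_x, fairness_gap, the weight 2.0) are illustrative assumptions and do not come from the paper or its repository.

```python
import numpy as np

TRUE_W = 2.0  # assumed true effect of the sensitive attribute on x (toy value)

def simulate(n, seed=0):
    """Draw data from the assumed toy SCM: x = TRUE_W * a + u."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n).astype(float)  # binary sensitive attribute
    u = rng.normal(size=n)                        # exogenous noise
    x = TRUE_W * a + u
    return a, x

def counterfactual_x(a, x, w_assumed):
    """Counterfactual feature under an assumed (possibly wrong) weight."""
    u_hat = x - w_assumed * a          # abduction: recover the noise
    a_cf = 1.0 - a                     # action: flip the sensitive attribute
    return w_assumed * a_cf + u_hat    # prediction: regenerate the feature

def fairness_gap(predict, a, x, w_assumed):
    """Mean |factual prediction - counterfactual prediction|."""
    x_cf = counterfactual_x(a, x, w_assumed)
    return float(np.mean(np.abs(predict(x, a) - predict(x_cf, 1.0 - a))))

def uses_x(x, a):
    """Predictor that uses the raw feature, which carries the effect of a."""
    return x

def uses_noise(x, a):
    """Predictor that uses only the noise recovered with the true weight."""
    return x - TRUE_W * a

a, x = simulate(1000)
gap_raw = fairness_gap(uses_x, a, x, w_assumed=TRUE_W)
gap_noise = fairness_gap(uses_noise, a, x, w_assumed=TRUE_W)
gap_noise_wrong_scm = fairness_gap(uses_noise, a, x, w_assumed=1.0)
print(gap_raw, gap_noise, gap_noise_wrong_scm)
```

In this toy setting the noise-based predictor shows a near-zero gap when counterfactuals are generated with the true weight, but a clearly nonzero gap when the assumed weight is wrong. Counterfactual-fairness guarantees are therefore only as reliable as the structural causal model behind them, which is the issue the abstract sets out to address.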