ML · LG · Jan 27

Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making

Zeyu Bian, Lan Wang, Chengchun Shi, Zhengling Qi

arXiv:2601.19186v1 · h-index: 3

Originality: Highly original

AI Analysis

The paper addresses fairness in decision-making systems for domains such as insurance and training programs. Its novelty lies in integrating multiple fairness dimensions into policy learning.

The paper proposes a double fairness learning (DFL) framework for policy learning that addresses action fairness and outcome fairness simultaneously. In applications to insurance and entrepreneurship datasets, the framework substantially improves both fairness metrics with only a modest reduction in value.

Fairness is a central pillar of trustworthy machine learning, especially in domains where accuracy- or profit-driven optimization is insufficient. While most fairness research focuses on supervised learning, fairness in policy learning remains less explored. Because policy learning is interventional, it induces two distinct fairness targets: action fairness (equitable action assignments) and outcome fairness (equitable downstream consequences). Crucially, equalizing actions does not generally equalize outcomes when groups face different constraints or respond differently to the same action. We propose a novel double fairness learning (DFL) framework that explicitly manages the trade-off among three objectives: action fairness, outcome fairness, and value maximization. We integrate fairness directly into a multi-objective optimization problem for policy learning and employ a lexicographic weighted Tchebyshev method that recovers Pareto solutions beyond convex settings, with theoretical guarantees on the regret bounds. Our framework is flexible and accommodates various commonly used fairness notions. Extensive simulations demonstrate improved performance relative to competing methods. In applications to a motor third-party liability insurance dataset and an entrepreneurship training dataset, DFL substantially improves both action and outcome fairness while incurring only a modest reduction in overall value.
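The abstract's trade-off among three objectives can be illustrated with a toy weighted Tchebyshev selection. The following Python sketch is not the authors' implementation: the candidate policies, their loss values, and the function names `tchebyshev_key` and `select_policy` are all hypothetical. It assumes each candidate is summarized by three losses (value regret, action unfairness, outcome unfairness, lower is better), and uses the weighted sum of deviations as a lexicographic tie-break after the weighted maximum deviation.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical candidates with made-up losses (lower is better):
# (value regret, action unfairness, outcome unfairness).
CANDIDATES: Dict[str, Tuple[float, float, float]] = {
    "value_only": (0.00, 0.40, 0.30),
    "action_fair": (0.15, 0.00, 0.25),
    "balanced": (0.10, 0.10, 0.10),
    "dominated": (0.20, 0.40, 0.30),
}


def tchebyshev_key(
    losses: Sequence[float],
    weights: Sequence[float],
    ideal: Sequence[float],
) -> Tuple[float, float]:
    """Return a lexicographic sort key for one candidate.

    First component: largest weighted deviation from the ideal point.
    Second component: sum of weighted deviations, used only to break ties.
    """
    devs = [w * (l - z) for l, w, z in zip(losses, weights, ideal)]
    return (max(devs), sum(devs))


def select_policy(
    candidates: Dict[str, Tuple[float, float, float]],
    weights: Sequence[float],
) -> str:
    """Pick the candidate minimizing the lexicographic Tchebyshev key."""
    # Ideal point: the best achievable value of each loss taken separately.
    n_obj = len(next(iter(candidates.values())))
    ideal = tuple(min(l[i] for l in candidates.values()) for i in range(n_obj))
    return min(
        candidates,
        key=lambda name: tchebyshev_key(candidates[name], weights, ideal),
    )


if __name__ == "__main__":
    print(select_policy(CANDIDATES, (1.0, 1.0, 1.0)))
    print(select_policy(CANDIDATES, (10.0, 1.0, 1.0)))
```

With equal weights, this sketch selects the `balanced` candidate. Weighting value regret ten times more heavily shifts the choice to `value_only`. Because the criterion minimizes a maximum rather than a weighted sum, it can reach Pareto solutions that a linear scalarization would miss when the trade-off frontier is non-convex.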
