Trustworthy AI Must Account for Interactions
This position paper highlights a foundational issue for AI practitioners and researchers, emphasizing the need to move beyond incremental improvements in isolated aspects of Trustworthy AI.
The paper examines unintended trade-offs between aspects of Trustworthy AI, such as fairness and privacy, by reviewing notable approaches and detailing the negative interactions between them. It argues that research must adopt a holistic view that accounts for these interactions across all relevant aspects at once.
Trustworthy AI encompasses many aspirational aspects for aligning AI systems with human values, including fairness, privacy, robustness, explainability, and uncertainty quantification. Ultimately, the goal of Trustworthy AI research is to achieve all aspects simultaneously. However, efforts to enhance one aspect often introduce unintended trade-offs that negatively impact others. In this position paper, we review notable approaches to these five aspects and systematically consider every pair, detailing the negative interactions that can arise. For example, applying differential privacy to model training can amplify biases, undermining fairness. Drawing on these findings, we take the position that current research practices of improving one or two aspects in isolation are insufficient. Instead, research on Trustworthy AI must account for interactions between aspects and adopt a holistic view across all relevant axes at once. To illustrate our perspective, we provide guidance on how practitioners can work towards integrated trust, examples of how interactions affect the financial industry, and alternative views.