LG · AI · CV · CY · Nov 15, 2020

Debiasing Convolutional Neural Networks via Meta Orthogonalization

arXiv:2011.07453v1 · 3 citations
AI Analysis

This work addresses fairness in deep learning applications where models may rely on sensitive attributes. It appears incremental, since it builds on existing debiasing techniques.

The paper tackles debiasing convolutional neural networks to reduce their reliance on spurious correlations with protected attributes. It reports that the proposed Meta Orthogonalization method significantly mitigates bias while maintaining strong task performance and remaining competitive with adversarial debiasing methods.

While deep learning models often achieve strong task performance, their successes are hampered by their inability to disentangle spurious correlations from causative factors, such as when they use protected attributes (e.g., race, gender, etc.) to make decisions. In this work, we tackle the problem of debiasing convolutional neural networks (CNNs) in such instances. Building off of existing work on debiasing word embeddings and model interpretability, our Meta Orthogonalization method encourages the CNN representations of different concepts (e.g., gender and class labels) to be orthogonal to one another in activation space while maintaining strong downstream task performance. Through a variety of experiments, we systematically test our method and demonstrate that it significantly mitigates model bias and is competitive against current adversarial debiasing methods.
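To make the orthogonality idea in the abstract concrete, the following is a minimal sketch of one way a penalty on non-orthogonal concept directions could be written. It is not the paper's implementation: the helper names `concept_direction` and `orthogonality_penalty`, the difference-of-means estimate of a concept direction, the squared-cosine form of the penalty, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np


def concept_direction(acts_with, acts_without):
    """Hypothetical helper: estimate a concept direction in activation space.

    Assumption: the direction is the normalized difference between the mean
    activation of examples that show the concept and those that do not.
    Both inputs have shape (num_examples, num_features).
    """
    d = acts_with.mean(axis=0) - acts_without.mean(axis=0)
    return d / (np.linalg.norm(d) + 1e-12)


def orthogonality_penalty(u, v):
    """Squared cosine similarity between two concept directions.

    Close to 0 when the directions are orthogonal, close to 1 when they
    are parallel or anti-parallel.
    """
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    return float(cos ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for CNN activations (assumed 16-dimensional).
    gender_a = rng.normal(size=(32, 16))
    gender_b = rng.normal(size=(32, 16))
    class_pos = rng.normal(size=(32, 16))
    class_neg = rng.normal(size=(32, 16))

    gender_dir = concept_direction(gender_a, gender_b)
    class_dir = concept_direction(class_pos, class_neg)

    task_loss = 0.5  # placeholder value, not a real training loss
    lam = 1.0        # hypothetical weight on the penalty
    total = task_loss + lam * orthogonality_penalty(gender_dir, class_dir)
    print(total)
```

In a training loop, a term of this kind would be added to the task loss so that gradient updates push the protected-attribute direction and the class direction apart. How the paper actually estimates concept directions and combines the terms is not specified in the abstract.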

Code Implementations: 1 repo
