Controlling Directions Orthogonal to a Classifier
This work addresses the need for controlling classifier-invariant variations in machine learning applications, offering incremental advances in style transfer, domain adaptation, and fairness.
The paper tackles the problem of identifying and controlling directions invariant to a given classifier, enabling applications such as style transfer, domain adaptation, and fairness. It proposes a definition of orthogonality for the non-linear case and reports empirical gains on these tasks, including reduced unfairness when the orthogonal classifier is used as a predictor.
We propose to identify directions invariant to a given classifier so that these directions can be controlled in tasks such as style transfer. While the orthogonal decomposition is directly identifiable when the given classifier is linear, we formally define a notion of orthogonality for the non-linear case. We also provide a surprisingly simple method for constructing the orthogonal classifier (a classifier that uses directions other than those of the given classifier). Empirically, we present three use cases where controlling orthogonal variation is important: style transfer, domain adaptation, and fairness. The orthogonal classifier enables the desired style transfer when domains vary in multiple aspects, improves domain adaptation under label shift, and mitigates unfairness when used as a predictor. The code is available at http://github.com/Newbeeer/orthogonal_classifier
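The linear case mentioned in the abstract can be illustrated with a small sketch. It is only an illustration of ordinary orthogonal projection, not the paper's construction for non-linear classifiers; the weight vector `w`, bias `b`, input `x`, and the helper `orthogonal_decompose` are hypothetical names chosen for this example.

```python
import numpy as np


def orthogonal_decompose(x, w):
    """Split x into a part along w and a part orthogonal to w.

    Assumption: the given classifier is linear, logit(x) = w @ x + b,
    so the only direction it uses is w itself.
    """
    w_unit = w / np.linalg.norm(w)
    parallel = np.dot(x, w_unit) * w_unit   # component the classifier sees
    orthogonal = x - parallel               # component the classifier ignores
    return parallel, orthogonal


# Hypothetical linear classifier and input (illustrative values only).
w = np.array([3.0, 4.0])
b = -1.0
x = np.array([2.0, 1.0])

parallel, orthogonal = orthogonal_decompose(x, w)

# Moving x along the orthogonal direction should leave the logit unchanged,
# which is what "invariant to the classifier" means in the linear case.
x_moved = x + 5.0 * orthogonal
logit_before = float(w @ x + b)
logit_after = float(w @ x_moved + b)
```

In this sketch the two logits agree up to floating-point error, so any variation confined to `orthogonal` can be changed without affecting the linear classifier's prediction. For a non-linear classifier no single fixed direction plays this role, which is why the abstract calls for a separate definition of orthogonality.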