Bias in Machine Learning -- What is it Good for?
This work addresses the need for clearer terminology in ML research and practice, but its contribution is incremental, since it synthesizes existing literature.
The paper tackles the problem of inconsistent terminology and definitions of 'bias' in machine learning by proposing a taxonomy based on a literature survey, and concludes that there is a complex relationship between biases in the ML pipeline and model biases related to social discrimination.
In public media as well as in scientific publications, the term \emph{bias} is used in conjunction with machine learning in many different contexts and with many different meanings. This paper proposes a taxonomy of these meanings, terms, and definitions, based on a survey of the primarily scientific literature on machine learning. In some cases, we suggest extensions and modifications to promote clear terminology and completeness. The survey is followed by an analysis and discussion of how different types of bias are connected and depend on each other. We conclude that there is a complex relation between bias occurring in the machine learning pipeline that leads to a model, and the eventual bias of the model (which is typically related to social discrimination). The former bias may or may not influence the latter, sometimes in a bad way and sometimes in a good way.