AI · Aug 17, 2017

On Ensuring that Intelligent Machines Are Well-Behaved

Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Emma Brunskill

arXiv:1708.05448v112.016 citations

Originality: Incremental advance

AI Analysis

This paper addresses a pressing societal issue by providing a framework for making machine learning applications safer and more responsible. The advance appears incremental: it builds on existing concerns and does not claim a paradigm shift.

The paper tackles the problem of ensuring that machine learning algorithms are well-behaved, for example that they do not act in racist or sexist ways. It proposes a new framework that simplifies specifying and regulating undesirable behaviors, and it demonstrates the framework's viability by using it to create algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in the authors' experiments.

Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve super-human performance on various tasks. Ensuring that they are well-behaved---that they do not, for example, cause harm to humans or act in a racist or sexist way---is therefore not a hypothetical problem to be dealt with in the future, but a pressing one that we address here. We propose a new framework for designing machine learning algorithms that simplifies the problem of specifying and regulating undesirable behaviors. To show the viability of this new framework, we use it to create new machine learning algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in our experiments. Our framework for designing machine learning algorithms simplifies the safe and responsible application of machine learning.
