When to Trust AI: Advances and Challenges for Certification of Neural Networks
The paper addresses safety assurance for AI deployment in critical applications such as autonomous systems and medical diagnosis, but its contribution is incremental: it provides an overview of existing techniques rather than new methods.
This paper reviews certification and explainability techniques for making neural networks trustworthy, addressing their instability and susceptibility to adversarial examples, with the aim of reducing harm from avoidable system failures.
Artificial intelligence (AI) has been advancing at a fast pace and is now poised for deployment in a wide range of applications, such as autonomous systems, medical diagnosis and natural language processing. Early adoption of AI technology for real-world applications has not been without problems, particularly for neural networks, which may be unstable and susceptible to adversarial examples. In the longer term, appropriate safety assurance techniques need to be developed to reduce potential harm due to avoidable system failures and to ensure trustworthiness. Focusing on certification and explainability, this paper provides an overview of techniques that have been developed to ensure the safety of AI decisions and discusses future challenges.