Trustworthiness Preservation by Copies of Machine Learning Systems
This work addresses the need for computational tools that support responsible AI by verifying that trustworthiness is preserved in copies of a system, though it appears incremental, as it builds on existing practices of model and data copying.
The paper tackles the problem of verifying trustworthiness preservation in copies of machine learning systems, introducing a calculus to model and verify probabilistic queries and defining four distinct notions of trustworthiness that can be checked by analyzing copy behavior relative to the original.
A common practice in ML system development is to train the same model on different data sets, and to use the same (training and test) sets for different learning models. The former is a desirable practice for identifying high-quality and unbiased training conditions. The latter coincides with the search for optimal models under a common dataset for training. Systems obtained in these different ways have been considered akin to copies. In the quest for responsible AI, a legitimate but hardly investigated question is how to verify that trustworthiness is preserved by copies. In this paper we introduce a calculus to model and verify probabilistic complex queries over data, and define four distinct notions, Justifiably, Equally, Weakly and Almost Trustworthy, which can be checked by analysing the (partial) behaviour of the copy with respect to its original. We provide a study of the relations between these notions of trustworthiness, and of how they compose with each other and under logical operations. The aim is to offer a computational tool to check the trustworthiness of possibly complex systems copied from an original whose behaviour is known.
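The general idea of checking a copy against an original whose behaviour is known can be illustrated with a minimal Python sketch. It does not implement the calculus introduced in the paper, nor any of its four notions of trustworthiness, whose formal definitions are not given in this abstract. It assumes, purely for illustration, that the original's behaviour is available as a probability distribution over outputs, that the copy is observed through a finite sample of outputs, and that closeness is measured by total variation distance against a tolerance `epsilon`. The function names, the example labels and the threshold are all hypothetical.

```python
from collections import Counter


def empirical_distribution(outputs):
    """Relative frequency of each output label in a finite sample."""
    counts = Counter(outputs)
    total = len(outputs)
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


def behaves_like_original(original_dist, copy_outputs, epsilon):
    """Hypothetical check: is the copy's observed behaviour within
    `epsilon` of the original's known output distribution?

    This is an illustrative assumption, not a definition from the paper.
    """
    if not copy_outputs:
        raise ValueError("at least one observed output of the copy is required")
    copy_dist = empirical_distribution(copy_outputs)
    return total_variation(original_dist, copy_dist) <= epsilon


# Assumed known behaviour of the original on some query.
original = {"approve": 0.7, "reject": 0.3}

# Partial observation of the copy on the same query (100 sampled outputs).
copy_outputs = ["approve"] * 68 + ["reject"] * 32

print(behaves_like_original(original, copy_outputs, epsilon=0.05))
print(behaves_like_original(original, copy_outputs, epsilon=0.01))
```

With these sample values the observed distance is about 0.02, so the check passes for the looser tolerance and fails for the stricter one. The choice of distance and tolerance is a design decision of this sketch only; the paper's notions may be defined differently.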