QI2 -- an Interactive Tool for Data Quality Assurance
This work addresses the need for data quality assurance in ML systems, particularly for safety-relevant applications under regulatory pressure such as the EU AI Act, though the contribution appears incremental, being an interactive tool for existing processes.
The paper tackles the challenge of verifying quantitative data quality requirements for ML systems, introducing a novel interactive tool called QI2 that supports multiple data quality aspects, demonstrated on the MNIST dataset.
The importance of high data quality is increasing with the growing impact and distribution of ML systems and big data. In addition, the planned AI Act from the European Commission defines challenging legal requirements for data quality, especially for the market introduction of safety-relevant ML systems. In this paper, we introduce a novel approach that supports the data quality assurance process for multiple data quality aspects. This approach enables the verification of quantitative data quality requirements. The concept and benefits are introduced and explained on small example data sets. The application of the method is demonstrated on the well-known MNIST data set of handwritten digits.