From Data Quality for AI to AI for Data Quality: A Systematic Review of Tools for AI-Augmented Data Quality Management in Data Warehouses
It addresses the resource-intensive challenge of data quality management for data warehouse users, but is incremental as it reviews existing tools rather than proposing new methods.
This study systematically reviewed 151 data quality tools to assess their support for AI-augmented data quality management in data warehouses, finding that only 10 tools met the criteria and most focus on data cleansing for AI rather than using AI to improve data quality itself.
While high data quality (DQ) is critical for analytics, compliance, and AI performance, data quality management (DQM) remains a complex, resource-intensive, and often manual process. This study investigates the extent to which existing tools support AI-augmented data quality management (DQM) in data warehouse environments. To this end, we conduct a systematic review of 151 DQ tools to evaluate their automation capabilities, particularly in detecting and recommending DQ rules in data warehouses -- a key component of modern data ecosystems. Using a multi-phase screening process based on functionality, trialability, regulatory compliance (e.g., GDPR), and architectural compatibility with data warehouses, only 10 tools met the criteria for AI-augmented DQM. The analysis reveals that most tools emphasize data cleansing and preparation for AI, rather than leveraging AI to improve DQ itself. Although metadata- and ML-based rule detection techniques are present, features such as SQL-based rule specification, reconciliation logic, and explainability of AI-driven recommendations remain scarce. This study offers practical guidance for tool selection and outlines critical design requirements for next-generation AI-driven DQ solutions -- advocating a paradigm shift from ``data quality for AI'' to ``AI for data quality management''.