AI | LG | Jan 12, 2023

Data-centric AI: Perspectives and Challenges

arXiv:2301.04819v3 | 98 citations | h-index: 73 | Has Code
AI Analysis

This perspective paper addresses the AI community's need for systematic approaches to data quality. Its contribution is incremental: it synthesizes existing ideas rather than introducing new methods.

The paper advocates for data-centric AI (DCAI), shifting focus from model improvements to enhancing data quality and reliability, and outlines three general missions: training data development, inference data development, and data maintenance.

The role of data in building AI systems has recently been significantly magnified by the emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model advancements to ensuring data quality and reliability. Although our community has continuously invested efforts into enhancing data in different aspects, they are often isolated initiatives on specific tasks. To facilitate the collective initiative in our community and push forward DCAI, we draw a big picture and bring together three general missions: training data development, inference data development, and data maintenance. We provide a top-level discussion on representative DCAI tasks and share perspectives. Finally, we list open challenges. More resources are summarized at https://github.com/daochenzha/data-centric-AI

Code Implementations: 1 repo
