Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era
The work offers a tutorial that helps data mining researchers and practitioners leverage synthetic data, but its contribution is incremental: it summarizes existing methods without presenting new results.
This tutorial addresses data scarcity, privacy, and annotation challenges in data mining by introducing generative models for synthetic data and providing actionable insights to enhance research and practice.
Generative models such as large language models, diffusion models, and generative adversarial networks have recently revolutionized the creation of synthetic data, offering scalable solutions to data scarcity, privacy, and annotation challenges in data mining. This tutorial introduces the foundations and latest advances in synthetic data generation, covers key methodologies and practical frameworks, and discusses evaluation strategies and applications. Attendees will gain actionable insights into leveraging synthetic data from generative models to enhance data mining research and practice. More information can be found on our website: https://syndata4dm.github.io/.