LG, AI | Jan 18, 2023

Synthcity: facilitating innovative use cases of synthetic data in different data modalities

arXiv:2301.07573v1 | 114 citations | h-index: 74 | Has Code
Originality: Synthesis-oriented
AI Analysis

This package addresses the need for flexible synthetic data generation across data modalities, supporting research and applications in machine learning. The contribution is incremental: it builds on existing methods rather than introducing a new paradigm.

Synthcity is an open-source software package for generating synthetic data across diverse tabular modalities, such as static data, time series, and multi-source data. It supports use cases in ML fairness, privacy, and augmentation, and gives practitioners access to cutting-edge tools and benchmarks.

Synthcity is an open-source software package for innovative use cases of synthetic data in ML fairness, privacy and augmentation across diverse tabular data modalities, including static data, regular and irregular time series, data with censoring, multi-source data, composite data, and more. Synthcity provides practitioners with a single access point to cutting-edge research and tools in synthetic data. It also offers the community a playground for rapid experimentation and prototyping, a one-stop shop for SOTA benchmarks, and an opportunity for extending research impact. The library is available on GitHub (https://github.com/vanderschaarlab/synthcity) and can be installed with pip from PyPI (https://pypi.org/project/synthcity/). We warmly invite the community to join the development effort by providing feedback, reporting bugs, and contributing code.

Code Implementations: 2 repos
