LG · DB · Nov 14, 2022

Row Conditional-TGAN for generating synthetic relational databases

arXiv:2211.07588v1 · 20 citations · h-index: 8
Originality: Incremental advance
AI Analysis

The paper addresses the need for high-quality synthetic relational databases for data privacy and testing, but the advance is incremental because it builds on existing tabular GAN methods.

The paper tackles the generation of synthetic relational databases by modeling relationships between tables. It proposes the Row Conditional-TGAN (RC-TGAN), which incorporates parent row data into child table generation and extends this to the influence of grandparent rows. Experimental results on eight real relational databases show significant improvements in synthesis quality over the benchmark system.

Besides reproducing tabular data properties of standalone tables, synthetic relational databases also require modeling the relationships between related tables. In this paper, we propose the Row Conditional-Tabular Generative Adversarial Network (RC-TGAN), a novel generative adversarial network (GAN) model that extends the tabular GAN to support modeling and synthesizing relational databases. The RC-TGAN models relationship information between tables by incorporating conditional data of parent rows into the design of the child table's GAN. We further extend the RC-TGAN to model the influence that grandparent table rows may have on their grandchild rows, in order to prevent the loss of this connection when the rows of the parent table fail to transfer this relationship information. The experimental results, using eight real relational databases, show significant improvements in the quality of the synthesized relational databases when compared to the benchmark system, demonstrating the effectiveness of the RC-TGAN in preserving relationships between tables of the original database.
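The conditioning idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy tables, the column names, the helper functions, and the stand-in `toy_generator` are all hypothetical. A fixed number of children per parent is also an assumption made only to keep the example short. The sketch shows two steps: pairing each child row with its parent row's features to form conditional training data, and generating child rows conditioned on each parent row.

```python
import random

# Hypothetical toy tables; column names are illustrative only.
parents = [
    {"id": 1, "region": "north", "size": 3.0},
    {"id": 2, "region": "south", "size": 1.5},
]
children = [
    {"id": 10, "parent_id": 1, "amount": 20.0},
    {"id": 11, "parent_id": 1, "amount": 25.0},
    {"id": 12, "parent_id": 2, "amount": 7.0},
]


def build_conditional_rows(parents, children, fk="parent_id", pk="id"):
    """Pair each child row's features with the features of its parent row."""
    by_key = {p[pk]: p for p in parents}
    rows = []
    for c in children:
        parent = by_key[c[fk]]
        # Condition: parent features without the key column.
        condition = {k: v for k, v in parent.items() if k != pk}
        # Target: child features without primary and foreign keys.
        target = {k: v for k, v in c.items() if k not in (pk, fk)}
        rows.append((condition, target))
    return rows


def toy_generator(condition, rng):
    """Stand-in for a trained conditional generator G(z | parent features)."""
    z = rng.gauss(0.0, 1.0)  # noise input
    return {"amount": 8.0 * condition["size"] + z}


def synthesize_children(synth_parents, n_children, generator, seed=0):
    """Generate child rows for each parent row and attach the foreign key."""
    rng = random.Random(seed)
    out = []
    next_id = 1
    for p in synth_parents:
        condition = {k: v for k, v in p.items() if k != "id"}
        for _ in range(n_children):  # assumption: fixed child count per parent
            row = generator(condition, rng)
            row.update({"id": next_id, "parent_id": p["id"]})
            out.append(row)
            next_id += 1
    return out


training_rows = build_conditional_rows(parents, children)
synthetic_children = synthesize_children(parents, 2, toy_generator)
```

In a real system, `toy_generator` would be replaced by a GAN generator trained on pairs like those in `training_rows`. The grandparent extension described in the abstract would, under the same assumptions, add grandparent features to `condition`.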

Code Implementations: 1 repo