LG · Dec 19, 2024

Generative AI for Banks: Benchmarks and Algorithms for Synthetic Financial Transaction Data

arXiv:2412.14730v1 · 5 citations · h-index: 57
Originality: Synthesis-oriented
AI Analysis

This work addresses data sensitivity and regulatory challenges in banking by benchmarking synthetic data generators. Its contribution is incremental: guidance on selecting an algorithm for domain-specific applications.

The study evaluated five generative AI models for creating synthetic financial transaction data, finding that CTGAN balanced fidelity, quality, efficiency, privacy, and graph structure best, while other models excelled in specific areas like privacy or data replication.

The banking sector faces challenges in using deep learning due to data sensitivity and regulatory constraints, but generative AI may offer a solution. This study therefore identifies effective algorithms for generating synthetic financial transaction data and evaluates five leading models: Conditional Tabular Generative Adversarial Networks (CTGAN), DoppelGANger (DGAN), Wasserstein GAN, Financial Diffusion (FinDiff), and Tabular Variational AutoEncoders (TVAE). The models are compared across five criteria: fidelity, synthesis quality, efficiency, privacy, and graph structure. None of the algorithms is able to replicate the real data's graph structure, but each excels in specific areas: DGAN is ideal for privacy-sensitive tasks, FinDiff and TVAE excel in data replication and augmentation, and CTGAN achieves a balance across all five criteria, making it suitable for general applications with moderate privacy concerns. These findings offer guidance for choosing the most suitable algorithm.
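To make two of the criteria above concrete, here is a minimal sketch of how fidelity and privacy might be probed for a single numeric column such as a transaction amount. The summary does not say which metrics the paper uses, so the choices here are assumptions for illustration only: a two-sample Kolmogorov-Smirnov statistic as a fidelity proxy and a distance-to-closest-real-record check as a privacy proxy. The function names and the toy data are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic (assumed fidelity proxy).

    Returns the largest gap between the two empirical CDFs:
    0.0 means identical distributions, 1.0 means no overlap.
    """
    a, b = sorted(real), sorted(synthetic)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def min_distance_to_real(real, synthetic):
    """Distance from each synthetic value to its closest real value
    (assumed privacy proxy). Many zeros suggest real records were copied."""
    return [min(abs(s - r) for r in real) for s in synthetic]


# Toy data standing in for a transaction-amount column (hypothetical).
rng = random.Random(0)
real_amounts = [rng.lognormvariate(3.0, 1.0) for _ in range(500)]
good_synth = [rng.lognormvariate(3.0, 1.0) for _ in range(500)]
poor_synth = [rng.uniform(0.0, 100.0) for _ in range(500)]
copied_synth = list(real_amounts)  # a "generator" that memorised the data

print("KS, well-matched generator:", ks_statistic(real_amounts, good_synth))
print("KS, poorly matched generator:", ks_statistic(real_amounts, poor_synth))
print("KS, memorising generator:", ks_statistic(real_amounts, copied_synth))
print("Exact copies:", sum(d == 0.0 for d in min_distance_to_real(real_amounts, copied_synth)))
```

The memorising generator scores perfectly on the fidelity proxy while failing the privacy proxy, which illustrates why the two criteria have to be judged together, as in the trade-off between DGAN and FinDiff or TVAE described above.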
