LG AI · Jul 19, 2025

Fraud is Not Just Rarity: A Causal Prototype Attention Approach to Realistic Synthetic Oversampling

Claudio Giusti, Luca Guarnera, Mirko Casu, Sebastiano Battiato

arXiv:2507.14706v14.1

Originality: Incremental advance

AI Analysis

The paper addresses fraud detection in credit card transactions, a critical domain-specific problem, but the approach appears incremental because it builds on existing generative models and oversampling techniques.

The paper tackles the detection of fraudulent credit card transactions by proposing the Causal Prototype Attention Classifier (CPAC), which aims to improve latent cluster separation and classification performance. It reports an F1-score of 93.14% and a recall of 90.18%.
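The summary reports F1 and recall but not precision. As a rough sanity check, the sketch below solves the F1 definition, F1 = 2PR / (P + R), for the precision P. It assumes both figures refer to the same class and the same decision threshold, which the summary does not state; the function name `implied_precision` is hypothetical.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2 * P * R / (P + R) for the precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        # No valid precision exists when F1 >= 2 * recall.
        raise ValueError("F1 and recall are inconsistent")
    return f1 * recall / denom


# Figures quoted in the summary, expressed as fractions.
p = implied_precision(0.9314, 0.9018)
print(f"implied precision: {p:.4f}")
```

Under that assumption the implied precision is about 96.3%, so the reported operating point would favor precision slightly over recall.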

Detecting fraudulent credit card transactions remains a significant challenge due to the extreme class imbalance in real-world data and the often subtle patterns that separate fraud from legitimate activity. Existing research commonly attempts to address this by generating synthetic samples for the minority class using approaches such as GANs, VAEs, or hybrid generative models. However, these techniques, particularly when applied only to minority-class data, tend to result in overconfident classifiers and poor latent cluster separation, ultimately limiting real-world detection performance. In this study, we propose the Causal Prototype Attention Classifier (CPAC), an interpretable architecture that promotes class-aware clustering and improved latent space structure through prototype-based attention mechanisms. We couple CPAC with the encoder of a VAE-GAN, which yields better cluster separation and moves beyond post-hoc sample augmentation. We compared CPAC-augmented models to traditional oversamplers, such as SMOTE, as well as to state-of-the-art generative models, both with and without CPAC-based latent classifiers. Our results show that classifier-guided latent shaping with CPAC delivers superior performance, achieving an F1-score of 93.14% and recall of 90.18%, along with improved latent cluster separation. Further ablation studies and visualizations provide deeper insight into the benefits and limitations of classifier-driven representation learning for fraud detection. The codebase for this work will be available at final submission.
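The abstract does not describe CPAC's internals, so the following is only a generic illustration of prototype-based attention over latent codes, not the authors' architecture. It assumes one fixed prototype per class, uses negative squared Euclidean distance as the similarity score, and omits both the causal component and any training; the function name, the prototypes, and the latent vectors are all made up for the example.

```python
import math


def prototype_attention(z, prototypes, temperature=1.0):
    """Softmax attention weights of one latent vector over class prototypes.

    Similarity is the negative squared Euclidean distance (an assumption
    made for this sketch, not a detail taken from the paper).
    """
    scores = [
        -sum((zi - pi) ** 2 for zi, pi in zip(z, p)) / temperature
        for p in prototypes
    ]
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2-D prototypes: index 0 = legitimate, index 1 = fraud.
prototypes = [[0.0, 0.0], [3.0, 3.0]]
# Hypothetical latent codes, as if produced by an encoder.
latents = [[0.1, -0.2], [2.8, 3.1]]

predictions = []
for z in latents:
    weights = prototype_attention(z, prototypes)
    # Predict the class whose prototype receives the most attention.
    predictions.append(max(range(len(weights)), key=weights.__getitem__))

print(predictions)
```

Because the attention weights depend only on distances to named prototypes, each prediction can be read as "closest to the fraud prototype" or "closest to the legitimate prototype", which is the kind of interpretability prototype-based classifiers are usually chosen for.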
