Federated and Privacy-Preserving Learning of Accounting Data in Financial Statement Audits
This work addresses data privacy concerns for audit firms using deep learning, enabling compliance with regulations while improving anomaly detection in financial statements.
The paper tackles the challenge of applying deep learning in financial audits while complying with strict data confidentiality regulations by proposing a federated learning framework with differential privacy and split learning. The results show that auditors can detect accounting anomalies using models trained on proprietary client data from multiple sources, as demonstrated on three real-world datasets of city payments.
The ongoing 'digital transformation' fundamentally changes the nature, recording, and volume of audit evidence. The International Standards on Auditing (ISA) require auditors to examine vast volumes of the digital accounting records underlying a financial statement. As a result, audit firms are 'digitizing' their analytical capabilities and investing in Deep Learning (DL), a successful sub-discipline of Machine Learning. DL makes it possible to learn specialized audit models from the data of multiple clients, e.g., organizations operating in the same industry or jurisdiction. At the same time, regulations require auditors to adhere to strict data confidentiality measures, and recent findings show that large-scale DL models are vulnerable to leaking sensitive information about their training data. It therefore often remains unclear how audit firms can apply DL models while complying with data protection regulations. In this work, we propose a Federated Learning framework to train DL models on audit-relevant accounting data of multiple clients. The framework encompasses Differential Privacy and Split Learning capabilities to mitigate data confidentiality risks at model inference. We evaluate our approach on the detection of accounting anomalies in three real-world datasets of city payments. Our results provide empirical evidence that auditors can benefit from DL models that accumulate knowledge from multiple sources of proprietary client data.
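To make the idea of federated training with a privacy safeguard more concrete, the following is a minimal sketch of one common pattern: each client computes a local model update on its own data, the update is clipped to a fixed norm, and the server averages the clipped updates and adds Gaussian noise. This is not the paper's implementation. The abstract does not specify the aggregation rule, the noise mechanism, or the model, so the toy "model" (a vector fitted to the mean of the payment features), the function names `clip_update`, `local_update`, and `federated_round`, and all parameter values are assumptions made for illustration. Split Learning is not shown.

```python
import numpy as np


def clip_update(update, max_norm):
    """Rescale an update so that its L2 norm does not exceed max_norm."""
    norm = np.linalg.norm(update)
    if norm > max_norm:
        return update * (max_norm / norm)
    return update


def local_update(weights, data, lr=0.5, steps=5):
    """Hypothetical client step: a few gradient steps on 0.5 * ||x - w||^2.

    Returns only the change in weights, so raw records never leave the client.
    """
    w = weights.copy()
    for _ in range(steps):
        grad = w - data.mean(axis=0)
        w -= lr * grad
    return w - weights


def federated_round(weights, client_data, max_norm=1.0,
                    noise_multiplier=0.0, rng=None):
    """One round: clip each client update, average, add Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    deltas = [clip_update(local_update(weights, d), max_norm)
              for d in client_data]
    mean_delta = np.mean(deltas, axis=0)
    # Noise scale is tied to the clipping norm (assumed calibration).
    sigma = noise_multiplier * max_norm / len(client_data)
    noise = rng.normal(0.0, sigma, size=weights.shape)
    return weights + mean_delta + noise


# Illustrative usage with synthetic data standing in for two audit clients.
rng = np.random.default_rng(42)
clients = [rng.normal(1.0, 0.1, size=(50, 3)),
           rng.normal(3.0, 0.1, size=(50, 3))]
w = np.zeros(3)
for _ in range(10):
    w = federated_round(w, clients, max_norm=5.0,
                        noise_multiplier=0.1, rng=rng)
```

In this sketch, `max_norm` bounds how much any single client can move the shared model, and `noise_multiplier` trades model accuracy against how much an individual client's data can be inferred from the result. Setting `noise_multiplier=0.0` reduces the round to plain averaging of clipped updates.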