The geometry of financial institutions -- Wasserstein clustering of financial data
The paper gives financial regulators a method for processing complex data, but the contribution is incremental: it adapts existing clustering techniques to a specific domain.
The paper tackles the challenge of condensing granular financial data for regulatory monitoring. It proposes a variant of Lloyd's algorithm that uses Wasserstein barycenters to create a metric space for clustering, and it demonstrates the method in a financial regulation setting, where it is used to handle missing values and to cluster on specific features.
The increasing availability of granular and big data on various objects of interest has made it necessary to develop methods for condensing this information into a representative and intelligible map. Financial regulation is a field that exemplifies this need, as regulators require diverse and often highly granular data from financial institutions to monitor and assess their activities. However, processing and analyzing such data can be a daunting task, especially given the challenges of dealing with missing values and identifying clusters based on specific features. To address these challenges, we propose a variant of Lloyd's algorithm that applies to probability distributions and uses generalized Wasserstein barycenters to construct a metric space which represents given data on various objects in condensed form. By applying our method to the financial regulation context, we demonstrate its usefulness in dealing with the specific challenges faced by regulators in this domain. We believe that our approach can also be applied more generally to other fields where large and complex data sets need to be represented in concise form.
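To make the idea of a Lloyd-type iteration on probability distributions concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's method. It assumes each object is described by a single one-dimensional empirical sample with no missing values. In that special case the 2-Wasserstein distance is the L2 distance between quantile functions, and the Wasserstein barycenter of a cluster is the distribution whose quantile function is the average of the members' quantile functions. The generalized barycenters described in the abstract are not implemented here. All function names and parameters (`quantile_embedding`, `w2_squared`, `wasserstein_lloyd`, `n_quantiles`) are hypothetical and chosen for illustration.

```python
import numpy as np


def quantile_embedding(samples, n_quantiles=50):
    """Represent each 1D empirical distribution by its quantile function
    evaluated on a fixed grid of probability levels (an assumption of
    this sketch, valid only for one-dimensional data)."""
    grid = (np.arange(n_quantiles) + 0.5) / n_quantiles
    return np.stack(
        [np.quantile(np.asarray(s, dtype=float), grid) for s in samples]
    )


def w2_squared(q_a, q_b):
    """Approximate squared 2-Wasserstein distance between 1D distributions,
    computed as the mean squared difference of their quantile vectors."""
    return np.mean((q_a - q_b) ** 2, axis=-1)


def wasserstein_lloyd(samples, k, n_iter=50, n_quantiles=50, seed=0):
    """Lloyd-style alternation in 1D Wasserstein space:
    assign each distribution to its nearest center, then replace each
    center by the barycenter (averaged quantile function) of its members."""
    rng = np.random.default_rng(seed)
    Q = quantile_embedding(samples, n_quantiles)
    # Initialize centers with k distinct input distributions.
    centers = Q[rng.choice(len(Q), size=k, replace=False)]
    labels = np.full(len(Q), -1)
    for _ in range(n_iter):
        # Assignment step: distance of every distribution to every center.
        dists = w2_squared(Q[:, None, :], centers[None, :, :])  # shape (n, k)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: 1D Wasserstein barycenter of each non-empty cluster.
        for j in range(k):
            members = Q[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers
```

Calling `wasserstein_lloyd(samples, k=2)` on a list of one-dimensional arrays, one per institution, returns a cluster label for each array and the quantile vectors of the cluster barycenters. Like ordinary Lloyd iterations, the result depends on the random initialization, so in practice several seeds would typically be compared.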