LGAug 1, 2023

Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions

arXiv:2308.00273v24 citationsh-index: 45
Originality Highly original
AI Analysis

This work addresses the challenge of efficiently learning distance functions for complex objects like point sets in machine learning, with potential applications in geometric optimization problems.

The paper tackles the problem of approximating Wasserstein distance between point sets with a neural network that is invariant to group actions like permutations, introducing a universal architecture for symmetric and factorwise group invariant functions and achieving model complexity independent of input sizes. The result shows competitive or better performance compared to state-of-the-art methods, with significantly better generalization and faster training.

Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions e.g. permutation or rigid transformation. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (or SFGI functions in short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network which can approximate the $p$-th Wasserstein distance between point sets. Very importantly, the required model complexity is independent of the sizes of input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparatively or better than other models (including a SOTA Siamese Autoencoder based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network design for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes