LG | AI | Apr 10, 2024

Private Wasserstein Distance

arXiv:2404.06787v2 | 2 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns for entities that need to compare datasets without exposing them, though it appears incremental because it builds on existing privacy-preserving methods.

The paper tackles the problem of computing the Wasserstein distance in privacy-sensitive environments where raw data cannot be shared. It introduces TriangleWad, which keeps raw data hidden while achieving high estimation accuracy on image and text tasks.

Wasserstein distance is a key metric for quantifying data divergence from a distributional perspective. However, its application in privacy-sensitive environments, where direct sharing of raw data is prohibited, presents significant challenges. Existing approaches, such as Differential Privacy and Federated Optimization, have been employed to estimate the Wasserstein distance under such constraints. However, these methods often fall short when both accuracy and security are required. In this study, we explore the inherent triangular properties within the Wasserstein space, leading to a novel solution named TriangleWad. This approach facilitates the fast computation of the Wasserstein distance between datasets stored across different entities, ensuring that raw data remain completely hidden. TriangleWad not only strengthens resistance to potential attacks but also preserves high estimation accuracy. Through extensive experiments across various tasks involving both image and text data, we demonstrate its superior performance and significant potential for real-world applications.
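The "triangular properties" mentioned in the abstract rest on the fact that the Wasserstein distance is a metric, so it satisfies the triangle inequality. The sketch below illustrates only that property on one-dimensional toy samples. It is not the TriangleWad protocol, and it has no privacy guarantees. The function name `wasserstein_1d`, the variable names, and the Gaussian toy data are all hypothetical choices made for illustration.

```python
import random


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    Illustrative helper (not from the paper): with uniform weights in one
    dimension, the optimal coupling pairs the sorted values of each sample.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size samples")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Toy data (assumption): two parties hold samples from shifted Gaussians,
# and a third sample sits between them.
rng = random.Random(0)
party_a = [rng.gauss(0.0, 1.0) for _ in range(500)]
party_b = [rng.gauss(2.0, 1.0) for _ in range(500)]
intermediate = [rng.gauss(1.0, 1.0) for _ in range(500)]

d_ab = wasserstein_1d(party_a, party_b)
d_ac = wasserstein_1d(party_a, intermediate)
d_cb = wasserstein_1d(intermediate, party_b)

# Triangle inequality: the direct distance never exceeds the detour.
print(f"W(A,B) = {d_ab:.3f}")
print(f"W(A,C) + W(C,B) = {d_ac + d_cb:.3f}")
print("triangle inequality holds:", d_ab <= d_ac + d_cb + 1e-12)
```

Here the two parties' samples are compared directly, which is exactly what a privacy-sensitive setting forbids. The point is only that distances to a third distribution bound the distance between the two parties' data.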

Code Implementations: 1 repo
