Ruyu Zhou

h-index: 40
2 papers

2 Papers

CV · Jan 14, 2024
Efficient approximation of Earth Mover's Distance Based on Nearest Neighbor Search

Guangyu Meng, Ruyu Zhou, Liu Liu et al.

Earth Mover's Distance (EMD) is an important similarity measure between two distributions, used in computer vision and many other application domains. However, its exact calculation is computationally and memory intensive, which hinders its scalability and applicability to large-scale problems. Various approximate EMD algorithms have been proposed to reduce computational costs, but they suffer from lower accuracy and may require additional memory or manual parameter tuning. In this paper, we present a novel approach, NNS-EMD, that approximates EMD using Nearest Neighbor Search (NNS) to achieve high accuracy, low time complexity, and high memory efficiency. The NNS operation reduces the number of data points compared in each NNS iteration and offers opportunities for parallel processing. We further accelerate NNS-EMD via vectorization on GPU, which is especially beneficial for large datasets. We compare NNS-EMD with both the exact EMD and state-of-the-art approximate EMD algorithms on image classification and retrieval tasks. We also apply NNS-EMD to calculate transport mapping and realize color transfer between images. NNS-EMD can be 44x to 135x faster than the exact EMD implementation, and achieves superior accuracy, speedup, and memory efficiency over existing approximate EMD methods.
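
To make the general idea of approximating EMD with nearest-neighbor matching concrete, here is a minimal NumPy sketch. It is not the authors' NNS-EMD algorithm; it is a generic greedy scheme written under assumptions of our own: Euclidean ground distance, equal total mass on both sides, and a hypothetical function name `greedy_nn_emd`. In each round every remaining target point looks up its nearest remaining source point, and as much mass as possible is moved along those pairs, so the set of active points shrinks from round to round.

```python
import numpy as np

def greedy_nn_emd(x, wx, y, wy, eps=1e-12):
    """Greedy nearest-neighbor approximation of EMD (illustrative only).

    x, y   : arrays of shape (n, d) and (m, d) holding point coordinates
    wx, wy : non-negative weights, assumed to sum to the same total
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    wx = np.asarray(wx, dtype=float).copy()
    wy = np.asarray(wy, dtype=float).copy()
    # Pairwise Euclidean ground distances between all source/target points.
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    cost = 0.0
    while True:
        active_x = np.flatnonzero(wx > eps)  # sources with mass left
        active_y = np.flatnonzero(wy > eps)  # targets still unfilled
        if active_x.size == 0 or active_y.size == 0:
            break
        sub = dist[np.ix_(active_x, active_y)]
        nearest = active_x[sub.argmin(axis=0)]  # nearest source per target
        # Serve the closest pairs first within this round.
        for k in np.argsort(sub.min(axis=0)):
            i, j = nearest[k], active_y[k]
            flow = min(wx[i], wy[j])
            if flow <= 0.0:
                continue  # this source was exhausted earlier in the round
            cost += flow * dist[i, j]
            wx[i] -= flow
            wy[j] -= flow
    return cost

# One source at 0 with mass 1, two targets at 1 and 2 with mass 0.5 each.
example_cost = greedy_nn_emd([[0.0]], [1.0], [[1.0], [2.0]], [0.5, 0.5])
```

In this small one-dimensional example the greedy assignment matches the exact transport cost; in general a greedy scheme like this only yields an upper bound on the exact EMD.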

ML · Dec 20, 2023
Enhancing Trade-offs in Privacy, Utility, and Computational Efficiency through MUltistage Sampling Technique (MUST)

Xingyuan Zhao, Ruyu Zhou, Fang Liu

Applying a randomized algorithm to a subset rather than the entire dataset amplifies privacy guarantees. We propose a class of subsampling methods, the "MUltistage Sampling Technique (MUST)", for privacy amplification (PA) in the context of differential privacy (DP). We conduct comprehensive analyses of the PA effects and utility for several 2-stage MUST procedures through newly introduced concepts, including strong vs. weak PA effects and the aligned privacy profile. We provide the privacy loss composition analysis over repeated applications of MUST via the Fourier accountant algorithm. Our theoretical and empirical results suggest that MUST offers stronger PA in $ε$ than the common one-stage sampling procedures, including Poisson sampling, sampling without replacement, and sampling with replacement, while the results on $δ$ vary case by case. Our experiments show that MUST is non-inferior in the utility and stability of privacy-preserving (PP) outputs to one-stage subsampling methods at similar privacy loss, while enhancing the computational efficiency of algorithms that require complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms or procedures that involve parallel or simultaneous subsampling when DP guarantees are necessary.
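
As a rough illustration of what a two-stage subsampling step can look like, the sketch below draws a first-stage subset without replacement and then resamples from it with replacement. This particular combination of stages, the function name `two_stage_sample`, and the sizes used are assumptions made for illustration; the abstract does not specify which 2-stage procedures the paper analyzes, and the sketch makes no claim about the resulting privacy guarantees.

```python
import numpy as np

def two_stage_sample(n, m, b, rng):
    """Hypothetical two-stage subsample of indices from a dataset of size n.

    Stage 1: draw m distinct indices without replacement.
    Stage 2: draw b indices with replacement from the stage-1 subset.
    """
    stage1 = rng.choice(n, size=m, replace=False)
    stage2 = rng.choice(stage1, size=b, replace=True)
    return stage2

rng = np.random.default_rng(0)
idx = two_stage_sample(n=1000, m=50, b=200, rng=rng)

# The final sample has b entries but at most m distinct data points, so an
# expensive per-point computation needs to run at most m times.
distinct = np.unique(idx)
```

The count of distinct indices is what relates to the efficiency remark in the abstract: a costly per-point function can be evaluated once per distinct index and reused for the repeats.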