SYAIJun 4, 2024

Fast networked data selection via distributed smoothed quantile estimation

arXiv:2406.01929v1
Originality Incremental advance
AI Analysis

This addresses the challenge of efficient data selection in distributed systems for fields like control and machine learning, though it appears incremental as it builds on existing smoothing techniques for optimization.

The paper tackles the problem of selecting the most informative data from a distributed network by connecting it to top-k selection and formulating it as a distributed nonsmooth convex optimization for quantile estimation, resulting in an accelerated method with improved convergence and scalability validated by numerical results.

Collecting the most informative data from a large dataset distributed over a network is a fundamental problem in many fields, including control, signal processing and machine learning. In this paper, we establish a connection between selecting the most informative data and finding the top-$k$ elements of a multiset. The top-$k$ selection in a network can be formulated as a distributed nonsmooth convex optimization problem known as quantile estimation. Unfortunately, the lack of smoothness in the local objective functions leads to extremely slow convergence and poor scalability with respect to the network size. To overcome the deficiency, we propose an accelerated method that employs smoothing techniques. Leveraging the piecewise linearity of the local objective functions in quantile estimation, we characterize the iteration complexity required to achieve top-$k$ selection, a challenging task due to the lack of strong convexity. Several numerical results are provided to validate the effectiveness of the algorithm and the correctness of the theory.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes