ML · LG · Feb 22, 2023

A Note on "Towards Efficient Data Valuation Based on the Shapley Value"

arXiv:2302.11431v1 · 14 citations · h-index: 30
Originality: Synthesis-oriented
AI Analysis

This work provides incremental technical improvements for researchers developing efficient data valuation methods, focusing on computational challenges.

The paper addresses the computational cost of Shapley value estimation for data valuation. It analyzes and improves an existing Group Testing-based estimator, shows that the estimator does not fully reuse its collected samples, and offers insights for designing more efficient algorithms.

The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) proposed an advanced SV estimation algorithm called the "Group Testing-based SV estimator", which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.
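To make the computational cost concrete, the following is a minimal sketch of exact SV computation from the standard Shapley definition (a weighted average of marginal contributions over all subsets). It is not the Group Testing-based estimator discussed in the note. The names `exact_shapley` and `toy_utility` are hypothetical, and the utility function is an invented three-point example rather than anything from the paper.

```python
from itertools import combinations
from math import comb


def exact_shapley(n, utility):
    """Exact Shapley values for data points 0..n-1.

    `utility` maps a frozenset of point indices to a float.
    The loop visits every subset of the other n-1 points for each point,
    so the number of utility evaluations grows exponentially in n.
    """
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset S that excludes i
            weight = 1.0 / (n * comb(n - 1, k))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of point i to subset S
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


def toy_utility(s):
    # Hypothetical utility: points 0 and 1 are only useful together,
    # and point 2 contributes nothing.
    return 1.0 if {0, 1} <= s else 0.0


shapley = exact_shapley(3, toy_utility)
```

In this toy setting, points 0 and 1 split the total utility equally and point 2 receives zero. With real datasets each utility evaluation typically means retraining a model, which is why sample-efficient estimators are of interest.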

