LGAIGTApr 2, 2023

Optimizing Data Shapley Interaction Calculation from O(2^n) to O(t n^2) for KNN models

arXiv:2304.01224v13 citationsh-index: 69
Originality Incremental advance
AI Analysis

This addresses the computational bottleneck in data valuation for AI practitioners, enabling faster training set optimization, though it is incremental as it builds on existing Shapley methods.

The paper tackled the problem of efficiently calculating exact pair-interaction Shapley values for data valuation in KNN models, reducing time complexity from O(2^n) to O(t n^2) with the STI-KNN algorithm.

With the rapid growth of data availability and usage, quantifying the added value of each training data point has become a crucial process in the field of artificial intelligence. The Shapley values have been recognized as an effective method for data valuation, enabling efficient training set summarization, acquisition, and outlier removal. In this paper, we introduce "STI-KNN", an innovative algorithm that calculates the exact pair-interaction Shapley values for KNN models in O(t n^2) time, which is a significant improvement over the O(2^n)$ time complexity of baseline methods. By using STI-KNN, we can efficiently and accurately evaluate the value of individual data points, leading to improved training outcomes and ultimately enhancing the effectiveness of artificial intelligence applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes