ML · LG · ST · Dec 8, 2022

Statistical and Computational Guarantees for Influence Diagnostics

UW
arXiv:2212.04014v2 · 1 citation · h-index: 47
Originality: Incremental advance
AI Analysis

This work provides theoretical guarantees for influence diagnostics, which are used to identify influential data points in machine learning and AI applications. It is rated as incremental because it builds on existing methods.

The paper establishes finite-sample statistical bounds and computational complexity bounds for influence diagnostics, such as influence functions and approximate maximum influence perturbations, using efficient inverse-Hessian-vector product implementations, with results illustrated on generalized linear models and large attention-based models.

Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and in AI domain applications. Influence diagnostics are powerful statistical tools to identify influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approximate maximum influence perturbations using efficient inverse-Hessian-vector product implementations. We illustrate our results with generalized linear models and large attention-based models on synthetic and real data.
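To make the abstract's "inverse-Hessian-vector product" concrete, here is a minimal sketch of an influence-function computation for one generalized linear model, L2-regularized logistic regression. It is not the paper's implementation. The function names, the conjugate-gradient solver, the regularization strength `lam`, and the sign convention (the score approximates the change in test loss per unit of upweighting a training point, with the regularizer left out of the per-example gradients) are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(X, y, lam, steps=50):
    """Newton's method for mean logistic loss plus (lam / 2) * ||theta||^2."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / n + lam * theta
        H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
        theta -= np.linalg.solve(H, grad)
    return theta

def hvp(X, theta, lam, v):
    """Hessian-vector product H @ v without forming the d x d Hessian."""
    p = sigmoid(X @ theta)
    return X.T @ (p * (1 - p) * (X @ v)) / X.shape[0] + lam * v

def conjugate_gradient(matvec, b, iters=100, tol=1e-10):
    """Solve A x = b for symmetric positive definite A, given only x -> A x."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def influence_on_test_loss(X, y, theta, lam, x_test, y_test):
    """Score for each training point i: -grad_test^T H^{-1} grad_i."""
    g_test = (sigmoid(x_test @ theta) - y_test) * x_test
    # One inverse-Hessian-vector product, shared across all training points.
    s = conjugate_gradient(lambda v: hvp(X, theta, lam, v), g_test)
    per_example_grads = (sigmoid(X @ theta) - y)[:, None] * X
    return -per_example_grads @ s
```

Calling `influence_on_test_loss(X, y, fit_logistic(X, y, lam), lam, x_test, y_test)` returns one score per training row. The design choice worth noting is that the Hessian is only ever accessed through `hvp`, so the linear solve costs a handful of matrix-vector products instead of an explicit inverse; this is the kind of iterative approximation whose cost and accuracy the paper's bounds concern.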

Code Implementations: 1 repo
