ML LGApr 11, 2015

Quick sensitivity analysis for incremental data modification and its application to leave-one-out CV in linear classification problems

Shota Okumura, Yoshiki Suzuki, Ichiro Takeuchi

arXiv:1504.02870v17.018 citations

Originality Incremental advance

AI Analysis

This work addresses computational bottlenecks for researchers and practitioners in machine learning dealing with large datasets and incremental updates, though it is incremental as it builds on existing incremental learning methods.

The authors tackled the computational inefficiency of incremental learning for large-scale classification when only a few instances are modified, by proposing a sensitivity analysis framework that provides bounds on the updated classifier without re-optimization, with computational cost depending only on the number of updated instances and bounds often being sufficiently tight.

We introduce a novel sensitivity analysis framework for large scale classification problems that can be used when a small number of instances are incrementally added or removed. For quickly updating the classifier in such a situation, incremental learning algorithms have been intensively studied in the literature. Although they are much more efficient than solving the optimization problem from scratch, their computational complexity yet depends on the entire training set size. It means that, if the original training set is large, completely solving an incremental learning problem might be still rather expensive. To circumvent this computational issue, we propose a novel framework that allows us to make an inference about the updated classifier without actually re-optimizing it. Specifically, the proposed framework can quickly provide a lower and an upper bounds of a quantity on the unknown updated classifier. The main advantage of the proposed framework is that the computational cost of computing these bounds depends only on the number of updated instances. This property is quite advantageous in a typical sensitivity analysis task where only a small number of instances are updated. In this paper we demonstrate that the proposed framework is applicable to various practical sensitivity analysis tasks, and the bounds provided by the framework are often sufficiently tight for making desired inferences.

View on arXiv PDF

Similar