Model Repair: Robust Recovery of Over-Parameterized Statistical Models
The paper addresses a robust estimation problem relevant to machine learning practitioners whose fitted models may be corrupted, though the contribution appears incremental because it builds on existing robust estimation concepts.
The paper tackles the problem of recovering a statistical model that has been corrupted after estimation. It proposes methods to 'repair' the model using only the design data, without the response values. The results show that over-parameterized models and redundant estimators such as stochastic gradient descent enable effective repair, whereas sparse estimators generally do not.
A new type of robust estimation problem is introduced where the goal is to recover a statistical model that has been corrupted after it has been estimated from data. Methods are proposed for "repairing" the model using only the design and not the response values used to fit the model in a supervised learning setting. Theory is developed which reveals that two important ingredients are necessary for model repair---the statistical model must be over-parameterized, and the estimator must incorporate redundancy. In particular, estimators based on stochastic gradient descent are seen to be well suited to model repair, but sparse estimators are not in general repairable. After formulating the problem and establishing a key technical lemma related to robust estimation, a series of results are presented for repair of over-parameterized linear models, random feature models, and artificial neural networks. Simulation studies are presented that corroborate and illustrate the theoretical findings.
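The role of over-parameterization and redundancy can be illustrated with a small linear example. The sketch below is one plausible instance of the idea, not necessarily the paper's exact procedure, and it rests on several assumptions that the abstract does not state. The estimator is the minimum-norm interpolator, which lies in the row space of the design matrix. The corruption adds large noise to a random 10% of the coefficients. The repair step is an L1 (least absolute deviations) regression of the corrupted coefficients onto the rows of the design, approximated here by iteratively reweighted least squares. All function names and parameter values are hypothetical.

```python
import numpy as np

def min_norm_fit(X, y):
    """Minimum-norm interpolator theta = X^T (X X^T)^{-1} y (assumed estimator).

    With more parameters than samples, theta lies in the row space of X,
    which is the redundancy the repair step exploits.
    """
    return X.T @ np.linalg.solve(X @ X.T, y)

def corrupt(theta, frac, scale, rng):
    """Add large Gaussian noise to a random fraction of coordinates (assumed corruption model)."""
    eta = theta.copy()
    k = int(frac * theta.size)
    idx = rng.choice(theta.size, size=k, replace=False)
    eta[idx] += scale * rng.standard_normal(k)
    return eta

def repair(X, eta, n_iter=100, eps=1e-8):
    """Approximate argmin_u ||eta - X^T u||_1 by IRLS and return X^T u.

    Only the design X and the corrupted coefficients eta are used;
    the responses y are never touched.
    """
    A = X.T                                        # shape (p, n)
    u = np.linalg.lstsq(A, eta, rcond=None)[0]     # least-squares start
    for _ in range(n_iter):
        r = eta - A @ u
        # Square roots of the IRLS weights 1 / max(|r_i|, eps)
        sw = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))
        u = np.linalg.lstsq(A * sw[:, None], eta * sw, rcond=None)[0]
    return A @ u

rng = np.random.default_rng(0)
n, p = 20, 400                                     # over-parameterized: p >> n
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

theta_hat = min_norm_fit(X, y)
eta = corrupt(theta_hat, frac=0.1, scale=10.0, rng=rng)
theta_rep = repair(X, eta)

print("error before repair:", np.linalg.norm(eta - theta_hat))
print("error after repair: ", np.linalg.norm(theta_rep - theta_hat))
```

In this setup the 400 corrupted coefficients are explained by only 20 unknowns, so the clean coordinates heavily outvote the corrupted ones. A sparse estimator would not generally have this row-space structure, which is consistent with the abstract's remark that sparse estimators are not in general repairable.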