LG · DS · OC · Jul 14, 2025

Distributionally Robust Optimization with Adversarial Data Contamination

arXiv:2507.10718v2 · 2 citations · h-index: 5
AI Analysis

This work addresses the dual challenges of data contamination and distributional shifts for decision-making under uncertainty, providing the first rigorous guarantees with efficient computation.

The paper addresses the vulnerability of distributionally robust optimization to adversarial data contamination. It introduces a framework that combines robustness to contamination with robustness to distributional shifts, and achieves an estimation error of O(√ε) for the true objective value under a bounded covariance assumption, where ε is the corrupted fraction of the training data.

Distributionally Robust Optimization (DRO) provides a framework for decision-making under distributional uncertainty, yet its effectiveness can be compromised by outliers in the training data. This paper introduces a principled approach to simultaneously address both challenges. We focus on optimizing Wasserstein-1 DRO objectives for generalized linear models with convex Lipschitz loss functions, where an $\epsilon$-fraction of the training data is adversarially corrupted. Our primary contribution lies in a novel modeling framework that integrates robustness against training data contamination with robustness against distributional shifts, alongside an efficient algorithm inspired by robust statistics to solve the resulting optimization problem. We prove that our method achieves an estimation error of $O(\sqrt{\epsilon})$ for the true DRO objective value using only the contaminated data under the bounded covariance assumption. This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the dual challenges of data contamination and distributional shifts.
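To make the setting concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hinge loss, perturbations of features only, a Euclidean transport cost, and an unbounded feature space. Under those assumptions, a well-known reformulation from the Wasserstein DRO literature writes the Wasserstein-1 worst-case risk of a linear model as the empirical risk plus `rho * ||theta||`. The function names, the toy data, and the outlier point are all hypothetical. The sketch only illustrates how an $\epsilon$-fraction of corrupted points can distort a naively computed DRO objective.

```python
import numpy as np

def hinge(margins):
    """Convex, 1-Lipschitz loss applied to margins y * <theta, x>."""
    return np.maximum(0.0, 1.0 - margins)

def w1_dro_objective(theta, X, y, rho):
    """Empirical hinge risk plus rho * ||theta||_2.

    Assumption (not from the paper): features-only perturbations,
    Euclidean transport cost, unbounded feature space.
    """
    margins = y * (X @ theta)
    return hinge(margins).mean() + rho * np.linalg.norm(theta)

def contaminate(X, y, eps, outlier_x, outlier_y):
    """Hypothetical adversary: overwrite an eps-fraction of the samples."""
    Xc, yc = X.copy(), y.copy()
    k = int(round(eps * len(y)))
    Xc[:k] = outlier_x
    yc[:k] = outlier_y
    return Xc, yc

# Toy data: every clean point is classified with margin 2, so its hinge loss is 0.
n, rho, eps = 100, 0.5, 0.1
theta = np.array([1.0, 0.0])
X = np.tile(np.array([2.0, 0.0]), (n, 1))
y = np.ones(n)

clean = w1_dro_objective(theta, X, y, rho)

# Each corrupted point has margin -100 and therefore hinge loss 101.
Xc, yc = contaminate(X, y, eps, np.array([100.0, 0.0]), -1.0)
dirty = w1_dro_objective(theta, Xc, yc, rho)

print(clean, dirty)
```

In this toy case the adversary can push the naive objective arbitrarily far from the clean value by moving the outliers further away. That is the failure mode that motivates estimating the objective robustly from contaminated data.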
