ML · LG · Jan 7, 2025

Generation from Noisy Examples

arXiv:2501.04179v2 · 16 citations · h-index: 2 · ICML
Originality: Incremental advance
AI Analysis

This theoretical work is aimed at researchers studying generation from noisy data. It is an incremental advance because it extends existing noiseless frameworks to noisy example streams.

The paper tackles the problem of generating new positive examples from a binary hypothesis class when the example stream includes a finite number of noisy negative examples, extending prior noiseless results. It provides necessary and sufficient conditions for noisy generatability, showing that for finite and countable classes, generatability is largely unaffected by the noise.

We continue to study the learning-theoretic foundations of generation by extending the results from Kleinberg and Mullainathan [2024] and Li et al. [2024] to account for noisy example streams. In the noiseless setting of Kleinberg and Mullainathan [2024] and Li et al. [2024], an adversary picks a hypothesis from a binary hypothesis class and provides a generator with a sequence of its positive examples. The goal of the generator is to eventually output new, unseen positive examples. In the noisy setting, an adversary still picks a hypothesis and a sequence of its positive examples. But, before presenting the stream to the generator, the adversary inserts a finite number of negative examples. Unaware of which examples are noisy, the goal of the generator is to still eventually output new, unseen positive examples. In this paper, we provide necessary and sufficient conditions for when a binary hypothesis class can be noisily generatable. We provide such conditions with respect to various constraints on the number of distinct examples that need to be seen before perfect generation of positive examples. Interestingly, for finite and countable classes we show that generatability is largely unaffected by the presence of a finite number of noisy examples.
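The noisy game described above can be made concrete with a small simulation. The following is a minimal sketch under strong assumptions, not the paper's algorithm: the hypothesis class (multiples of 2, 3, or 5), the function names, and in particular the known bound `noise_bound` on the number of noisy examples are all hypothetical choices made for illustration. The generator discards any hypothesis that disagrees with more than `noise_bound` distinct seen examples, then outputs the smallest unseen number that every surviving hypothesis labels positive.

```python
from itertools import count

# Hypothetical toy class: hypothesis k labels x positive iff x is a positive multiple of k.
HYPOTHESES = (2, 3, 5)


def is_positive(k, x):
    """Return True if hypothesis k labels x as a positive example."""
    return x > 0 and x % k == 0


def generate(seen, noise_bound):
    """Output an unseen example that every surviving hypothesis labels positive.

    Assumption (not from the paper): the generator knows an upper bound
    `noise_bound` on the number of noisy examples in the stream.
    """
    seen_set = set(seen)
    # A hypothesis survives if at most `noise_bound` distinct seen examples contradict it.
    # The true hypothesis always survives when the bound is respected.
    survivors = [
        k for k in HYPOTHESES
        if sum(not is_positive(k, x) for x in seen_set) <= noise_bound
    ]
    # The survivors' positive sets share infinitely many elements (common multiples),
    # so this search terminates.
    for x in count(1):
        if x not in seen_set and all(is_positive(k, x) for k in survivors):
            return x


def play(true_k, noise, rounds, noise_bound):
    """Simulate the game: the adversary shows the noisy examples first, then positives."""
    stream = list(noise) + [true_k * i for i in range(1, rounds + 1)]
    seen, outputs = [], []
    for x in stream:
        seen.append(x)
        outputs.append(generate(seen, noise_bound))
    return outputs


if __name__ == "__main__":
    outs = play(true_k=3, noise=[7, 10], rounds=10, noise_bound=2)
    print(outs)
    print(all(is_positive(3, o) for o in outs))
```

In this toy class every output is already a positive example of the true hypothesis, because the generator only emits elements common to all surviving hypotheses. That shortcut depends on the hypotheses sharing infinitely many positives and on the noise bound being known; the paper's conditions cover settings where neither is given.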

