Bayes classifier cannot be learned from noisy responses with unknown noise rates
This result identifies a fundamental limitation for practitioners who train classifiers on noisy labels: the requirement that the learner specify the noise distribution generally cannot be relaxed.
The paper shows that the Bayes decision rule is generally unidentifiable in classification problems with noisy labels, meaning it cannot be learned without knowledge of the noise distribution, and provides a simple algorithm for the special cases where it is identifiable.
Training a classifier with noisy labels typically requires the learner to specify the distribution of label noise, which is often unknown in practice. Although there have been some recent attempts to relax that requirement, we show that the Bayes decision rule is unidentified in most classification problems with noisy labels. This suggests it is generally not possible to bypass or relax the requirement. In the special cases in which the Bayes decision rule is identified, we develop a simple algorithm that learns it without requiring knowledge of the noise distribution.
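To make the identification problem concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's construction: it assumes binary labels, a feature taking three values, and class-conditional noise in which a clean 0 flips to 1 with probability rho0 and a clean 1 flips to 0 with probability rho1. The function names and all numbers are hypothetical. The two models below produce exactly the same distribution of noisy labels given the feature, yet their Bayes decision rules disagree, so noisy data alone cannot tell them apart.

```python
import math

def noisy_posterior(eta, rho0, rho1):
    """P(noisy label = 1 | x) when clean 0s flip with prob. rho0 and clean 1s with prob. rho1."""
    return [rho0 + (1.0 - rho0 - rho1) * e for e in eta]

def bayes_rule(eta):
    """Bayes decision rule for the clean label: predict 1 where P(Y = 1 | x) > 1/2."""
    return [1 if e > 0.5 else 0 for e in eta]

# Model A (hypothetical): no label noise; clean posterior at x = 0, 1, 2.
eta_a, rho_a = [0.4, 0.6, 0.9], (0.0, 0.0)

# Model B (hypothetical): 30% of clean 0s are flipped to 1, no flips of clean 1s.
eta_b, rho_b = [1 / 7, 3 / 7, 6 / 7], (0.3, 0.0)

noisy_a = noisy_posterior(eta_a, *rho_a)
noisy_b = noisy_posterior(eta_b, *rho_b)

# The observable noisy-label distribution is the same under both models.
same_observables = all(math.isclose(a, b) for a, b in zip(noisy_a, noisy_b))

rule_a = bayes_rule(eta_a)
rule_b = bayes_rule(eta_b)

print("same noisy distribution:", same_observables)
print("Bayes rule under A:", rule_a)
print("Bayes rule under B:", rule_b)
```

In this sketch the two rules differ at x = 1, where model A predicts 1 and model B predicts 0, even though no amount of noisy data could distinguish the models. If the noise rates were known, the clean posterior could be recovered by inverting the affine map in `noisy_posterior`, which is why specifying the noise distribution resolves the ambiguity.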