LG · May 3, 2013

Learning from Imprecise and Fuzzy Observations: Data Disambiguation through Generalized Loss Minimization

arXiv:1305.0698v1 · 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning from imprecise and fuzzy observations. It is an incremental advance, relevant to researchers and practitioners who work with uncertain data.

The paper tackles the problem of learning from fuzzy data. It distinguishes between ontic and epistemic interpretations of fuzzy set-valued data and, focusing on the epistemic one, argues against fuzzifying learning algorithms through the generic extension principle. Instead, it proposes a method based on generalized loss functions in empirical risk minimization that performs model identification and data disambiguation simultaneously. The paper connects the resulting losses to well-known loss functions in regression and classification and illustrates the method with logistic regression for binary classification.

Methods for analyzing or learning from "fuzzy data" have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, and without carefully considering the interpretation of a fuzzy set when being used for modeling data. Distinguishing between an ontic and an epistemic interpretation of fuzzy set-valued data, and focusing on the latter, we argue that a "fuzzification" of learning algorithms based on an application of the generic extension principle is not appropriate. In fact, the extension principle fails to properly exploit the inductive bias underlying statistical and machine learning methods, although this bias, at least in principle, offers a means for "disambiguating" the fuzzy data. Alternatively, we therefore propose a method which is based on the generalization of loss functions in empirical risk minimization, and which performs model identification and data disambiguation simultaneously. Elaborating on the fuzzification of specific types of losses, we establish connections to well-known loss functions in regression and classification. We compare our approach with related methods and illustrate its use in logistic regression for binary classification.
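
To make the idea of simultaneous model identification and data disambiguation concrete, here is a minimal sketch for logistic regression. It rests on assumptions that the abstract does not spell out: the observations are crisp label sets rather than general fuzzy sets, and the generalized loss of a set-valued label is taken to be the smallest ordinary logistic loss over the candidates in that set. The names `superset_loss`, `disambiguate`, and `fit`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_loss(y, p):
    """Ordinary logistic loss for a precise label y in {0, 1}."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1.0 - p))

def superset_loss(candidates, p):
    """Assumed generalized loss: minimum ordinary loss over the candidate labels.

    Returns the loss and the candidate that attains it.
    """
    losses = [log_loss(y, p) for y in candidates]
    best = int(np.argmin(losses))
    return float(losses[best]), candidates[best]

def disambiguate(X, w, label_sets):
    """Pick, per example, the candidate label the current model finds most plausible."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    y = np.array([superset_loss(c, pi)[1] for c, pi in zip(label_sets, p)],
                 dtype=float)
    return p, y

def fit(X, label_sets, steps=500, lr=0.5):
    """Gradient descent on the mean generalized loss (illustrative settings)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Data disambiguation under the current model ...
        p, y = disambiguate(X, w, label_sets)
        # ... and model identification: a (sub)gradient step for those labels.
        w -= lr * X.T @ (p - y) / len(y)
    return w, disambiguate(X, w, label_sets)[1]

# Toy data: a bias column plus one feature; the two middle labels are ambiguous.
X = np.array([[1.0, x] for x in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)])
label_sets = [(0,), (0,), (0, 1), (0, 1), (1,), (1,)]

w, y = fit(X, label_sets)
print("weights:", w)
print("disambiguated labels:", y)
```

In this sketch the ambiguous labels are never resolved in a separate preprocessing step. Each iteration re-selects them using the current model, so the inductive bias of the linear model is what settles the ambiguity. Alternating updates of this kind can settle on a local solution, so the result may depend on the initialization and step size.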

