LG, ML | Oct 22, 2018

Introducing Curvature to the Label Space

arXiv:1810.09549v1 | 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses a foundational issue in supervised categorical classification, though the contribution appears incremental because it builds on existing encoding methods.

The paper tackles a geometric implication of one-hot encoding: because all classes are equally distant, they are treated as equally different, which is inconsistent with real-world tasks where morphological similarity varies across classes. It introduces curvature to the label space using a metric tensor as a learning-algorithm-agnostic solution, and proposes general constraints and specific parameterizations of that metric.

One-hot encoding is a labelling system that embeds classes as standard basis vectors in a label space. Despite seeing near-universal use in supervised categorical classification tasks, the scheme is problematic in its geometric implication that, as all classes are equally distant, all classes are equally different. This is inconsistent with most, if not all, real-world tasks due to the prevalence of ancestral and convergent relationships generating a varying degree of morphological similarity across classes. We address this issue by introducing curvature to the label-space using a metric tensor as a self-regulating method that better represents these relationships as a bolt-on, learning-algorithm agnostic solution. We propose both general constraints and specific statistical parameterizations of the metric and identify a direction for future research using autoencoder-based parameterizations.
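The geometric point in the abstract can be illustrated with a small sketch. Under the ordinary Euclidean metric, every pair of one-hot labels is exactly sqrt(2) apart; replacing the identity with a symmetric positive-definite matrix changes those pairwise distances. The class names, the numeric entries of `similarity_metric`, and the helper names below are all hypothetical and chosen only for illustration. The sketch uses a constant matrix as the simplest stand-in for a metric and does not reproduce the paper's constraints or statistical parameterizations.

```python
import math
from itertools import combinations


def one_hot(index, num_classes):
    """Return the standard basis vector for class `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def metric_distance(x, y, metric):
    """Distance sqrt((x - y)^T G (x - y)) under a constant matrix G."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    quad = sum(diff[i] * metric[i][j] * diff[j]
               for i in range(n) for j in range(n))
    return math.sqrt(quad)


def pairwise_distances(class_names, metric):
    """Map each unordered pair of class names to its label-space distance."""
    n = len(class_names)
    return {
        (class_names[i], class_names[j]):
            metric_distance(one_hot(i, n), one_hot(j, n), metric)
        for i, j in combinations(range(n), 2)
    }


# Hypothetical classes: two similar animals and one unrelated object.
class_names = ["cat", "dog", "car"]

# Identity matrix: the usual Euclidean geometry of one-hot labels.
identity_metric = [[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]]

# Assumed symmetric positive-definite matrix; larger off-diagonal
# entries pull the corresponding classes closer together.
similarity_metric = [[1.0, 0.8, 0.1],
                     [0.8, 1.0, 0.1],
                     [0.1, 0.1, 1.0]]

euclidean = pairwise_distances(class_names, identity_metric)
adjusted = pairwise_distances(class_names, similarity_metric)

for pair in euclidean:
    print(pair, round(euclidean[pair], 3), round(adjusted[pair], 3))
```

For two basis vectors e_i and e_j, the squared distance reduces to G_ii + G_jj - 2 G_ij, so in this sketch the off-diagonal entries alone determine how far apart two classes sit. With the assumed values, "cat" and "dog" end up closer to each other than either is to "car", whereas the identity matrix keeps all three pairs equidistant.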

