LG · Feb 23, 2023

The Geometry of Mixability

arXiv:2302.11905v1 · 4 citations · h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses a foundational issue in online learning theory, offering a unified geometric perspective on mixability. It is an incremental advance because it builds on existing concepts.

The paper tackles the problem of characterizing mixable loss functions, which are crucial for fast learning rates in online prediction with expert advice, by providing a geometric condition for mixability using differential geometry. The result reconciles previous findings for binary and multi-class cases under general differentiability assumptions.

Mixable loss functions are of fundamental importance in the context of prediction with expert advice in the online setting since they characterize fast learning rates. By re-interpreting properness from the point of view of differential geometry, we provide a simple geometric characterization of mixability for the binary and multi-class cases: a proper loss function $\ell$ is $η$-mixable if and only if the superprediction set $\textrm{spr}(η\ell)$ of the scaled loss function $η\ell$ slides freely inside the superprediction set $\textrm{spr}(\ell_{\log})$ of the log loss $\ell_{\log}$, under fairly general assumptions on the differentiability of $\ell$. Our approach provides a way to treat some concepts concerning loss functions (like properness) in a "coordinate-free" manner and reconciles previous results obtained for mixable loss functions for the binary and the multi-class cases.
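To make the notion of $η$-mixability concrete, the following is a minimal numerical sketch for the binary case. It uses the classical definition rather than the paper's geometric construction: a loss is $η$-mixable if, for any weighted set of expert predictions, some single prediction incurs a loss no larger than the "mix loss" $-\frac{1}{η}\log\sum_i w_i e^{-η\ell(p_i, y)}$ on every outcome $y$. The function names, expert predictions, and weights are illustrative assumptions, not taken from the paper.

```python
import math

def log_loss(p, y):
    """Log loss of predicting probability p for outcome 1 when y in {0, 1} occurs."""
    return -math.log(p if y == 1 else 1.0 - p)

def mix_loss(loss, preds, weights, y, eta):
    """Mix loss: -(1/eta) * log(sum_i w_i * exp(-eta * loss(p_i, y)))."""
    total = sum(w * math.exp(-eta * loss(p, y)) for p, w in zip(preds, weights))
    return -math.log(total) / eta

def witnesses_mixability(loss, preds, weights, candidate, eta, tol=1e-12):
    """True if `candidate` is no worse than the mix loss on both outcomes."""
    return all(
        loss(candidate, y) <= mix_loss(loss, preds, weights, y, eta) + tol
        for y in (0, 1)
    )

# Hypothetical expert predictions and weights (assumed for illustration).
preds = [0.1, 0.6, 0.9]
weights = [0.5, 0.3, 0.2]

# For log loss at eta = 1, the weighted mean prediction matches the mix loss.
mean_pred = sum(w * p for p, w in zip(preds, weights))
ok_at_eta_1 = witnesses_mixability(log_loss, preds, weights, mean_pred, eta=1.0)

# At eta = 2, scan a grid of candidates; none satisfies both inequalities
# for these particular experts.
exists_at_eta_2 = any(
    witnesses_mixability(log_loss, preds, weights, c / 1000.0, eta=2.0)
    for c in range(1, 1000)
)

print(ok_at_eta_1, exists_at_eta_2)
```

The grid scan only probes one configuration of experts, so it illustrates the definition rather than proving or disproving mixability in general.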

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
