LG | Dec 10, 2022

How to select an objective function using information theory

arXiv:2212.06566v4 | 1 citation | h-index: 22
Originality: Highly original
AI Analysis

This work addresses a foundational issue for researchers and practitioners in fields such as climate modeling, where models have no single definite utility. Its contribution is a theoretical paradigm rather than an incremental improvement.

The paper tackles the problem of selecting objective functions in machine learning and scientific computing. It proposes an information-theoretic approach: choose the objective that maximizes information and minimizes uncertainty, measured in bits, rather than one tied to a specific utility.

In machine learning or scientific computing, model performance is measured with an objective function. But why choose one objective over another? Information theory gives one answer: To maximize the information in the model, select the objective function that represents the error in the fewest bits. To evaluate different objectives, transform them into likelihood functions. As likelihoods, their relative magnitude represents how strongly we should prefer one objective versus another, and the log of that relation represents the difference in their bit-length, as well as the difference in their uncertainty. In other words, prefer whichever objective minimizes the uncertainty. Under the information-theoretic paradigm, the ultimate objective is to maximize information (and minimize uncertainty), as opposed to any specific utility. We argue that this paradigm is well-suited to models that have many uses and no definite utility, like the large Earth system models used to understand the effects of climate change.
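
As a rough illustration of the comparison described in the abstract, the sketch below turns two common objectives into likelihoods and measures each in bits. The pairing is an assumption made here for illustration, not taken from the text above: mean squared error is treated as a zero-mean Gaussian likelihood and mean absolute error as a zero-mean Laplace likelihood, each with its scale parameter set to the maximum-likelihood value. The function names (`gaussian_bits`, `laplace_bits`, `compare_objectives`) are hypothetical. Because these are continuous densities, the absolute bit counts depend on the units of the errors; only the difference between objectives is meaningful.

```python
import math


def gaussian_bits(errors):
    """Negative log-likelihood, in bits, of the errors under a zero-mean
    Gaussian (assumed here to correspond to the MSE objective).
    The variance is set to its maximum-likelihood value, mean(e^2).
    Assumes at least one error is nonzero."""
    n = len(errors)
    var = sum(e * e for e in errors) / n
    nll_nats = 0.5 * n * (math.log(2 * math.pi * var) + 1)
    return nll_nats / math.log(2)  # convert nats to bits


def laplace_bits(errors):
    """Negative log-likelihood, in bits, of the errors under a zero-mean
    Laplace distribution (assumed here to correspond to the MAE objective).
    The scale is set to its maximum-likelihood value, mean(|e|).
    Assumes at least one error is nonzero."""
    n = len(errors)
    scale = sum(abs(e) for e in errors) / n
    nll_nats = n * (math.log(2 * scale) + 1)
    return nll_nats / math.log(2)  # convert nats to bits


def compare_objectives(errors):
    """Return the bit-length under each objective and the one that encodes
    the errors in fewer bits (that is, with less uncertainty)."""
    g = gaussian_bits(errors)
    lap = laplace_bits(errors)
    return {
        "gaussian_bits": g,
        "laplace_bits": lap,
        "difference_bits": abs(g - lap),
        "preferred": "MSE" if g < lap else "MAE",
    }


if __name__ == "__main__":
    # Hypothetical model errors with one large outlier.
    errors = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, 5.0]
    print(compare_objectives(errors))
```

With the heavy-tailed errors in the usage example, the Laplace likelihood needs fewer bits, so under these assumptions MAE would be preferred; for errors of uniform magnitude such as `[1, -1, 1, -1]`, the Gaussian likelihood is shorter and MSE would be preferred.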

