LG · ML · Feb 19, 2024

Bayesian Active Learning for Censored Regression

arXiv:2402.11973v1 · 3 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses a specific problem in active learning for researchers dealing with censored data, but it is incremental as it adapts an existing method to a new setting.

The paper tackles the challenge of applying Bayesian active learning to censored regression, where only clipped values of the targets are observed. It derives a new acquisition objective, C-BALD, together with a modelling approach to estimate it. The reported results show that C-BALD outperforms other Bayesian active learning methods across a range of datasets and models.

Bayesian active learning is based on information-theoretic approaches that focus on maximising the information that new observations provide about the model parameters. This is commonly done by maximising the Bayesian Active Learning by Disagreement (BALD) acquisition function. However, we highlight that it is challenging to estimate BALD when the new data points are subject to censorship, where only clipped values of the targets are observed. To address this, we derive the entropy and the mutual information for censored distributions and derive the BALD objective for active learning in censored regression ($\mathcal{C}$-BALD). We propose a novel modelling approach to estimate the $\mathcal{C}$-BALD objective and use it for active learning in the censored setting. Across a wide range of datasets and models, we demonstrate that $\mathcal{C}$-BALD outperforms other Bayesian active learning methods in censored regression.
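
To make the idea of a BALD score under censoring concrete, the following is a minimal sketch and not the paper's estimator. It assumes right-censoring at a known threshold `z` (the observation is $\min(y^*, z)$), a Gaussian likelihood, and a small set of posterior samples `(mus, sigmas)` for one candidate input; all function names are hypothetical. BALD is computed as the entropy of the marginal predictive minus the average entropy of each sample's predictive, where each entropy combines a density part below `z` with a point mass at `z`.

```python
import math
import numpy as np

# Hypothetical helpers: standard normal CDF/PDF without a SciPy dependency.
_erf = np.vectorize(math.erf, otypes=[float])

def norm_cdf(u):
    return 0.5 * (1.0 + _erf(np.asarray(u, dtype=float) / math.sqrt(2.0)))

def norm_pdf(u):
    u = np.asarray(u, dtype=float)
    return np.exp(-0.5 * u**2) / math.sqrt(2.0 * math.pi)

def _xlogx(p):
    # p * log(p) with the convention 0 * log(0) = 0.
    p = np.asarray(p, dtype=float)
    return np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)

def censored_gaussian_entropy(mu, sigma, z):
    """Entropy of min(y*, z) with y* ~ N(mu, sigma^2).

    Assumption: entropy is taken w.r.t. Lebesgue measure below z plus a
    point mass at z (closed form for the Gaussian case).
    """
    alpha = (z - mu) / sigma
    Phi = norm_cdf(alpha)
    # -integral_{-inf}^{z} p(y) log p(y) dy
    cont = Phi * np.log(sigma * math.sqrt(2.0 * math.pi)) \
        + 0.5 * (Phi - alpha * norm_pdf(alpha))
    # -P(y* >= z) log P(y* >= z)
    mass = -_xlogx(1.0 - Phi)
    return cont + mass

def censored_bald(mus, sigmas, z, n_grid=4001):
    """BALD score for one input from K posterior samples (mus[k], sigmas[k])."""
    mus = np.asarray(mus, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)

    # Expected entropy of the per-sample censored predictive.
    expected_cond = censored_gaussian_entropy(mus, sigmas, z).mean()

    # Entropy of the marginal (mixture) predictive: numerical integration
    # of the density part on a truncated grid, plus the point mass at z.
    lo = (mus - 10.0 * sigmas).min()
    hi = min(z, (mus + 10.0 * sigmas).max())
    cont = 0.0
    if hi > lo:
        grid = np.linspace(lo, hi, n_grid)
        dens = (norm_pdf((grid[:, None] - mus) / sigmas) / sigmas).mean(axis=1)
        f = -_xlogx(dens)
        cont = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(grid)))
    mass = float((1.0 - norm_cdf((z - mus) / sigmas)).mean())
    marginal = cont - float(_xlogx(mass))

    return marginal - float(expected_cond)

if __name__ == "__main__":
    # Two disagreeing posterior samples; lowering z hides the disagreement.
    for z in (3.0, 0.0, -5.0):
        print(z, censored_bald([-1.0, 1.0], [0.5, 0.5], z))
```

In this sketch the score shrinks as the threshold moves further into the predictive mass, since a censored observation reveals less about which posterior sample generated it. In an active learning loop one would evaluate such a score for every candidate input and query the highest-scoring one.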

