ML · AI · LG · Jun 18, 2021

On the benefits of maximum likelihood estimation for Regression and Forecasting

arXiv:2106.10370v2 · 14 citations
AI Analysis

This work addresses how to incorporate inductive biases and domain knowledge into loss functions, a problem relevant to machine learning researchers and practitioners. It appears incremental, since it builds on existing MLE methods.

The paper tackles the design of loss functions for regression and forecasting. It advocates maximum likelihood estimation (MLE) as an alternative to direct empirical risk minimization, shows that a fitted MLE model can output post-hoc estimators that optimize different target metrics, and proves better excess risk bounds than direct minimization of the target metric in two examples, Poisson and Pareto regression.

We advocate for a practical Maximum Likelihood Estimation (MLE) approach towards designing loss functions for regression and forecasting, as an alternative to the typical approach of direct empirical risk minimization on a specific target metric. The MLE approach is better suited to capture inductive biases such as prior domain knowledge in datasets, and can output post-hoc estimators at inference time that can optimize different types of target metrics. We present theoretical results to demonstrate that our approach is competitive with any estimator for the target metric under some general conditions. In two example practical settings, Poisson and Pareto regression, we show that our competitive results can be used to prove that the MLE approach has better excess risk bounds than directly minimizing the target metric. We also demonstrate empirically that our method instantiated with a well-designed general purpose mixture likelihood family can obtain superior performance for a variety of tasks across time-series forecasting and regression datasets with different data distributions.
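
To make the "fit once, choose the estimator at inference time" idea concrete, here is a minimal sketch for the Poisson regression setting mentioned in the abstract. It is not the authors' implementation: the log-link model, the Newton solver, and every function name (`fit_poisson_mle`, `poisson_quantile`, `predict`) are assumptions made for illustration. The model is fitted by maximizing the Poisson likelihood, and the prediction is then read off the fitted distribution according to the target metric: the mean for squared error, the median for absolute error, and the q-th quantile for quantile loss.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (fine for small rates)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def fit_poisson_mle(xs, ys, steps=50):
    """Fit y ~ Poisson(exp(w*x + b)) by Newton's method on the negative log-likelihood.

    Hypothetical helper for illustration; a single scalar feature is assumed.
    """
    w = 0.0
    b = math.log(max(sum(ys) / len(ys), 1e-8))  # start at the log of the sample mean
    for _ in range(steps):
        # Gradient and Hessian of NLL(w, b) = sum_i [lam_i - y_i * (w*x_i + b)].
        gw = gb = hww = hwb = hbb = 0.0
        for x, y in zip(xs, ys):
            lam = math.exp(w * x + b)
            gw += (lam - y) * x
            gb += lam - y
            hww += lam * x * x
            hwb += lam * x
            hbb += lam
        det = hww * hbb - hwb * hwb
        if det <= 0.0:
            break
        # Solve the 2x2 system H * step = g in closed form.
        dw = (hbb * gw - hwb * gb) / det
        db = (hww * gb - hwb * gw) / det
        w -= dw
        b -= db
        if abs(dw) + abs(db) < 1e-10:
            break
    return w, b


def poisson_quantile(lam, q):
    """Return the smallest k with P(Y <= k) >= q for Y ~ Poisson(lam)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    pmf = math.exp(-lam)  # P(Y = 0)
    cdf = pmf
    k = 0
    while cdf < q:
        k += 1
        pmf *= lam / k  # P(Y = k) from P(Y = k - 1)
        cdf += pmf
    return k


def predict(w, b, x, metric, q=0.5):
    """Post-hoc estimator: pick the statistic of the fitted distribution
    that minimizes the expected value of the chosen target metric."""
    lam = math.exp(w * x + b)
    if metric == "squared_error":
        return lam  # the mean minimizes expected squared error
    if metric == "absolute_error":
        return poisson_quantile(lam, 0.5)  # the median minimizes expected absolute error
    if metric == "quantile_loss":
        return poisson_quantile(lam, q)  # the q-th quantile minimizes expected quantile loss
    raise ValueError(f"unknown metric: {metric}")


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.random() for _ in range(2000)]
    # Synthetic data with assumed true parameters w = 1.0, b = 0.5.
    ys = [sample_poisson(math.exp(1.0 * x + 0.5), rng) for x in xs]
    w, b = fit_poisson_mle(xs, ys)
    for metric in ("squared_error", "absolute_error", "quantile_loss"):
        print(metric, predict(w, b, 0.8, metric, q=0.9))
```

The model is trained only once. Changing the target metric changes only the statistic extracted in `predict`, whereas direct empirical risk minimization would require retraining for each metric.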

