LG · AI · ML · Nov 10, 2022

Regression as Classification: Influence of Task Formulation on Neural Network Features

arXiv:2211.05641v2 · 39 citations · h-index: 108
Originality: Incremental advance
AI Analysis

This paper addresses a practical problem for machine learning practitioners by explaining performance differences between the regression and classification formulations of the same task. The advance is incremental, since it builds on existing theory of neural networks.

The paper investigates why reformulating regression as classification often yields better performance by analyzing the implicit bias of gradient-based optimization in two-layer ReLU networks. It provides theoretical evidence that regression and classification lead to different feature supports in one-dimensional data, with empirical results showing optimization difficulties for the square loss.

Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross entropy loss results in better performance. By focusing on two-layer ReLU networks, which can be fully characterized by measures over their feature space, we explore how the implicit bias induced by gradient-based optimization could partly explain the above phenomenon. We provide theoretical evidence that the regression formulation yields a measure whose support can differ greatly from that for classification, in the case of one-dimensional data. Our proposed optimal supports correspond directly to the features learned by the input layer of the network. The different nature of these supports sheds light on possible optimization difficulties the square loss could encounter during training, and we present empirical results illustrating this phenomenon.
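To make the two formulations in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' setup: the uniform binning of targets, the decoding rule (expected bin center), and all function names (`bin_targets`, `two_layer_relu`, `square_loss`, `cross_entropy_loss`, `decode`) are hypothetical choices made for this example.

```python
import numpy as np

def bin_targets(y, n_bins, y_min, y_max):
    """Map continuous targets to class labels using uniform bins (an assumed scheme)."""
    edges = np.linspace(y_min, y_max, n_bins + 1)
    # Use interior edges only, so labels fall in 0 .. n_bins - 1.
    labels = np.digitize(y, edges[1:-1])
    centers = 0.5 * (edges[:-1] + edges[1:])
    return labels, centers

def two_layer_relu(x, W, b, A):
    """Two-layer ReLU network on 1-D inputs.

    x: (n,) inputs, W: (m,) input weights, b: (m,) biases, A: (m, k) output weights.
    Returns an (n, k) array: k = 1 for regression, k = n_bins for classification.
    """
    hidden = np.maximum(0.0, np.outer(x, W) + b)
    return hidden @ A

def square_loss(pred, y):
    """Regression formulation: half the mean squared error."""
    return 0.5 * np.mean((pred - y) ** 2)

def cross_entropy_loss(logits, labels):
    """Classification formulation: mean softmax cross entropy."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(labels)), labels])

def decode(logits, centers):
    """Convert class logits back to a scalar prediction (expected bin center)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    return probs @ centers

# Same data and same hidden layer, two different output heads and losses.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=32)
y = np.sin(3.0 * x)                      # toy 1-D regression target
labels, centers = bin_targets(y, n_bins=10, y_min=-1.0, y_max=1.0)

m = 16                                   # hidden width
W, b = rng.normal(size=m), rng.normal(size=m)
A_reg = rng.normal(size=(m, 1)) / m      # regression head
A_cls = rng.normal(size=(m, 10)) / m     # classification head

loss_reg = square_loss(two_layer_relu(x, W, b, A_reg)[:, 0], y)
loss_cls = cross_entropy_loss(two_layer_relu(x, W, b, A_cls), labels)
y_hat_cls = decode(two_layer_relu(x, W, b, A_cls), centers)
```

Both heads share the input-layer parameters `W` and `b`, which play the role of the features discussed in the abstract; only the output head and the loss change between the two formulations. The sketch evaluates the losses at random initialization and does not train the network.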

Code Implementations: 1 repo
