LG ML | Oct 11, 2022

C-Mixup: Improving Generalization in Regression

Huaxiu Yao, Yiping Wang, Linjun Zhang, James Zou, Chelsea Finn

arXiv:2210.05775v126.496 citations | h-index: 93 | Has Code

Originality: Highly original

AI Analysis

The paper addresses the underexplored challenge of applying mixup to regression, which matters most in domains with limited data.

The paper aims to improve generalization in regression tasks by proposing C-Mixup, a variant of mixup that adjusts the probability of sampling a pair of examples based on the similarity of their labels. Evaluated on eleven datasets, the method improves on the best prior approach by 6.56%, 4.76%, and 5.82% in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively.

Improving the generalization of deep networks is an important open challenge, particularly in domains without plentiful data. The mixup algorithm improves generalization by linearly interpolating a pair of examples and their corresponding labels. These interpolated examples augment the original training set. Mixup has shown promising results in various classification tasks, but systematic analysis of mixup in regression remains underexplored. Using mixup directly on regression labels can result in arbitrarily incorrect labels. In this paper, we propose a simple yet powerful algorithm, C-Mixup, to improve generalization on regression tasks. In contrast with vanilla mixup, which picks training examples for mixing with uniform probability, C-Mixup adjusts the sampling probability based on the similarity of the labels. Our theoretical analysis confirms that C-Mixup with label similarity obtains a smaller mean square error in supervised regression and meta-regression than either vanilla mixup or mixup based on feature similarity. Another benefit of C-Mixup is that it can improve out-of-distribution robustness, where the test distribution is different from the training distribution. By selectively interpolating examples with similar labels, it mitigates the effects of domain-associated information and yields domain-invariant representations. We evaluate C-Mixup on eleven datasets, ranging from tabular to video data. Compared to the best prior approach, C-Mixup achieves 6.56%, 4.76%, and 5.82% improvements in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively. Code is released at https://github.com/huaxiuyao/C-Mixup.
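
The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' released code. The abstract says only that the sampling probability depends on label similarity, so the Gaussian kernel on label distance is an assumption, as are the `bandwidth` and `alpha` parameters and the function names `label_similarity_probs` and `c_mixup_pair`. The sketch uses only the Python standard library and expects at least two training examples.

```python
import math
import random


def label_similarity_probs(labels, i, bandwidth=1.0):
    """Return a sampling distribution over mixing partners for example i.

    Assumption: similarity is a Gaussian kernel on the label distance.
    Example i itself gets probability zero.
    """
    weights = [
        0.0 if j == i else math.exp(-((labels[i] - y) ** 2) / (2 * bandwidth ** 2))
        for j, y in enumerate(labels)
    ]
    total = sum(weights)
    if total == 0.0:
        # Every kernel weight underflowed: fall back to uniform sampling,
        # which is what vanilla mixup would do.
        n = len(labels) - 1
        return [0.0 if j == i else 1.0 / n for j in range(len(labels))]
    return [w / total for w in weights]


def c_mixup_pair(features, labels, i, alpha=2.0, bandwidth=1.0, rng=random):
    """Mix example i with a partner drawn according to label similarity.

    Returns the interpolated feature vector, the interpolated label,
    and the index of the sampled partner.
    """
    probs = label_similarity_probs(labels, i, bandwidth)
    j = rng.choices(range(len(labels)), weights=probs, k=1)[0]
    # Standard mixup interpolation coefficient, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1 - lam) * b for a, b in zip(features[i], features[j])]
    y_mix = lam * labels[i] + (1 - lam) * labels[j]
    return x_mix, y_mix, j
```

With a small `bandwidth`, partners whose labels are far from that of example `i` are rarely chosen, so the interpolated label stays close to the original. As `bandwidth` grows, the distribution approaches the uniform sampling of vanilla mixup.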
