LG · Jul 2, 2021

Mitigating deep double descent by concatenating inputs

John Chen, Qihan Wang, Anastasios Kyrillidis

arXiv:2107.00797v14.43 citations

Originality: Incremental advance

AI Analysis

This work addresses a theoretical challenge in deep learning that is of interest to researchers, though it appears incremental because it builds on existing work.

The authors tackle the deep double descent phenomenon in neural networks by proposing a dataset augmentation method that artificially increases the sample count. Empirically, the method mitigates the double descent curve and yields a smooth descent into the overparameterized region.

The double descent curve is one of the most intriguing properties of deep neural networks. It contrasts the classical bias-variance curve with the behavior of modern neural networks, occurring where the number of samples nears the number of parameters. In this work, we explore the connection between the double descent phenomenon and the number of samples in the deep neural network setting. In particular, we propose a construction which augments the existing dataset by artificially increasing the number of samples. This construction empirically mitigates the double descent curve in this setting. We reproduce existing work on deep double descent, and observe a smooth descent into the overparameterized region for our construction. This occurs both with respect to the model size and with respect to the number of epochs.
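The abstract does not spell out the construction; the title indicates only that it works by concatenating inputs. The following is a minimal sketch of one way such an augmentation could look, not the authors' confirmed method. It assumes that pairs of samples are drawn at random, that their feature vectors are concatenated, and that their one-hot labels are averaged. The function name `concat_pairs` and the parameters `num_pairs` and `seed` are hypothetical.

```python
import numpy as np

def concat_pairs(x, y_onehot, num_pairs, seed=0):
    """Build an augmented dataset by concatenating randomly chosen input pairs.

    Assumption (not confirmed by the abstract): the label of a concatenated
    pair is the average of the two one-hot labels.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Sample index pairs with replacement; up to n * n distinct pairs exist.
    i = rng.integers(0, n, size=num_pairs)
    j = rng.integers(0, n, size=num_pairs)
    # Concatenate along the feature axis: (num_pairs, 2 * d).
    x_aug = np.concatenate([x[i], x[j]], axis=1)
    # Average the one-hot labels into a soft target over the same classes.
    y_aug = 0.5 * (y_onehot[i] + y_onehot[j])
    return x_aug, y_aug

# Toy data: 4 samples, 3 features, 2 classes.
x = np.arange(12, dtype=float).reshape(4, 3)
y = np.eye(2)[[0, 1, 1, 0]]
x_aug, y_aug = concat_pairs(x, y, num_pairs=10)
```

Under these assumptions, a dataset of n samples admits up to n * n concatenated pairs, which is how the sample count can grow without collecting new data. The model's input layer would need to accept the doubled feature dimension.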

