Unveiling Multiple Descents in Unsupervised Autoencoders
This work addresses the problem of understanding generalization in unsupervised learning. It is aimed at researchers and practitioners, and it reports new phenomena that challenge traditional views and have practical implications for real-world applications.
The study examines double descent in unsupervised learning. It demonstrates that double descent does not occur in linear autoencoders, and it shows for the first time that double and triple descent can be observed in nonlinear autoencoders across various data models and architectures. Over-parameterized models improve both reconstruction and downstream task performance.
The phenomenon of double descent has challenged the traditional bias-variance trade-off in supervised learning but remains largely unexplored in unsupervised learning, with some studies arguing for its absence. In this study, we first demonstrate analytically that double descent does not occur in linear unsupervised autoencoders (AEs). In contrast, we show for the first time that both double and triple descent can be observed with nonlinear AEs across various data models and architectural designs. We examine the effects of partial sample and feature noise and highlight the importance of bottleneck size in influencing the double descent curve. Through extensive experiments on both synthetic and real datasets, we uncover model-wise, epoch-wise, and sample-wise double descent across several data types and architectures. Our findings indicate that over-parameterized models not only improve reconstruction but also enhance performance in downstream tasks such as anomaly detection and domain adaptation, highlighting their practical value in complex real-world scenarios.
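To make the notion of a model-wise sweep concrete, the following is a minimal sketch of how test reconstruction error can be recorded as the hidden width of a nonlinear AE grows. Everything specific here is an assumption made for illustration and not the setup used in the study: the toy data model (a low-rank signal plus Gaussian feature noise), the one-hidden-layer ReLU architecture, the function names, the widths, and the optimizer settings. The sketch shows only the measurement loop and makes no claim about the shape of the resulting curve.

```python
import numpy as np


def train_autoencoder(x_train, hidden, epochs=300, lr=0.01, seed=0):
    """Train a one-hidden-layer ReLU autoencoder with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = x_train.shape
    w1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, hidden))       # encoder weights
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, d))  # decoder weights
    for _ in range(epochs):
        pre = x_train @ w1                 # pre-activations, shape (n, hidden)
        h = np.maximum(pre, 0.0)           # ReLU hidden code
        out = h @ w2                       # reconstruction, shape (n, d)
        grad_out = 2.0 * (out - x_train) / (n * d)  # gradient of the mean squared error
        grad_w2 = h.T @ grad_out
        grad_pre = (grad_out @ w2.T) * (pre > 0)    # backpropagate through ReLU
        grad_w1 = x_train.T @ grad_pre
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return w1, w2


def reconstruction_mse(x, w1, w2):
    """Mean squared reconstruction error of the autoencoder on x."""
    return float(np.mean((np.maximum(x @ w1, 0.0) @ w2 - x) ** 2))


def model_wise_sweep(widths, n_train=50, n_test=500, d=20, k=3, noise=0.5, seed=0):
    """Return {width: (train_mse, test_mse)} for each hidden width in widths."""
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(k, d))  # shared low-rank structure (assumed toy model)

    def sample(n):
        return rng.normal(size=(n, k)) @ basis + noise * rng.normal(size=(n, d))

    x_train, x_test = sample(n_train), sample(n_test)
    results = {}
    for width in widths:
        w1, w2 = train_autoencoder(x_train, width, seed=seed)
        results[width] = (
            reconstruction_mse(x_train, w1, w2),
            reconstruction_mse(x_test, w1, w2),
        )
    return results


if __name__ == "__main__":
    for width, (train_mse, test_mse) in model_wise_sweep([2, 8, 32, 128]).items():
        print(f"width={width:4d}  train_mse={train_mse:.4f}  test_mse={test_mse:.4f}")
```

Plotting the test error against width gives the model-wise curve. An epoch-wise or sample-wise sweep would follow the same pattern, varying `epochs` or `n_train` instead of the width.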