How Good is a Single Basin?
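This work addresses a theoretical question in deep learning and is aimed at researchers: can the knowledge that deep ensembles gather from multiple basins be found within a single basin? It shows that extra-basin knowledge is at least partially present in a single basin but cannot be easily harnessed without learning it from other basins, an incremental contribution to the understanding of ensemble methods.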
The study investigates whether the multi-modal nature of neural loss landscapes is essential to deep ensembles by constructing connected ensembles restricted to a single basin. Increased connectivity reduces performance, but distillation mitigates the gap by re-discovering multi-basin knowledge within a single basin.
The multi-modal nature of neural loss landscapes is often considered to be the main driver behind the empirical success of deep ensembles. In this work, we probe this belief by constructing various "connected" ensembles which are restricted to lie in the same basin. Through our experiments, we demonstrate that increased connectivity indeed negatively impacts performance. However, when incorporating the knowledge from other basins implicitly through distillation, we show that the gap in performance can be mitigated by re-discovering (multi-basin) deep ensembles within a single basin. Thus, we conjecture that while the extra-basin knowledge is at least partially present in any given basin, it cannot be easily harnessed without learning it from other basins.
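To make the distillation step concrete, the following is a minimal sketch of how a single model could be trained to match the averaged predictions of a multi-basin deep ensemble. The abstract does not specify the exact objective, so the KL-divergence loss, the probability averaging, and all function names here (softmax, ensemble_probs, distillation_loss) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def ensemble_probs(member_logits):
    # member_logits has shape (n_members, n_examples, n_classes).
    # Assumption: the ensemble prediction is the mean of member probabilities.
    return softmax(member_logits).mean(axis=0)

def distillation_loss(student_logits, teacher_probs, eps=1e-12):
    # Mean KL(teacher || student) over examples (hypothetical choice of objective).
    student_log_probs = np.log(softmax(student_logits) + eps)
    teacher_log_probs = np.log(teacher_probs + eps)
    kl = (teacher_probs * (teacher_log_probs - student_log_probs)).sum(axis=-1)
    return kl.mean()

# Toy data: 3 ensemble members (standing in for models from different basins),
# 5 examples, 4 classes. Random logits replace real model outputs.
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(3, 5, 4))
teacher = ensemble_probs(member_logits)

# A student with unrelated logits versus one that reproduces the teacher exactly.
loss_random = distillation_loss(rng.normal(size=(5, 4)), teacher)
loss_match = distillation_loss(np.log(teacher), teacher)
```

In this toy setup the loss is close to zero when the student reproduces the ensemble's averaged probabilities and positive otherwise. In the setting described above, the student would additionally be constrained to stay within a single basin, which this sketch does not model.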