Boxhead: A Dataset for Learning Hierarchical Representations
The work addresses a limitation of current disentanglement evaluations, which rarely reflect the hierarchical structure of real-world data, though the contribution is incremental because it is limited to dataset creation and benchmarking.
The authors tackled the problem of evaluating disentanglement methods in hierarchical settings by introducing the Boxhead dataset, which has hierarchical ground-truth generative factors, and found that hierarchical models generally outperform single-layer VAEs in disentangling these factors.
Disentanglement is hypothesized to be beneficial for a number of downstream tasks. However, a common assumption in learning disentangled representations is that the generative factors of the data are statistically independent. As current methods are almost solely evaluated on toy datasets where this ideal assumption holds, we investigate their performance in hierarchical settings, a relevant feature of real-world data. In this work, we introduce Boxhead, a dataset with hierarchically structured ground-truth generative factors. We use this novel dataset to evaluate the performance of state-of-the-art autoencoder-based disentanglement models and observe that hierarchical models generally outperform single-layer VAEs in terms of disentanglement of hierarchically arranged factors.
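To make the contrast between independent and hierarchically arranged factors concrete, the following is a minimal toy sketch, not the Boxhead generative process. The two factor names (`parent`, `child`), their ranges, and the rule that the parent bounds the child are all hypothetical and chosen only for illustration. In the independent case the two factors are sampled separately; in the hierarchical case the admissible range of the child depends on the sampled parent, so the independence assumption no longer holds.

```python
import numpy as np


def sample_independent(rng, n):
    """Sample two factors independently (the usual toy-dataset assumption)."""
    parent = rng.uniform(0.0, 1.0, n)  # hypothetical higher-level factor
    child = rng.uniform(0.0, 1.0, n)   # hypothetical lower-level factor
    return np.stack([parent, child], axis=1)


def sample_hierarchical(rng, n):
    """Sample a child factor whose range is bounded by its parent.

    This dependence rule is an assumption made for illustration only.
    """
    parent = rng.uniform(0.0, 1.0, n)
    # The child is drawn uniformly from [0, parent], so it depends on the parent.
    child = rng.uniform(0.0, 1.0, n) * parent
    return np.stack([parent, child], axis=1)


def factor_correlation(factors):
    """Pearson correlation between the two factor columns."""
    return float(np.corrcoef(factors[:, 0], factors[:, 1])[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("independent: ", factor_correlation(sample_independent(rng, 20000)))
    print("hierarchical:", factor_correlation(sample_hierarchical(rng, 20000)))
```

In this sketch the independent factors are close to uncorrelated, while the hierarchical factors are clearly correlated. Correlation is only a crude proxy for statistical dependence, but it is enough to show that the independence assumption fails once one factor constrains another.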