Blocked and Hierarchical Disentangled Representation From Information Theory Perspective
This work addresses the challenge of interpretable feature learning in machine learning, offering a theoretical model that could improve representation learning for tasks like classification, though it appears incremental as it builds on existing variational autoencoder frameworks.
The authors tackled the problem of learning disentangled representations by proposing a blocked and hierarchical variational autoencoder (BHiVAE) based on information theory principles, achieving excellent disentanglement results and superior classification accuracy in experiments.
We propose a novel theoretical model, the blocked and hierarchical variational autoencoder (BHiVAE), to learn better disentangled representations. Information theory is well known to offer a good explanatory account of neural networks, so we approach the disentanglement problem from an information-theoretic perspective. BHiVAE derives mainly from information bottleneck theory and the information maximization principle. Our main ideas are: (1) each attribute is represented by a block of neurons rather than a single neuron node, so that the block can contain enough information; and (2) the model has a hierarchical structure with different attributes on different layers, so that we can segment the information within each layer to ensure that the final representation is disentangled. Furthermore, we present supervised and unsupervised versions of BHiVAE, which differ mainly in how information is separated between blocks. In supervised BHiVAE, we use the label information as the standard for separating blocks. In unsupervised BHiVAE, without extra information, we use the Total Correlation (TC) measure to achieve independence, and we design a new prior distribution over the latent space to guide representation learning. In experiments, BHiVAE exhibits excellent disentanglement results and superior classification accuracy in representation learning.
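To illustrate what a Total Correlation penalty between blocks measures, here is a minimal sketch that assumes a zero-mean Gaussian latent code. The Gaussian assumption and the helper names `log_det` and `gaussian_block_tc` are hypothetical choices made for illustration; the abstract does not say how BHiVAE estimates TC. For a Gaussian, the TC between blocks is the sum of the block entropies minus the joint entropy, which reduces to half the difference between the summed log-determinants of the block covariances and the log-determinant of the full covariance.

```python
import math


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    # det(A) = prod(diag(L))**2, so log det(A) = 2 * sum(log(diag(L))).
    return 2.0 * sum(math.log(lower[i][i]) for i in range(n))


def gaussian_block_tc(cov, block_sizes):
    """Total correlation (in nats) between contiguous blocks of a Gaussian.

    TC = sum_b H(z_b) - H(z) = 0.5 * (sum_b log det(cov_bb) - log det(cov)).
    Assumption for illustration only: the latent code is Gaussian.
    """
    if sum(block_sizes) != len(cov):
        raise ValueError("block sizes must sum to the latent dimension")
    block_log_dets = 0.0
    start = 0
    for size in block_sizes:
        stop = start + size
        sub = [row[start:stop] for row in cov[start:stop]]
        block_log_dets += log_det(sub)
        start = stop
    return 0.5 * (block_log_dets - log_det(cov))


# Two blocks of two neurons each. Correlation inside a block is allowed.
independent = [
    [1.0, 0.3, 0.0, 0.0],
    [0.3, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
]
# Same code, but neuron 0 (block 1) now correlates with neuron 2 (block 2).
entangled = [row[:] for row in independent]
entangled[0][2] = entangled[2][0] = 0.5

tc_independent = gaussian_block_tc(independent, [2, 2])
tc_entangled = gaussian_block_tc(entangled, [2, 2])
tc_per_neuron = gaussian_block_tc(independent, [1, 1, 1, 1])
```

In this sketch the block-level TC of `independent` is zero up to rounding even though neurons inside a block are correlated, while treating each neuron as its own block (`tc_per_neuron`) gives a positive value. That contrast matches the block-based view described above: independence is required between blocks, not between the individual neurons within one block.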