CV LG ML · Jul 7, 2020

Hierarchical nucleation in deep neural networks

Diego Doimo, Aldo Glielmo, Alessio Ansuini, Alessandro Laio

arXiv:2007.03506v212.435 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work examines how deep neural networks build hierarchical representations layer by layer, a question that bears directly on model interpretability and on understanding the networks' internal mechanisms.

The study analyzes how the probability density of the ImageNet dataset evolves across the hidden layers of state-of-the-art deep convolutional networks. The initial layers produce a unimodal density by removing structure irrelevant for classification. Subsequent layers develop density peaks in a hierarchy that mirrors the semantic hierarchy of the categories, and peaks for single categories appear only near the output, through a sharp transition that resembles nucleation.

Deep convolutional networks (DCNs) learn meaningful representations in which data that share the same abstract characteristics are positioned progressively closer together. Understanding these representations and how they are generated is of unquestioned practical and theoretical interest. In this work we study the evolution of the probability density of the ImageNet dataset across the hidden layers in some state-of-the-art DCNs. We find that the initial layers generate a unimodal probability density, getting rid of any structure irrelevant for classification. In subsequent layers, density peaks arise in a hierarchical fashion that mirrors the semantic hierarchy of the concepts. Density peaks corresponding to single categories appear only close to the output, via a very sharp transition which resembles the nucleation process of a heterogeneous liquid. This process leaves a footprint in the probability density of the output layer, where the topography of the peaks allows reconstructing the semantic relationships of the categories.
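
The abstract's statement that data sharing the same characteristics end up positioned closer together can be illustrated with a simple nearest-neighbour label-agreement score. The following is a minimal sketch on synthetic data, not the authors' density-peak analysis: the function name `neighbor_label_agreement`, the choice `k=5`, and the two toy "layers" are assumptions made purely for illustration.

```python
import numpy as np


def neighbor_label_agreement(acts, labels, k=5):
    """Mean fraction of each point's k nearest neighbours that share its label.

    Hypothetical helper for illustration; `acts` is an (n_points, n_features)
    array of activations and `labels` holds one class label per point.
    """
    acts = np.asarray(acts, dtype=float)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all activation vectors.
    sq = np.sum(acts ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * acts @ acts.T
    # Exclude each point from its own neighbour list.
    np.fill_diagonal(d2, np.inf)
    nearest = np.argsort(d2, axis=1)[:, :k]
    same_label = labels[nearest] == labels[:, None]
    return float(same_label.mean())


# Toy stand-ins for two layers (assumed data, not ImageNet activations).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
noise = rng.normal(size=(100, 8))

# "Shallow" layer: both classes drawn from the same distribution.
shallow_acts = noise
# "Deep" layer: class 1 shifted far away from class 0 along every feature.
deep_acts = noise + 20.0 * labels[:, None]

shallow = neighbor_label_agreement(shallow_acts, labels)
deep = neighbor_label_agreement(deep_acts, labels)
print(f"shallow agreement: {shallow:.2f}, deep agreement: {deep:.2f}")
```

A score near 0.5 means neighbours are unrelated to the two classes, while a score of 1.0 means every point's nearest neighbours belong to its own class. Applying the same function to real hidden-layer activations would require extracting those activations first, which this sketch does not cover.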
