CV, Mar 12, 2019

Paradox in Deep Neural Networks: Similar yet Different while Different yet Similar

arXiv:1903.04772v1, 2 citations
Originality: Incremental advance
AI Analysis

The paper reveals a paradox in neural network behavior that challenges assumptions about model similarity, with implications for transfer learning and for understanding artificial intelligence.

The paper investigates the relationship between kernel weight similarity and performance in deep neural networks, finding that networks with over 99.9% correlated weights can have significantly different performances, while uncorrelated networks can achieve similar performance levels.

Machine learning is advancing towards a data-science approach, which calls for a line of investigation that uncovers the knowledge learnt by deep neural networks. Comparing networks only on a predefined intelligent ability, measured against ground truth, does not suffice; the comparison should also take into account the innate similarity of these artificial entities. Here, we analysed multiple instances of an identical architecture trained to classify objects in static images (CIFAR and ImageNet data sets). We evaluated the performance of the networks under various distortions and compared it to the intrinsic similarity between their constituent kernels. While we expected a close correspondence between these two measures, we observed a puzzling phenomenon. Pairs of networks whose kernels' weights are over 99.9% correlated can exhibit significantly different performances, yet other pairs with no correlation can reach quite compatible levels of performance. We show the implications of this for transfer learning, and argue for its importance in our general understanding of what intelligence is, whether natural or artificial.
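
To make the notion of kernel-weight correlation concrete, the following is a minimal sketch of one way such a similarity could be measured. It assumes the Pearson correlation over all kernel weights, flattened and concatenated layer by layer; the abstract does not specify the exact measure, so this choice, the helper name `kernel_correlation`, and the randomly generated stand-in "networks" are all illustrative assumptions rather than the authors' method.

```python
import numpy as np


def kernel_correlation(weights_a, weights_b):
    """Pearson correlation between two networks' kernel weights.

    Hypothetical helper: each argument is a list of per-layer weight arrays
    from two instances of the same architecture. The layers are flattened
    and concatenated before the correlation is computed.
    """
    a = np.concatenate([np.ravel(w) for w in weights_a])
    b = np.concatenate([np.ravel(w) for w in weights_b])
    if a.shape != b.shape:
        raise ValueError("networks must share the same architecture")
    return float(np.corrcoef(a, b)[0, 1])


# Stand-in weights (assumption: random arrays, not trained networks).
rng = np.random.default_rng(0)
layer_shapes = [(16, 3, 3, 3), (32, 16, 3, 3)]

net_a = [rng.standard_normal(s) for s in layer_shapes]
# net_b is net_a plus a small perturbation, so its weights stay almost identical.
net_b = [w + 0.01 * rng.standard_normal(w.shape) for w in net_a]
# net_c is drawn independently, so its weights are essentially uncorrelated with net_a.
net_c = [rng.standard_normal(s) for s in layer_shapes]

corr_near = kernel_correlation(net_a, net_b)
corr_far = kernel_correlation(net_a, net_c)
print(f"perturbed copy: {corr_near:.5f}")
print(f"independent draw: {corr_far:.5f}")
```

A high value of `corr_near` says only that the weights are nearly identical. The paper's observation is that such a value does not by itself guarantee similar accuracy under distortions, so in practice this number would be reported alongside a separate performance evaluation of each network.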

