Hannah Pinson

LG
3 papers
9 citations

3 Papers

CV · Mar 3, 2023
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies

Hannah Pinson, Joeri Lenaerts, Vincent Ginis

We here present a stepping stone towards a deeper understanding of convolutional neural networks (CNNs) in the form of a theory of learning in linear CNNs. Through analyzing the gradient descent equations, we discover that the evolution of the network during training is determined by the interplay between the dataset structure and the convolutional network structure. We show that linear CNNs discover the statistical structure of the dataset with non-linear, ordered, stage-like transitions, and that the speed of discovery changes depending on the relationship between the dataset and the convolutional network structure. Moreover, we find that this interplay lies at the heart of what we call the "dominant frequency bias", where linear CNNs arrive at these discoveries using only the dominant frequencies of the different structural parts present in the dataset. We furthermore provide experiments that show how our theory relates to deep, non-linear CNNs used in practice. Our findings shed new light on the inner workings of CNNs, and can help explain their shortcut learning and their tendency to rely on texture instead of shape.

LG · Feb 4
It's not a Lottery, it's a Race: Understanding How Gradient Descent Adapts the Network's Capacity to the Task

Hannah Pinson

Our theoretical understanding of neural networks is lagging behind their empirical success. One of the important unexplained phenomena is why and how, during the process of training with gradient descent, the theoretical capacity of neural networks is reduced to an effective capacity that fits the task. We here investigate the mechanism by which gradient descent achieves this through analyzing the learning dynamics at the level of individual neurons in single-hidden-layer ReLU networks. We identify three dynamical principles (mutual alignment, unlocking and racing) that together explain why we can often successfully reduce capacity after training through the merging of equivalent neurons or the pruning of low-norm weights. We specifically explain the mechanism behind the lottery ticket conjecture, or why the specific, beneficial initial conditions of some neurons lead them to obtain higher weight norms.
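
As a rough illustration of the two capacity-reduction operations named in the abstract above (merging equivalent neurons and pruning low-norm weights), the sketch below applies them to a toy single-hidden-layer ReLU network. It is not the paper's procedure: the function names (`merge_aligned`, `prune_low_norm`), the alignment tolerance, the pruning threshold, and the example weights are all assumptions made for illustration. Merging relies only on the positive homogeneity of ReLU, relu(c z) = c relu(z) for c > 0, so neurons whose (weight, bias) vectors point in the same direction can be combined without changing the network output.

```python
import math

def relu(z):
    return max(0.0, z)

def forward(neurons, x):
    """Output of a single-hidden-layer ReLU net; neurons = [(w, b, a), ...]
    with input weights w (tuple), bias b and output weight a."""
    return sum(a * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
               for w, b, a in neurons)

def merge_aligned(neurons, tol=1e-9):
    """Merge neurons whose (w, b) vectors point in the same direction.
    Each neuron is rescaled to unit (w, b) norm, and the norm is moved
    into the output weight; aligned neurons then just add output weights."""
    merged = []  # entries: [unit_w, unit_b, effective_output_weight]
    for w, b, a in neurons:
        n = math.sqrt(sum(wi * wi for wi in w) + b * b)
        if n == 0.0:
            continue  # relu(0) = 0, so this neuron contributes nothing
        u, ub = tuple(wi / n for wi in w), b / n
        for m in merged:
            same = all(abs(p - q) <= tol
                       for p, q in zip(m[0] + (m[1],), u + (ub,)))
            if same:
                m[2] += a * n
                break
        else:
            merged.append([u, ub, a * n])
    return [(u, ub, a) for u, ub, a in merged]

def prune_low_norm(neurons, threshold):
    """Drop unit-norm neurons whose effective output weight is tiny.
    (Hypothetical criterion; assumes merge_aligned was applied first.)"""
    return [(w, b, a) for w, b, a in neurons if abs(a) >= threshold]

# Toy network: neurons 1 and 2 are aligned, neuron 3 has a very small norm.
network = [
    ((1.0, 0.0), 0.0, 0.5),
    ((2.0, 0.0), 0.0, 0.25),
    ((0.0, 1.0), 0.5, 1e-4),
    ((-1.0, 1.0), 0.0, 1.0),
]
merged = merge_aligned(network)
pruned = prune_low_norm(merged, threshold=1e-3)

x = (0.3, 0.7)
print(len(network), len(merged), len(pruned))
print(forward(network, x), forward(merged, x), forward(pruned, x))
```

Merging leaves the output unchanged up to floating-point rounding, while pruning introduces an error bounded by the removed neurons' effective weights times their activations.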

LG · Dec 21, 2021
Data driven design of optical resonators

Joeri Lenaerts, Hannah Pinson, Vincent Ginis

Optical devices lie at the heart of most of the technology we see around us. When one actually wants to make such an optical device, one can predict its optical behavior using computational simulations of Maxwell's equations. If one then asks what the optimal design would be in order to obtain a certain optical behavior, the only way to go further would be to try out all of the possible designs and compute the electromagnetic spectrum they produce. When there are many design parameters, this brute-force approach quickly becomes too computationally expensive. We therefore need other methods to create optimal optical devices. An alternative to the brute-force approach is inverse design. In this paradigm, one starts from the desired optical response of a material and then determines the design parameters that are needed to obtain this optical response. There are many algorithms known in the literature that implement this inverse design. Some of the best-performing recent approaches are based on Deep Learning. The central idea is to train a neural network to predict the optical response for given design parameters. Since neural networks are completely differentiable, we can compute gradients of the response with respect to the design parameters. We can use these gradients to update the design parameters and get an optical response closer to the one we want. This allows us to obtain an optimal design much faster than with the brute-force approach. In my thesis, I use Deep Learning for the inverse design of the Fabry-Pérot resonator. This system can be described fully analytically and is therefore ideal to study.
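
The gradient-based update loop described in this abstract can be sketched on the Fabry-Pérot resonator itself. The sketch below is not the thesis implementation. In place of a trained neural network it uses the textbook Airy transmission formula for an ideal lossless cavity at normal incidence, T = 1 / (1 + F sin²(δ/2)) with δ = 4πd/λ and F = 4R/(1-R)², and a hand-derived gradient instead of automatic differentiation. The single design parameter (optical thickness `d`), the mirror reflectivity, the wavelength band, the learning rate, and the starting guess are all assumptions chosen for illustration. Because the spectrum is periodic in `d`, the loss has many local minima, so the starting guess is deliberately placed close to the target.

```python
import math

def transmission(d, lam, R=0.5):
    """Airy transmission of an ideal Fabry-Perot cavity (normal incidence).
    d: optical thickness n*L, lam: wavelength (same units), R: mirror reflectivity."""
    F = 4.0 * R / (1.0 - R) ** 2          # coefficient of finesse
    delta = 4.0 * math.pi * d / lam       # round-trip phase
    return 1.0 / (1.0 + F * math.sin(delta / 2.0) ** 2)

def d_transmission_dd(d, lam, R=0.5):
    """Analytic derivative dT/dd (stand-in for backprop through a surrogate)."""
    F = 4.0 * R / (1.0 - R) ** 2
    delta = 4.0 * math.pi * d / lam
    T = transmission(d, lam, R)
    # dT/d(delta) = -T^2 * F * sin(delta) / 2, and d(delta)/dd = 4*pi/lam
    return -T * T * F * math.sin(delta) / 2.0 * (4.0 * math.pi / lam)

def loss_and_grad(d, wavelengths, target, R=0.5):
    """Mean squared error between the spectrum at d and the target, plus its gradient."""
    loss, grad = 0.0, 0.0
    for lam, t in zip(wavelengths, target):
        err = transmission(d, lam, R) - t
        loss += err * err
        grad += 2.0 * err * d_transmission_dd(d, lam, R)
    n = len(wavelengths)
    return loss / n, grad / n

def inverse_design(d_init, wavelengths, target, lr=5e-4, steps=2000):
    """Plain gradient descent on the design parameter d."""
    d = d_init
    for _ in range(steps):
        _, g = loss_and_grad(d, wavelengths, target)
        d -= lr * g
    return d

# Hypothetical setup: wavelengths 1.3 to 1.7 (micrometres), target spectrum
# generated from a known optical thickness so the result can be checked.
wavelengths = [1.3 + 0.005 * i for i in range(81)]
d_true = 10.0
target = [transmission(d_true, lam) for lam in wavelengths]

d_start = 10.05
d_found = inverse_design(d_start, wavelengths, target)
loss_start, _ = loss_and_grad(d_start, wavelengths, target)
loss_found, _ = loss_and_grad(d_found, wavelengths, target)
print(d_start, loss_start)
print(d_found, loss_found)
```

With a learned surrogate, `transmission` would be replaced by the network's forward pass and `d_transmission_dd` by its automatically computed gradient; the update loop itself would stay the same.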