LG ML | Jul 13, 2018

Neural Networks Regularization Through Representation Learning

arXiv:1807.05292v11.51 citations | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses overfitting in neural networks for applications with scarce data, such as medical imaging, but is incremental as it builds on existing regularization techniques.

The thesis tackles neural network overfitting with limited training data by proposing three representation learning approaches: structured output regression, classification using prior knowledge about hidden-layer representations, and transfer learning, the last applied to localizing the third lumbar vertebra in 3D CT scans.

Neural networks and deep models are among the leading state-of-the-art models in machine learning. The most successful deep neural models have many layers, which greatly increases their number of parameters. Training such models requires a large number of training samples, which are not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. This problem often occurs when large models are trained on few training samples. Many approaches have been proposed to prevent the network from overfitting and to improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, and batch normalization. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective, considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first, presented in chapter 2, deals with structured output problems: it performs multivariate regression when the output variable y contains structural dependencies between its components. The second, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. Our last contribution, presented in chapter 4, shows the value of transfer learning in applications where only a few samples are available. In this contribution, we provide an automatic system based on this learning scheme, with an application to the medical domain: the task consists of localizing the third lumbar vertebra in a 3D CT scan. This work was done in collaboration with the Rouen Henri Becquerel Center clinic, which provided us with the data.
