OC LG NE Apr 1, 2020

Fractional Deep Neural Network via Constrained Optimization

Harbir Antil, Ratna Khatri, Rainald Löhner, Deepanshu Verma

arXiv:2004.00719v1 14.333 citations

Originality: Highly original

AI Analysis

The paper targets two problems facing deep learning practitioners, vanishing gradients and the handling of nonsmooth data, and proposes a new method rather than an incremental improvement.

The paper addresses vanishing gradients and nonsmooth data in deep neural networks by introducing Fractional-DNN, a framework that builds memory into every layer through a fractional ODE constraint. The authors test it on several classification datasets.

This paper introduces a novel algorithmic framework for a deep neural network (DNN) which, in a mathematically rigorous manner, allows us to incorporate history (or memory) into the network: it ensures that all layers are connected to one another. This DNN, called Fractional-DNN, can be viewed as a time discretization of a fractional-in-time nonlinear ordinary differential equation (ODE). The learning problem is then a minimization problem subject to that fractional ODE as a constraint. We emphasize that the analogy between existing DNNs and ODEs with a standard time derivative is well known by now; the focus of our work is the Fractional-DNN. Using the Lagrangian approach, we derive the backward propagation and the design equations. We test our network on several datasets for classification problems. Fractional-DNN offers various advantages over existing DNNs. The key benefits are a significant improvement in the vanishing gradient issue, due to the memory effect, and better handling of nonsmooth data, due to the network's ability to approximate nonsmooth functions.
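To make the "memory across all layers" idea concrete, here is a minimal sketch of a forward pass obtained by discretizing a fractional-in-time ODE. It assumes a Caputo derivative of order gamma in (0, 1] and the standard L1 finite-difference scheme; the abstract does not state which discretization the authors use, so this is an illustration under those assumptions and not necessarily their scheme. The names `l1_weights` and `fractional_forward`, the tanh activation, and the dense layer form `K @ u + b` are hypothetical choices made for this example.

```python
import math
import numpy as np


def l1_weights(gamma, n):
    """L1-scheme weights a_m = (m+1)^(1-gamma) - m^(1-gamma) for m = 0..n-1."""
    m = np.arange(n, dtype=float)
    return (m + 1.0) ** (1.0 - gamma) - m ** (1.0 - gamma)


def fractional_forward(u0, weights, biases, gamma=0.5, tau=1.0, activation=np.tanh):
    """Forward pass of a network whose layers discretize a fractional ODE.

    Assumed update (L1 scheme for a Caputo derivative, explicit in the
    right-hand side):
        u_{j+1} = u_j - sum_{k<j} a_{j-k} (u_{k+1} - u_k)
                  + tau^gamma * Gamma(2 - gamma) * activation(K_j u_j + b_j)
    Returns the list of all layer states [u_0, ..., u_N].
    """
    a = l1_weights(gamma, len(weights))
    scale = tau ** gamma * math.gamma(2.0 - gamma)
    states = [np.asarray(u0, dtype=float)]
    for j, (K, b) in enumerate(zip(weights, biases)):
        # Memory term: every earlier layer increment feeds into layer j + 1.
        memory = np.zeros_like(states[0])
        for k in range(j):
            memory += a[j - k] * (states[k + 1] - states[k])
        u_next = states[j] - memory + scale * activation(K @ states[j] + b)
        states.append(u_next)
    return states


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = 4
    Ks = [rng.standard_normal((3, 3)) for _ in range(layers)]
    bs = [rng.standard_normal(3) for _ in range(layers)]
    out = fractional_forward(rng.standard_normal(3), Ks, bs, gamma=0.5, tau=0.1)
    print(len(out), out[-1].shape)
```

In this sketch, setting `gamma=1.0` makes every memory weight `a_m` with `m >= 1` vanish and the scale factor equal `tau`, so the update reduces to the residual-style step `u_{j+1} = u_j + tau * activation(K_j u_j + b_j)`, which matches the well-known analogy between existing DNNs and ODEs with a standard time derivative. For `gamma < 1` each new layer depends on the increments of all earlier layers.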

