OCTANE -- Optimal Control for Tensor-based Autoencoder Network Emergence: Explicit Case
This work addresses efficient training and automated architecture discovery for researchers and practitioners in deep learning, particularly in image processing. The contribution appears incremental, since it combines existing methods in a new way.
The paper tackles the problems of memory-intensive training and architecture design in autoencoder neural networks by developing OCTANE, a framework that integrates optimal control theory with low-rank tensor methods. The framework reduces memory usage and yields compact architectures, and it is demonstrated on image denoising and deblurring tasks.
This paper presents a novel, mathematically rigorous framework for autoencoder-type deep neural networks that combines optimal control theory and low-rank tensor methods to yield memory-efficient training and automated architecture discovery. The learning task is formulated as an optimization problem constrained by differential equations representing the encoder and decoder components of the network, and the corresponding optimality conditions are derived via a Lagrangian approach. Efficient memory compression is enabled by approximating differential equation solutions on low-rank tensor manifolds using an adaptive explicit integration scheme. These concepts are combined to form OCTANE (Optimal Control for Tensor-based Autoencoder Network Emergence) -- a unified training framework that yields compact autoencoder architectures, reduces memory usage, and enables effective learning, even with limited training data. The framework's utility is illustrated with applications to image denoising and deblurring tasks, and recommendations regarding the governing hyperparameters are provided.
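The low-rank integration idea in the abstract can be illustrated with a minimal sketch for the matrix case only. This is not the OCTANE algorithm: it assumes a simple step-then-truncate scheme (one explicit Euler step followed by a truncated SVD whose rank is chosen from a tolerance), and the names `truncate_svd`, `euler_step_truncated`, `integrate`, the right-hand side `rhs`, and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np


def truncate_svd(Y, tol):
    """Return SVD factors of the lowest-rank approximation of Y whose
    Frobenius-norm error is at most tol (rank is kept >= 1)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # tail[k] = Frobenius norm of the singular values s[k:], i.e. the
    # error made by keeping only the first k singular triplets.
    tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
    # tail is non-increasing, so counting entries above tol gives the
    # smallest admissible rank.
    r = max(int(np.sum(tail > tol)), 1)
    return U[:, :r], s[:r], Vt[:r, :]


def euler_step_truncated(Y, rhs, h, tol):
    """One explicit Euler step for dY/dt = rhs(Y), then rank truncation."""
    Y_full = Y + h * rhs(Y)
    U, s, Vt = truncate_svd(Y_full, tol)
    # The dense product is formed here only for brevity; a memory-saving
    # implementation would carry the factors instead.
    return (U * s) @ Vt, len(s)


def integrate(Y0, rhs, h, n_steps, tol):
    """Apply n_steps truncated Euler steps and record the rank after each."""
    Y, ranks = Y0, []
    for _ in range(n_steps):
        Y, r = euler_step_truncated(Y, rhs, h, tol)
        ranks.append(r)
    return Y, ranks


# Hypothetical example: a rank-2 initial state driven by a layer-like
# nonlinear right-hand side tanh(W @ Y), which tends to increase the rank.
rng = np.random.default_rng(0)
Y0 = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
W = 0.1 * rng.standard_normal((20, 20))
rhs = lambda Y: np.tanh(W @ Y)
Y_end, ranks = integrate(Y0, rhs, h=0.1, n_steps=10, tol=1e-2)
```

In this sketch the tolerance `tol` plays the role of the adaptivity parameter: a larger value discards more singular values per step and keeps the rank (and hence storage) lower, at the cost of a larger per-step approximation error. The list `ranks` shows how the retained rank evolves over the steps.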