Are Deep Neural Architectures Losing Information? Invertibility Is Indispensable
This work addresses a foundational issue in deep learning for tasks like image restoration, offering a theoretical insight that could guide future architecture design.
The paper tackles the problem of information loss in deep neural architectures by proving that invertibility is necessary for preserving maximum input details, and demonstrates that an Invertible Restoring Autoencoder (IRAE) outperforms non-invertible models on image restoration tasks with fewer parameters.
Ever since the advent of AlexNet, designing novel deep neural architectures for different tasks has consistently been a productive research direction. Despite the exceptional performance of various architectures in practice, we study a theoretical question: what is the condition for deep neural architectures to preserve all the information of the input data? Identifying the information lossless condition for deep neural architectures is important, because tasks such as image restoration require keep the detailed information of the input data as much as possible. Using the definition of mutual information, we show that: a deep neural architecture can preserve maximum details about the given data if and only if the architecture is invertible. We verify the advantages of our Invertible Restoring Autoencoder (IRAE) network by comparing it with competitive models on three perturbed image restoration tasks: image denoising, jpeg image decompression and image inpainting. Experimental results show that IRAE consistently outperforms non-invertible ones. Our model even contains far fewer parameters. Thus, it may be worthwhile to try replacing standard components of deep neural architectures, such as residual blocks and ReLU, with their invertible counterparts. We believe our work provides a unique perspective and direction for future deep learning research.