CV | May 23, 2024

IB-AdCSCNet: Adaptive Convolutional Sparse Coding Network Driven by Information Bottleneck

He Zou, Meng'en Qin, Yu Song, Xiaohui Yang

arXiv:2405.14192v12.01 citations | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the problem of improving generalization and robustness in deep learning models for computer vision tasks, though the advance appears incremental, since it combines existing theories into a hybrid method.

The paper tackles the challenge of retaining task-relevant information while discarding redundant data in neural networks. It introduces IB-AdCSCNet, a model based on information bottleneck theory that dynamically adjusts a trade-off hyperparameter. Experimental results on CIFAR-10 and CIFAR-100 show that it matches deep residual convolutional networks and outperforms them on corrupted data, indicating improved robustness.

In neural network models, a persistent challenge is retaining task-relevant information while effectively discarding redundant data during propagation. In this paper, we introduce IB-AdCSCNet, a deep learning model grounded in information bottleneck theory. IB-AdCSCNet integrates the information bottleneck trade-off strategy into deep networks by dynamically adjusting the trade-off hyperparameter $\lambda$ through gradient descent, updating it within the FISTA (Fast Iterative Shrinkage-Thresholding Algorithm) framework. By optimizing the compressive excitation loss function induced by the information bottleneck principle, IB-AdCSCNet achieves an optimal balance between compression and fitting at a global level, approximating the globally optimal representation feature. This information bottleneck trade-off strategy, driven by downstream tasks, not only helps to learn effective features of the data but also improves the generalization of the model. This study's contribution lies in presenting a model with consistent performance and offering a fresh perspective on merging deep learning with sparse representation theory, grounded in the information bottleneck concept. Experimental results on the CIFAR-10 and CIFAR-100 datasets demonstrate that IB-AdCSCNet not only matches the performance of deep residual convolutional networks but also outperforms them when handling corrupted data. Through the inference of the IB trade-off, the model's robustness is notably enhanced.
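
For readers unfamiliar with FISTA, the following is a minimal sketch of the textbook algorithm applied to the standard sparse coding objective $\frac{1}{2}\|y - Dz\|_2^2 + \lambda\|z\|_1$. It is not the paper's implementation. As simplifying assumptions, it uses a dense dictionary `D` in place of convolutional operators and keeps $\lambda$ (`lam`) fixed, whereas the abstract describes updating $\lambda$ by gradient descent. The names `fista`, `soft_threshold`, `D`, `y`, and `lam` are illustrative.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (elementwise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def fista(D, y, lam, n_iter=100):
    """Approximately minimize 0.5 * ||y - D z||^2 + lam * ||z||_1 over z.

    Assumptions (not from the paper): D is a dense matrix and lam is a
    fixed scalar. IB-AdCSCNet is described as learning lam instead.
    """
    # Step size 1/L, where L is the Lipschitz constant of the smooth
    # term's gradient (squared spectral norm of D).
    L = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    w = z.copy()  # extrapolated (momentum) point
    t = 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ w - y)                        # gradient of the data-fit term
        z_next = soft_threshold(w - grad / L, lam / L)  # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = z_next + ((t - 1.0) / t_next) * (z_next - z)  # momentum update
        z, t = z_next, t_next
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((20, 50))
    y = rng.standard_normal(20)
    for lam in (0.1, 1.0, 5.0):
        z = fista(D, y, lam, n_iter=200)
        print(lam, np.count_nonzero(z))
```

In this sketch, `lam` plays the role of the compression versus fitting trade-off: larger values shrink more coefficients to zero (more compression), while smaller values favor reconstruction of `y`. Treating that scalar as a trainable parameter is the idea the abstract attributes to IB-AdCSCNet.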
