LG AI CV

Sep 8, 2024

Enhancing Convolutional Neural Networks with Higher-Order Numerical Difference Methods

Qi Wang, Zijun Gao, Mingxiu Sui, Taiyuan Mei, Xiaohan Cheng, Iris Li

arXiv:2409.04977v113.412 citations

h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses performance limitations in CNNs due to model size constraints, offering a theoretically supported method for broader neural network applications.

The paper aims to improve Convolutional Neural Networks (CNNs) without increasing model size. It proposes a stacking scheme based on higher-order linear multi-step numerical difference methods and reports performance superior to existing schemes such as ResNet and HO-ResNet.

As deep learning has moved into practical applications, Convolutional Neural Networks (CNNs) have helped solve many real-world problems. Numerous network architectures have been explored to improve CNN performance; some are designed from researchers' accumulated experience, while others are found through neural architecture search. These methods yield significant improvements, but in practice most are limited by model size and environmental constraints, which makes it difficult to fully realize the gains. In recent years, research has found that many CNN structures can be interpreted as discretizations of ordinary differential equations. This implies that theoretically supported deep network structures can be designed using higher-order numerical difference methods. Most previous CNN structures, however, are based on low-order numerical methods. Because linear multi-step numerical difference methods are more accurate than the forward Euler method, this paper proposes a stacking scheme based on the linear multi-step method. The scheme enhances the performance of ResNet without increasing the model size, and the paper compares it with the Runge-Kutta scheme. Experimental results show that the proposed stacking scheme outperforms existing stacking schemes (ResNet and HO-ResNet) and can be extended to other types of neural networks.
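To make the ODE view concrete, here is a minimal sketch in plain Python. It assumes the standard reading in which a residual update x + f(x) corresponds to a forward Euler step, and it uses the two-step Adams-Bashforth method as one illustrative linear multi-step method. The abstract does not say which multi-step formula the paper adopts, so this choice, the toy equation dx/dt = -x, and all function names are assumptions for illustration rather than the authors' scheme.

```python
import math


def forward_euler(f, x0, h, steps):
    """Forward Euler: x_{n+1} = x_n + h * f(x_n), the residual-style update."""
    x = x0
    for _ in range(steps):
        x = x + h * f(x)
    return x


def adams_bashforth2(f, x0, h, steps):
    """Two-step Adams-Bashforth (illustrative multi-step method, steps >= 1):
    x_{n+1} = x_n + h * (1.5 * f(x_n) - 0.5 * f(x_{n-1})).
    The first step is bootstrapped with forward Euler. The previous value of
    f is cached, so each step still evaluates f only once.
    """
    f_prev = f(x0)
    x = x0 + h * f_prev  # bootstrap step
    for _ in range(steps - 1):
        f_curr = f(x)
        x = x + h * (1.5 * f_curr - 0.5 * f_prev)
        f_prev = f_curr
    return x


def decay(x):
    """Toy dynamics dx/dt = -x, with exact solution x(t) = exp(-t)."""
    return -x


steps = 10
h = 1.0 / steps
exact = math.exp(-1.0)

euler_error = abs(forward_euler(decay, 1.0, h, steps) - exact)
ab2_error = abs(adams_bashforth2(decay, 1.0, h, steps) - exact)

print(f"forward Euler error:     {euler_error:.6f}")
print(f"Adams-Bashforth 2 error: {ab2_error:.6f}")
```

Both integrators call the same function f once per step; the multi-step variant only reuses the value from the previous step. In a network, f would be a learned block, which suggests how a multi-step stacking rule could improve accuracy without adding parameters.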
