ZNorm: Z-Score Gradient Normalization Accelerating Skip-Connected Network Training without Architectural Modification
This work addresses training inefficiencies in deep learning models, particularly skip-connected architectures such as ResNets, through incremental improvements in gradient handling.
The paper tackles the problem of vanishing and exploding gradients in skip-connected deep neural networks by proposing ZNorm, a gradient normalization technique that accelerates training without architectural changes, achieving superior performance on CIFAR-10 and medical datasets with improved tumor prediction and segmentation accuracy.
Rapid advances in deep learning call for better training methods for deep neural networks (DNNs). As models grow in complexity, vanishing and exploding gradients impede performance, particularly in skip-connected architectures such as Deep Residual Networks. We propose Z-Score Normalization for Gradient Descent (ZNorm), a technique that adjusts only the gradients, leaving the network architecture unchanged, to accelerate training and improve model performance. ZNorm normalizes the overall gradients so that gradient scaling is consistent across layers, which reduces the risk of vanishing and exploding gradients. Extensive experiments on CIFAR-10 and medical datasets show that ZNorm consistently outperforms existing methods under the same experimental settings. In medical imaging applications, ZNorm significantly improves tumor prediction and segmentation accuracy, underscoring its practical utility. These findings highlight ZNorm's potential as a robust and versatile tool for training deep neural networks, especially skip-connected architectures, across a range of applications.
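To make the idea of z-score gradient normalization concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each layer's gradient is normalized separately by subtracting its mean and dividing by its standard deviation, with a small `eps` added for numerical stability. The helper names `znorm` and `sgd_step`, the `eps` value, the learning rate, and the use of plain Python lists in place of framework tensors are all illustrative assumptions.

```python
import math


def znorm(grad, eps=1e-8):
    """Z-score normalize one layer's gradient (hypothetical helper).

    Assumption: the gradient is a flat list of floats, and eps guards
    against division by zero when all entries are equal.
    """
    n = len(grad)
    mean = sum(grad) / n
    var = sum((g - mean) ** 2 for g in grad) / n  # population variance
    std = math.sqrt(var)
    return [(g - mean) / (std + eps) for g in grad]


def sgd_step(params, grads, lr=0.1):
    """Plain SGD update that uses normalized gradients, layer by layer.

    params and grads are lists of layers; each layer is a flat list.
    The network itself is untouched: only the gradients are rescaled.
    """
    new_params = []
    for weights, grad in zip(params, grads):
        grad_hat = znorm(grad)
        new_params.append([w - lr * g for w, g in zip(weights, grad_hat)])
    return new_params


# Two toy "layers" whose raw gradients differ by ten orders of magnitude.
small_grad = [1e-4, -2e-4, 3e-4, 0.5e-4]
large_grad = [1e6, -2e6, 3e6, 0.5e6]

params = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
updated = sgd_step(params, [small_grad, large_grad])

# After normalization both layers receive updates of comparable size.
print(updated[0])
print(updated[1])
```

Because the normalized gradient has roughly zero mean and unit standard deviation regardless of its original scale, the two layers in this toy example receive nearly identical updates even though their raw gradients differ enormously. In a real training loop the same transformation would be applied to each parameter's gradient tensor between the backward pass and the optimizer step.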