LG AI | May 6, 2024

QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation

Chenhui Xu, Xinyao Wang, Fuxun Yu, Jinjun Xiong, Xiang Chen

arXiv:2405.03192v2 | 6.43 citations

Originality: Incremental advance

AI Analysis

This work addresses training-efficiency challenges for machine learning researchers and practitioners by enabling more sustainable training of high-order models. The advance is incremental, since it builds on existing quadratic neural network concepts.

The paper tackles the problem of high overhead in training high-order neural networks by introducing QuadraNet V2, which uses quadratic adaptation to integrate pre-trained weights, reducing GPU training time by 90% to 98.4% compared to training from scratch.

Machine learning is evolving towards high-order models that necessitate pre-training on extensive datasets, a process associated with significant overheads. Traditional models, despite having pre-trained weights, are becoming obsolete due to architectural differences that obstruct the effective transfer and initialization of these weights. To address these challenges, we introduce a novel framework, QuadraNet V2, which leverages quadratic neural networks to create efficient and sustainable high-order learning models. Our method initializes the primary term of the quadratic neuron using a standard neural network, while the quadratic term is employed to adaptively enhance the learning of data non-linearity or shifts. This integration of pre-trained primary terms with quadratic terms, which possess advanced modeling capabilities, significantly augments the information characterization capacity of the high-order network. By utilizing existing pre-trained weights, QuadraNet V2 reduces the required GPU hours for training by 90% to 98.4% compared to training from scratch, demonstrating both efficiency and effectiveness.
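
The idea in the abstract, a primary term taken from a pre-trained network plus a quadratic term that adapts afterwards, can be illustrated with a single neuron. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes a quadratic neuron of the form y = (w_a · x)(w_b · x) + w_c · x + b, which the abstract does not spell out. The class name `QuadraticNeuron`, the zero initialization of `w_a`, and the choice to keep the primary term frozen during adaptation are all hypothetical choices made for illustration.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


class QuadraticNeuron:
    """Hypothetical neuron: y = (w_a . x) * (w_b . x) + w_c . x + b (assumed form)."""

    def __init__(self, pretrained_w, pretrained_b, seed=0):
        rng = random.Random(seed)
        n = len(pretrained_w)
        # Primary (linear) term: copied from the pre-trained neuron.
        self.w_c = list(pretrained_w)
        self.b = pretrained_b
        # Quadratic term (assumption): w_a starts at zero so the product
        # vanishes at initialization; w_b gets small random values so that
        # w_a still receives a nonzero gradient.
        self.w_a = [0.0] * n
        self.w_b = [rng.uniform(-0.1, 0.1) for _ in range(n)]

    def primary(self, x):
        return dot(self.w_c, x) + self.b

    def quadratic(self, x):
        return dot(self.w_a, x) * dot(self.w_b, x)

    def forward(self, x):
        return self.quadratic(x) + self.primary(x)

    def adapt_step(self, x, target, lr=0.01):
        """One SGD step on squared error, updating only the quadratic term.

        Returns the loss measured before the update.
        """
        err = self.forward(x) - target
        a, b = dot(self.w_a, x), dot(self.w_b, x)
        # d(loss)/d(w_a) = 2 * err * b * x,  d(loss)/d(w_b) = 2 * err * a * x
        grad_a = [2 * err * b * xi for xi in x]
        grad_b = [2 * err * a * xi for xi in x]
        self.w_a = [w - lr * g for w, g in zip(self.w_a, grad_a)]
        self.w_b = [w - lr * g for w, g in zip(self.w_b, grad_b)]
        return err * err


# Usage: start from made-up "pre-trained" weights, then adapt to a shifted target.
neuron = QuadraticNeuron([0.5, -1.0, 2.0], 0.1)
x = [1.0, 2.0, 3.0]
output_at_init = neuron.forward(x)
pretrained_output = neuron.primary(x)
losses = [neuron.adapt_step(x, target=5.0) for _ in range(200)]
```

Under these assumptions the neuron reproduces the pre-trained output exactly at initialization, and only the quadratic weights move when it is fitted to the shifted target. The real framework operates on full network layers, and its initialization and training schedule may differ from this sketch.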

