LG | Dec 19, 2024

Lorentzian Residual Neural Networks

arXiv:2412.14695v2 | 18.816 citations | h-index: 8 | Has Code | KDD

Originality: Highly original

AI Analysis

The paper addresses a specific bottleneck in hyperbolic neural networks for hierarchical data modeling: adding residual connections without extra complexity or numerical instability. It offers a generally applicable method with improved stability and efficiency.

The paper tackles limitations in hyperbolic residual networks (increased complexity, instability, mapping errors) by introducing LResNet, a Lorentzian residual neural network using weighted Lorentzian centroids, which demonstrates superior performance and robustness in graph and vision tasks compared to state-of-the-art alternatives.

Hyperbolic neural networks have emerged as a powerful tool for modeling hierarchical data structures prevalent in real-world datasets. Notably, residual connections, which facilitate the direct flow of information across layers, have been instrumental in the success of deep neural networks. However, current methods for constructing hyperbolic residual networks suffer from limitations such as increased model complexity, numerical instability, and errors due to multiple mappings to and from the tangent space. To address these limitations, we introduce LResNet, a novel Lorentzian residual neural network based on the weighted Lorentzian centroid in the Lorentz model of hyperbolic geometry. Our method enables the efficient integration of residual connections in Lorentz hyperbolic neural networks while preserving their hierarchical representation capabilities. We demonstrate that our method can theoretically derive previous methods while offering improved stability, efficiency, and effectiveness. Extensive experiments on both graph and vision tasks showcase the superior performance and robustness of our method compared to state-of-the-art Euclidean and hyperbolic alternatives. Our findings highlight the potential of LResNet for building more expressive neural networks in hyperbolic embedding space as a generally applicable method to multiple architectures, including CNNs, GNNs, and graph Transformers.
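The abstract names the weighted Lorentzian centroid as the basis of the residual connection but does not spell out a formula. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard weighted Lorentzian centroid on the hyperboloid with curvature K < 0 (points satisfying <x, x>_L = 1/K), and the function names, the weights `w_x` and `w_y`, and the stand-in layer output `fx` are all hypothetical. The point it shows is that the weighted sum is rescaled directly back onto the hyperboloid, with no mapping to or from the tangent space.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift_to_hyperboloid(v, K=-1.0):
    """Attach a time coordinate so that <x, x>_L = 1/K (assumes K < 0)."""
    x0 = math.sqrt(-1.0 / K + sum(a * a for a in v))
    return [x0] + list(v)

def lorentz_residual(x, fx, w_x=1.0, w_y=1.0, K=-1.0):
    """Assumed residual form: weighted Lorentzian centroid of x and f(x)."""
    # Weighted sum in the ambient space; generally not on the hyperboloid.
    u = [w_x * a + w_y * b for a, b in zip(x, fx)]
    # Rescale by the Lorentzian norm to land back on <z, z>_L = 1/K.
    norm = math.sqrt(abs(lorentz_inner(u, u)))
    scale = 1.0 / (math.sqrt(-K) * norm)
    return [scale * a for a in u]

x = lift_to_hyperboloid([0.3, -0.2])
fx = lift_to_hyperboloid([1.0, 0.5])  # hypothetical layer output f(x)
z = lorentz_residual(x, fx)
print(lorentz_inner(z, z))  # should be close to 1/K = -1, up to rounding
```

With positive weights, the sum of two points on the upper sheet stays timelike, so the normalization is well defined and the result keeps a positive time coordinate.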
