LG | AI | Jul 1, 2024

Hypformer: Exploring Efficient Transformer Fully in Hyperbolic Space

arXiv:2407.01290v2 | 49 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the problem of scaling hyperbolic Transformers for large-scale data representation, offering a novel solution for domains with hierarchical structures, though it is incremental in adapting existing Transformer concepts to hyperbolic geometry.

The authors tackled the challenge of developing a complete Transformer model in hyperbolic space by proposing Hypformer, which introduces foundational blocks and a linear self-attention mechanism, enabling processing of billion-scale graph data and long sequences with improved efficiency.

Hyperbolic geometry has shown significant potential in modeling complex structured data, particularly data with underlying tree-like and hierarchical structures. Despite the impressive performance of various hyperbolic neural networks across numerous domains, research on adapting the Transformer to hyperbolic space remains limited. Previous attempts have mainly focused on modifying self-attention modules in the Transformer. However, these efforts have fallen short of developing a complete hyperbolic Transformer. This stems primarily from: (i) the absence of well-defined modules in hyperbolic space, including linear transformation layers, LayerNorm layers, activation functions, dropout operations, etc.; (ii) the quadratic time complexity of the existing hyperbolic self-attention module w.r.t. the number of input tokens, which hinders its scalability. To address these challenges, we propose Hypformer, a novel hyperbolic Transformer based on the Lorentz model of hyperbolic geometry. In Hypformer, we introduce two foundational blocks that define the essential modules of the Transformer in hyperbolic space. Furthermore, we develop a linear self-attention mechanism in hyperbolic space, enabling a hyperbolic Transformer to process billion-scale graph data and long-sequence inputs for the first time. Our experimental results confirm the effectiveness and efficiency of Hypformer across various datasets, demonstrating its potential as an effective and scalable solution for large-scale data representation and large models.
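The two ideas the abstract relies on, points in the Lorentz model and attention that scales linearly in the number of tokens, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the ReLU-based feature map, and the choice to run attention on space-like coordinates and then recompute the time coordinate are all assumptions made here for illustration. The sketch shows that summing key-value products once gives the same result as the pairwise (quadratic) form for a kernel-style attention, and that a lifted point satisfies the hyperboloid constraint.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lift_to_hyperboloid(space, curvature=-1.0):
    """Prepend a time coordinate so that <x, x>_L = 1/curvature (curvature < 0)."""
    time = math.sqrt(sum(v * v for v in space) - 1.0 / curvature)
    return [time] + list(space)


def feature_map(v):
    # Illustrative positive feature map (an assumption, not taken from the paper).
    return [max(a, 0.0) + 1e-6 for a in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """Kernel attention computed pairwise: O(N^2) in the number of tokens."""
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = feature_map(q)
        weights = [dot(fq, feature_map(k)) for k in K]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(d_v)])
    return out


def linear_attention(Q, K, V):
    """Same kernel attention, reordered: summarize keys/values once, O(N)."""
    d, d_v = len(K[0]), len(V[0])
    kv = [[0.0] * d_v for _ in range(d)]  # sum over tokens of phi(k) v^T
    k_sum = [0.0] * d                     # sum over tokens of phi(k)
    for k, v in zip(K, V):
        fk = feature_map(k)
        for i in range(d):
            k_sum[i] += fk[i]
            for j in range(d_v):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = feature_map(q)
        norm = dot(fq, k_sum)
        out.append([sum(fq[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(d_v)])
    return out
```

In this sketch, each output row of `linear_attention` can be passed to `lift_to_hyperboloid` to obtain a point whose Lorentzian self-inner product equals `1/curvature`. The reordering only works because the attention weights factor through `feature_map`; softmax attention does not factor this way.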

Code Implementations: 1 repo