Low-distortion and GPU-compatible Tree Embeddings in Hyperbolic Space
This work improves hyperbolic embeddings of hierarchical data and makes them easier to integrate into deep learning applications, though the contribution is incremental because it builds on existing combinatorial constructions.
The paper tackles the problem of embedding tree-like data in hyperbolic space by addressing two key limitations of existing combinatorial constructions: poor separation of points on the hypersphere, which leads to high distortion, and the incompatibility of multiple precision arithmetic with GPU acceleration. The proposed HS-DTE method achieves lower distortion and remains usable on GPUs by relying on floating point expansion arithmetic.
Embedding tree-like data, from hierarchies to ontologies and taxonomies, forms a well-studied problem for representing knowledge across many domains. Hyperbolic geometry provides a natural solution for embedding trees, with vastly superior performance over Euclidean embeddings. Recent literature has shown that hyperbolic tree embeddings can even be placed on top of neural networks for hierarchical knowledge integration in deep learning settings. For all applications, a faithful embedding of trees is needed, with combinatorial constructions emerging as the most effective direction. This paper identifies and solves two key limitations of existing works. First, the combinatorial construction hinges on finding highly separated points on a hypersphere, a notoriously difficult problem. Current approaches achieve poor separation, degrading the quality of the corresponding hyperbolic embedding. We propose highly separated Delaunay tree embeddings (HS-DTE), which integrate angular separation in a generalized formulation of Delaunay embeddings, leading to lower embedding distortion. Second, low distortion requires additional precision. The current approach for increasing precision is to use multiple precision arithmetic, which renders the embeddings useless on GPUs in deep learning settings. We reformulate the combinatorial construction using floating point expansion arithmetic, leading to superior embedding quality while retaining utility on accelerated hardware.
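To make the idea of floating point expansion arithmetic more concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It shows the standard error-free TwoSum transformation and a simple way to grow an expansion, where a number is stored as an unevaluated sum of ordinary floats. The function names `two_sum`, `grow_expansion`, and `exact_value` are hypothetical and chosen for illustration only. Because these routines use only ordinary float additions and subtractions, the same pattern can in principle be applied elementwise to arrays on accelerated hardware, which is the property the abstract relies on.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free addition (Knuth's TwoSum).

    Returns (s, e) where s is the rounded float sum of a and b,
    and e is the rounding error, so that s + e equals a + b exactly.
    """
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


def grow_expansion(expansion, b):
    """Add a float b to an expansion (a list of floats whose exact sum
    is the represented value). Illustrative sketch: zero terms are kept
    and no renormalization is performed.
    """
    out = []
    q = b
    for term in expansion:
        q, err = two_sum(q, term)
        out.append(err)  # low-order part that a single float would lose
    out.append(q)  # leading (largest) component
    return out


def exact_value(expansion):
    """Exact rational value of an expansion, used only for checking."""
    return sum((Fraction(x) for x in expansion), Fraction(0))


# A single float cannot hold 1 + 1e-20, but a two-term expansion can.
s, e = two_sum(1.0, 1e-20)
print(s, e)

expansion = grow_expansion([e, s], 1e-40)
print(expansion)
```

The check with `Fraction` is only there to confirm that no information is lost. A practical implementation would stay entirely in floating point and would typically also renormalize the expansion to keep the number of terms fixed.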