Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces
This work addresses the challenge of aligning hierarchical data without external knowledge, a known bottleneck in applications such as natural language processing and bioinformatics, and proposes a novel method for it.
The paper tackles the problem of unsupervised alignment of hierarchical data by proposing a geometric approach using optimal transport over hyperbolic spaces, which outperforms standard embedding alignment techniques in cross-lingual WordNet alignment and ontology matching tasks.
This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This is a problem that appears across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks.
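To make the two ingredients named above concrete, the following is a minimal sketch, not the paper's actual algorithm: it computes pairwise geodesic distances in the Poincaré ball model of hyperbolic space and then solves an entropy-regularized optimal transport problem with plain Sinkhorn iterations. The function names (`poincare_distance`, `sinkhorn`), the squared-distance cost, the regularization value, and the toy points are all assumptions chosen for illustration. The toy data also places both point sets in the same coordinate system, which sidesteps the harder part of the problem, namely that independently trained embeddings are generally not aligned.

```python
import numpy as np

def poincare_distance(X, Y, eps=1e-9):
    """Pairwise geodesic distances in the Poincare ball between rows of X and Y."""
    sq_x = np.sum(X ** 2, axis=1)[:, None]          # squared norms of X, shape (n, 1)
    sq_y = np.sum(Y ** 2, axis=1)[None, :]          # squared norms of Y, shape (1, m)
    sq_diff = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    denom = np.maximum((1.0 - sq_x) * (1.0 - sq_y), eps)
    arg = 1.0 + 2.0 * sq_diff / denom
    return np.arccosh(np.maximum(arg, 1.0))         # clamp guards against rounding below 1

def sinkhorn(cost, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between uniform marginals (basic Sinkhorn)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)                         # uniform source weights
    b = np.full(m, 1.0 / m)                         # uniform target weights
    K = np.exp(-cost / reg)                         # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Hypothetical toy data: four points inside the unit disk, and a shuffled,
# slightly shrunk copy standing in for a second hierarchy's embedding.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.8], [-0.6, -0.3]])
perm = np.array([2, 0, 3, 1])
Y = 0.95 * X[perm]

cost = poincare_distance(X, Y) ** 2                 # squared hyperbolic distance as ground cost
plan = sinkhorn(cost)
matches = plan.argmax(axis=1)                       # for each row of X, the most likely row of Y
print(matches)
```

Reading the correspondences off the transport plan with a row-wise `argmax` is the simplest possible decoding; it works here because the plan is close to a permutation matrix, which need not hold for real, noisy hierarchies.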