CL · Sep 30, 2023

RelBERT: Embedding Relations with Language Models

Asahi Ushio, Jose Camacho-Collados, Steven Schockaert

arXiv:2310.00299v2 · 0.93 citations · h-index: 40 · Has Code

Originality: Highly original

AI Analysis

RelBERT addresses the need for efficient and controllable relation embeddings in applications that require background knowledge, offering a novel alternative to knowledge graphs, which are incomplete, and to large language models, which are inefficient and difficult to control.

The paper tackles the problem of capturing fine-grained relational knowledge by proposing RelBERT, a method that fine-tunes small masked language models such as RoBERTa on a small amount of training data. The resulting model sets a new state of the art on analogy benchmarks and outperforms much larger language models.

Many applications need access to background knowledge about how different concepts and entities are related. Although Knowledge Graphs (KG) and Large Language Models (LLM) can address this need to some extent, KGs are inevitably incomplete and their relational schema is often too coarse-grained, while LLMs are inefficient and difficult to control. As an alternative, we propose to extract relation embeddings from relatively small language models. In particular, we show that masked language models such as RoBERTa can be straightforwardly fine-tuned for this purpose, using only a small amount of training data. The resulting model, which we call RelBERT, captures relational similarity in a surprisingly fine-grained way, allowing us to set a new state-of-the-art in analogy benchmarks. Crucially, RelBERT is capable of modelling relations that go well beyond what the model has seen during training. For instance, we obtained strong results on relations between named entities with a model that was only trained on lexical relations between concepts, and we observed that RelBERT can recognise morphological analogies despite not being trained on such examples. Overall, we find that RelBERT significantly outperforms strategies based on prompting language models that are several orders of magnitude larger, including recent GPT-based models and open source models.
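As a rough illustration of how relation embeddings could be used on an analogy benchmark, the sketch below picks the candidate word pair whose relation embedding is closest to the query pair's embedding under cosine similarity. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `cosine` and `solve_analogy` are hypothetical, the use of cosine similarity as the scoring rule is an assumption, and the vectors are hand-written placeholders rather than output from RelBERT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(query_vec, candidate_vecs):
    """Return the index of the candidate most similar to the query.

    Each vector stands for the relation embedding of one word pair.
    """
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


# Placeholder vectors for illustration only (NOT produced by any model).
# In practice each would come from a relation encoder applied to a word pair.
query = [0.9, 0.1, 0.0]
candidates = [
    [0.0, 1.0, 0.2],
    [0.8, 0.2, 0.1],
    [-0.5, 0.3, 0.9],
]

best = solve_analogy(query, candidates)
print(best)
```

With real embeddings, the query would be the embedding of the question pair and each candidate the embedding of one answer option.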
