CL, Sep 16, 2021

Do Language Models Know the Way to Rome?

arXiv:2109.07971v1, 663 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of evaluating global knowledge in language models, a question relevant to NLP researchers, but it is incremental: it builds on existing probing methods and uses geography as a test case.

The paper investigates whether language models encode global geographic knowledge, such as the relative positions of cities, for example by testing whether a model that is told where Paris and Berlin are can infer the location of Rome. The results show that language models encode only limited geographic information, with larger models performing best, which suggests that such knowledge can be induced from higher-order co-occurrence statistics.

The global geometry of language models is important for a range of applications, but language model probes tend to evaluate rather local relations, for which ground truths are easily obtained. In this paper we exploit the fact that in geography, ground truths are available beyond local relations. In a series of experiments, we evaluate the extent to which language model representations of city and country names are isomorphic to real-world geography, e.g., if you tell a language model where Paris and Berlin are, does it know the way to Rome? We find that language models generally encode limited geographic information, but with larger models performing the best, suggesting that geographic knowledge can be induced from higher-order co-occurrence statistics.
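
As a rough illustration of the question in the abstract (given where some cities are, can the location of a held-out city be read off the representations?), the sketch below fits a linear map from embeddings to coordinates on five cities and predicts Rome. It is not the paper's procedure: the embeddings are synthetic stand-ins built from the true coordinates, the function names (`toy_embeddings`, `fit_linear_probe`, `predict`) are hypothetical, and the coordinates are approximate.

```python
import numpy as np

# Approximate (latitude, longitude) in degrees.
CITIES = {
    "Paris": (48.86, 2.35),
    "Berlin": (52.52, 13.40),
    "Madrid": (40.42, -3.70),
    "Vienna": (48.21, 16.37),
    "London": (51.51, -0.13),
    "Rome": (41.90, 12.50),
}


def toy_embeddings(coords, dim=8, noise=0.0, seed=0):
    """Synthetic stand-in for model representations (an assumption):
    a random linear image of the true coordinates plus optional noise."""
    rng = np.random.default_rng(seed)
    projection = rng.normal(size=(2, dim))
    jitter = noise * rng.normal(size=(coords.shape[0], dim))
    return coords @ projection + jitter


def add_bias(X):
    """Append a constant column so the fitted map is affine."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_linear_probe(X, Y):
    """Minimum-norm least-squares map from embeddings X (n, d) to coordinates Y (n, 2)."""
    # An explicit rcond keeps numerically zero singular values from being inverted.
    W, *_ = np.linalg.lstsq(add_bias(X), Y, rcond=1e-10)
    return W


def predict(W, X):
    """Apply a fitted probe to embeddings X (n, d)."""
    return add_bias(X) @ W


names = list(CITIES)
coords = np.array([CITIES[name] for name in names])
emb = toy_embeddings(coords)

rome = names.index("Rome")
train = [i for i in range(len(names)) if i != rome]

# Fit on every city except Rome, then ask the probe where Rome is.
W = fit_linear_probe(emb[train], coords[train])
pred = predict(W, emb[[rome]])[0]
error_deg = float(np.linalg.norm(pred - coords[rome]))
print(f"predicted Rome: {pred.round(2)}, error: {error_deg:.2e} degrees")
```

Because the toy embeddings are an exact linear image of the coordinates, the held-out error here is essentially zero. With real model representations in place of `toy_embeddings`, the size of that error is one way to gauge how far the representation space is from being isomorphic to the map.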

