On the Scaling Laws of Geographical Representation in Language Models
This work addresses the problem of understanding and mitigating geographical biases in language models for AI fairness and representation.
The paper investigates how geographical knowledge in language models scales with model size. It finds that such knowledge is present even in small models and scales consistently with size, but that larger models do not reduce the geographical biases inherent to the training data.
Language models have long been shown to embed geographical information in their hidden representations. This line of work has recently been revisited, with the result extended to Large Language Models (LLMs). In this paper, we propose to bridge the gap between the well-established and the recent literature by observing how geographical knowledge evolves as language models are scaled. We show that geographical knowledge is observable even in tiny models, and that it scales consistently as model size increases. Notably, we observe that larger language models cannot mitigate the geographical bias that is inherent to the training data.