CL AI May 21, 2024

Quantifying Semantic Emergence in Language Models

arXiv:2405.12617v2, 2 citations, h-index: 4, ACL
Originality: Incremental advance
AI Analysis

This work provides a domain-specific tool for researchers analyzing semantic emergence in LLMs. The advance is incremental because it builds on existing entropy and mutual information concepts.

The authors tackled the lack of a metric to quantify language models' semantic capture by introducing Information Emergence (IE), a lightweight estimator that measures entropy reduction across transformer layers, revealing both expected and unexpected patterns in synthetic and natural contexts.

Large language models (LLMs) are widely recognized for their exceptional capacity to capture semantic meaning. Yet there remains no established metric to quantify this capability. In this work, we introduce a quantitative metric, Information Emergence (IE), designed to measure LLMs' ability to extract semantics from input tokens. We formalize "semantics" as the meaningful information abstracted from a sequence of tokens and quantify it by comparing the entropy reduction observed for a sequence of tokens (macro-level) with that observed for individual tokens (micro-level). To achieve this, we design a lightweight estimator that computes the mutual information at each transformer layer and is agnostic to task and language model architecture. We apply IE in both synthetic in-context learning (ICL) scenarios and natural sentence contexts. Experiments show that IE is informative and reveals patterns about semantics. Some of these patterns confirm conventional linguistic knowledge, while others are relatively unexpected and may provide new insights.
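The macro-versus-micro comparison described in the abstract can be illustrated with a toy sketch. It is not the paper's lightweight estimator. It assumes discrete, hashable stand-ins for layer representations, uses a simple plug-in estimate of mutual information, and defines a hypothetical `emergence_score` as the macro-level mutual information minus the mean micro-level mutual information. The function names and this exact form of the score are assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy, in bits, of a list of hashable samples."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of samples")
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def emergence_score(macro_pair, micro_pairs):
    """Hypothetical score: macro-level MI minus the mean micro-level MI.

    macro_pair  -- (layer_in, layer_out) samples for the whole sequence
    micro_pairs -- list of (layer_in, layer_out) samples, one per token
    """
    macro_mi = mutual_information(*macro_pair)
    micro_mis = [mutual_information(a, b) for a, b in micro_pairs]
    return macro_mi - sum(micro_mis) / len(micro_mis)


# Toy data: the sequence-level states are perfectly predictable across
# layers, while the single token-level states are independent.
macro = ([0, 1, 0, 1], [0, 1, 0, 1])
micro = [([0, 0, 1, 1], [0, 1, 0, 1])]
score = emergence_score(macro, micro)
```

In this toy setup a positive score means the sequence-level states share more information across layers than the token-level states do on average. Real hidden states are continuous and high-dimensional, so a plug-in count-based estimate like this one would not apply to them directly.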

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
