CV, CL
Sep 13, 2021

Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color

arXiv:2109.06129v2682 citations
AI Analysis

This paper addresses how language models capture grounded perceptual information, a question relevant to researchers in NLP and cognitive science. The contribution is incremental: it builds on prior work on relational encoding.

The study investigated whether language models can encode perceptual structure without grounding, using color as a case study, and found significant alignment between text-derived color term representations and perceptual color space, with warmer colors showing better alignment than cooler ones.

Pretrained language models have been shown to encode relational information, such as the relations between entities or concepts in knowledge bases, e.g. (Paris, Capital, France). However, simple relations of this type can often be recovered heuristically, and the extent to which models implicitly reflect topological structure that is grounded in the world, such as perceptual structure, is unknown. To explore this question, we conduct a thorough case study on color. Namely, we employ a dataset of monolexemic color terms and color chips represented in CIELAB, a color space with a perceptually meaningful distance metric. Using two methods of evaluating the structural alignment of colors in this space with text-derived color term representations, we find significant correspondence. Analyzing the differences in alignment across the color spectrum, we find that warmer colors are, on average, better aligned to the perceptual color space than cooler ones, suggesting an intriguing connection to findings from recent work on efficient communication in color naming. Further analysis suggests that differences in alignment are, in part, mediated by collocationality and differences in syntactic usage, posing questions as to the relationship between color perception and usage and context.
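The abstract does not name its two alignment methods, so the following is only a minimal sketch of one plausible way to compare structure across the two spaces: rank-correlate the pairwise distances between colors in CIELAB with the pairwise distances between the corresponding text-derived vectors. The function names (`pairwise_distances`, `rank`, `structural_alignment`), the approximate CIELAB coordinates, and the toy embeddings are all illustrative assumptions, not the authors' data or procedure.

```python
import math
from itertools import combinations

def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]

def rank(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def structural_alignment(lab_points, embeddings):
    """Spearman correlation between the two sets of pairwise distances."""
    return pearson(rank(pairwise_distances(lab_points)),
                   rank(pairwise_distances(embeddings)))

# Approximate CIELAB (L*, a*, b*) values for four terms; illustrative only.
LAB = {
    "red":    (53.2, 80.1, 67.2),
    "green":  (87.7, -86.2, 83.2),
    "blue":   (32.3, 79.2, -107.9),
    "yellow": (97.1, -21.6, 94.5),
}

# Made-up 3-d "embeddings" standing in for text-derived representations.
TOY_EMBEDDINGS = {
    "red":    (0.9, 0.1, 0.3),
    "green":  (0.1, 0.8, 0.4),
    "blue":   (0.2, 0.1, 0.9),
    "yellow": (0.6, 0.7, 0.2),
}

terms = sorted(LAB)
score = structural_alignment([LAB[t] for t in terms],
                             [TOY_EMBEDDINGS[t] for t in terms])
```

A score near 1 would mean that colors which are perceptually close in CIELAB are also close in the text-derived space. Because CIELAB is built so that Euclidean distance approximates perceived difference, plain `math.dist` is a reasonable stand-in for the perceptual metric in this sketch.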
