CL | AI | Nov 12, 2024

ExpressivityArena: Can LLMs Express Information Implicitly?

arXiv:2411.08010v1 | 4 citations | h-index: 21
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for better evaluation of LLM expressivity. Its contribution is incremental: it provides a new framework and library for testing, but does not introduce a novel method or achieve broad SOTA results.

The paper tackles the problem of evaluating how well Large Language Models (LLMs) can express the implicit language cues used in human communication. It introduces ExpressivityArena, a Python library for measuring LLM expressivity, and finds that LLMs can generate and understand expressive content, but with limitations.

While Large Language Models (LLMs) have demonstrated remarkable performance in certain dimensions, their ability to express the implicit language cues that humans use for effective communication remains unclear. This paper presents ExpressivityArena, a Python library for measuring the implicit communication abilities of LLMs. We provide a comprehensive framework to evaluate the expressivity of arbitrary LLMs and explore its practical implications. To this end, we refine the definition and measurements of "expressivity" and use our framework in a set of small experiments. These experiments test LLMs on creative and logical tasks such as poetry, coding, and emotion-based responses. The outputs are then evaluated through ExpressivityArena by an automated grader, which we verify to be the most pragmatic option for testing expressivity. Building on these experiments, we deepen our understanding of the expressivity of LLMs by assessing their ability to remain expressive in conversations. Our findings indicate that LLMs are capable of generating and understanding expressive content, albeit with some limitations. These insights will inform the future development and deployment of expressive LLMs. We provide the code for ExpressivityArena alongside our paper.
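
The evaluation loop described in the abstract (a model produces content that carries an implicit signal, and an automated grader tries to recover that signal) can be illustrated with a minimal sketch. This is not the ExpressivityArena API: the names `expressivity_score`, `toy_generate`, and `toy_grade` are hypothetical, and the toy stand-ins replace real LLM calls so the example runs on its own. Scoring by the fraction of signals the grader recovers is also an assumption made for illustration.

```python
from typing import Callable, Sequence

# Hypothetical signatures, assumed for illustration only:
#   generate(task, signal) -> text meant to convey `signal`
#   grade(text, options)   -> the option the grader believes the text conveys
Generator = Callable[[str, str], str]
Grader = Callable[[str, Sequence[str]], str]


def expressivity_score(
    generate: Generator, grade: Grader, task: str, signals: Sequence[str]
) -> float:
    """Return the fraction of signals the grader recovers from generated text."""
    if not signals:
        raise ValueError("signals must not be empty")
    correct = 0
    for signal in signals:
        text = generate(task, signal)   # the model tries to express the signal
        guess = grade(text, signals)    # the grader picks one of the candidates
        correct += int(guess == signal)
    return correct / len(signals)


# Toy stand-ins so the sketch runs without an LLM. This generator states the
# signal outright; a real generator would be asked to convey it implicitly.
def toy_generate(task: str, signal: str) -> str:
    return f"A {task} written in a {signal} tone."


def toy_grade(text: str, options: Sequence[str]) -> str:
    for option in options:
        if option in text:
            return option
    return options[0]  # fall back to the first candidate


score = expressivity_score(
    toy_generate, toy_grade, "poem", ["joyful", "angry", "melancholy"]
)
print(f"expressivity score: {score:.2f}")
```

In practice, `generate` and `grade` would each wrap a call to an LLM, and a score well above chance (one over the number of candidate signals) would suggest that the signal survives in the generated text.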
