Can LLMs Interpret and Leverage Structured Linguistic Representations? A Case Study with AMRs
This work addresses how to improve LLM performance with structured linguistic input, a question relevant to both researchers and practitioners. Its contribution is incremental, since it builds on existing methods and applies them to specific tasks.
This paper investigates whether Large Language Models (LLMs) can use structured linguistic representations such as Abstract Meaning Representations (AMRs) to improve performance on language tasks. It finds that AMR augmentation often degrades performance on short-context tasks but improves it on long-context tasks; in dialogue summarization on SAMSum, for example, the zero-shot cosine similarity of Llama 3.1 rises from 66% to 76%.
This paper evaluates the ability of Large Language Models (LLMs) to leverage contextual information in the form of structured linguistic representations. Specifically, we examine the impact of encoding both short and long contexts using Abstract Meaning Representation (AMR) structures across a diverse set of language tasks. We perform our analysis using 8-bit quantized and instruction-tuned versions of Llama 3.1 (8B), Phi-3, and Mistral 7B. Our results indicate that, for tasks involving short contexts, augmenting the prompt with the AMR of the original language context often degrades the performance of the underlying LLM. However, for tasks that involve long contexts, such as dialogue summarization in the SAMSum dataset, this enhancement improves LLM performance, for example, by increasing the zero-shot cosine similarity score of Llama 3.1 from 66% to 76%. This improvement is more evident in the newer and larger LLMs, but does not extend to the older or smaller ones. In addition, we observe that LLMs can effectively reconstruct the original text from a linearized AMR, achieving a cosine similarity of 81% in the best-case scenario.
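To make the setup concrete, the following is a minimal sketch of what augmenting a prompt with a linearized AMR and scoring an output by cosine similarity could look like. It is an illustration under stated assumptions, not the paper's pipeline: the prompt wording, the function names (build_prompt, bag_of_words, cosine_similarity), and the bag-of-words vectors are all hypothetical, and the abstract does not say which text representation underlies the reported cosine similarity scores. The AMR shown is the standard textbook graph for "The boy wants to go."

```python
import math
from collections import Counter
from typing import Optional


def build_prompt(dialogue: str, amr: Optional[str] = None) -> str:
    """Build a zero-shot summarization prompt.

    If `amr` is given, the linearized graph is appended to the context.
    The prompt wording is an assumption, not taken from the paper.
    """
    parts = ["Summarize the following dialogue.", "", "Dialogue:", dialogue]
    if amr is not None:
        parts += ["", "Linearized AMR of the dialogue:", amr]
    parts += ["", "Summary:"]
    return "\n".join(parts)


def bag_of_words(text: str) -> Counter:
    """Toy text vector: lowercase whitespace-token counts.

    This stands in for whatever vectors the paper uses (an assumption).
    """
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


# Usage: build the plain and AMR-augmented prompts for the same context.
dialogue = "A: The boy wants to go."
amr = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))"
plain_prompt = build_prompt(dialogue)
augmented_prompt = build_prompt(dialogue, amr=amr)

# Score a hypothetical model summary against a hypothetical reference.
score = cosine_similarity(
    bag_of_words("the boy wants to leave"),
    bag_of_words("the boy wants to go"),
)
```

Comparing the scores obtained from plain_prompt and augmented_prompt across a dataset is the kind of contrast the abstract reports, for instance the change from 66% to 76% for Llama 3.1 on SAMSum.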