LG · AI · SI · ML · Feb 8, 2024

Let Your Graph Do the Talking: Encoding Structured Data for LLMs

arXiv:2402.05862v1 · 131 citations · h-index: 36
Originality: Incremental advance
AI Analysis

This work addresses a general encoding challenge for structured data in LLMs, enabling better performance on various reasoning tasks, though it is incremental as it builds on existing graph representation methods.

The paper tackles the problem of encoding structured data for large language models (LLMs) by introducing GraphToken, a parameter-efficient method that explicitly represents graph structure in prompts, yielding improvements of up to 73 percentage points on graph reasoning tasks in the GraphQA benchmark.

How can we best encode structured data into sequential form for use in large language models (LLMs)? In this work, we introduce a parameter-efficient method to explicitly represent structured data for LLMs. Our method, GraphToken, learns an encoding function to extend prompts with explicit structured information. Unlike other work that focuses on limited domains (e.g., knowledge graph representation), our work is the first effort focused on the general encoding of structured data to be used for various reasoning tasks. We show that explicitly representing the graph structure allows significant improvements on graph reasoning tasks. Specifically, we see across-the-board improvements (up to 73 percentage points) on node-, edge-, and graph-level tasks from the GraphQA benchmark.
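The idea of "learning an encoding function to extend prompts" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`mean_aggregate`, `graph_tokens`, `extend_prompt`), the degree-based node features, the single round of mean message passing, and the random, untrained projection `weight`. In a real system the encoder parameters would be trained and the prompt embeddings would come from the LLM's own embedding table; here they are stand-ins so the data flow is visible.

```python
import random


def mean_aggregate(features, edges):
    """One round of mean message passing over an undirected graph (self included)."""
    n = len(features)
    neighbors = {i: [i] for i in range(n)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in neighbors[i]) / len(neighbors[i]) for k in range(dim)]
        for i in range(n)
    ]


def linear(vec, weight):
    """Multiply a vector of length in_dim by an in_dim x out_dim weight matrix."""
    return [sum(x * w for x, w in zip(vec, col)) for col in zip(*weight)]


def graph_tokens(num_nodes, edges, weight, num_tokens, embed_dim):
    """Hypothetical encoder: graph -> num_tokens vectors of size embed_dim."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Assumed node features: a constant and the node degree.
    features = [[1.0, float(d)] for d in degree]
    hidden = mean_aggregate(features, edges)
    # Sum-pool node states into a single graph-level vector.
    pooled = [sum(h[k] for h in hidden) for k in range(2)]
    flat = linear(pooled, weight)  # length num_tokens * embed_dim
    return [flat[t * embed_dim:(t + 1) * embed_dim] for t in range(num_tokens)]


def extend_prompt(graph_token_vectors, prompt_embeddings):
    """Prepend the graph tokens to the prompt's token embeddings."""
    return graph_token_vectors + prompt_embeddings


rng = random.Random(0)
embed_dim, num_tokens = 4, 2
# Untrained stand-in for the learned projection (2 x num_tokens*embed_dim).
weight = [[rng.uniform(-1, 1) for _ in range(num_tokens * embed_dim)] for _ in range(2)]

triangle = [(0, 1), (1, 2), (2, 0)]
# Stand-in for the LLM's embeddings of a 3-token question.
prompt_embeddings = [[0.0] * embed_dim for _ in range(3)]

tokens = graph_tokens(3, triangle, weight, num_tokens, embed_dim)
llm_input = extend_prompt(tokens, prompt_embeddings)
print(len(llm_input), len(llm_input[0]))  # prints: 5 4
```

The sequence passed to the model grows by only `num_tokens` positions regardless of graph size, and only the encoder's weights would need training, which is one way to read "parameter-efficient" in the abstract.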

Code Implementations: 2 repos