FACTS: Table Summarization via Offline Template Generation with Agentic Workflows
FACTS addresses efficiency, accuracy, and privacy issues in table summarization for users who need insights from tabular data, though the contribution is incremental: it builds on existing agentic and template-based approaches.
The paper tackles query-focused table summarization by introducing FACTS, an agentic workflow that generates offline templates (SQL queries and Jinja2 templates) for reusable, fast, and privacy-compliant summaries. The authors report that it consistently outperforms baseline methods on widely used benchmarks.
Query-focused table summarization requires generating natural language summaries of tabular data conditioned on a user query, enabling users to access insights beyond fact retrieval. Existing approaches face key limitations: table-to-text models require costly fine-tuning and struggle with complex reasoning; prompt-based LLM methods suffer from token-limit and efficiency issues while exposing sensitive data; and prior agentic pipelines often rely on decomposition, planning, or manual templates that lack robustness and scalability. To mitigate these issues, we introduce FACTS, an agentic workflow for Fast, Accurate, and Privacy-Compliant Table Summarization via Offline Template Generation. FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs through executable SQL queries, and privacy compliance by sending only table schemas to LLMs. Evaluations on widely used benchmarks show that FACTS consistently outperforms baseline methods, establishing it as a practical solution for real-world query-focused table summarization.
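The idea of an offline template can be illustrated with a minimal sketch. This is not the authors' implementation: the `sales` schema, the query, the wording of the summary, and the helper names `make_table` and `summarize` are all hypothetical, and Python's standard-library `string.Template` stands in for Jinja2 so the example has no third-party dependencies. The point is only that a template written once against a schema can be rendered locally against any table sharing that schema, without the rows ever reaching an LLM.

```python
import sqlite3
from string import Template

# Hypothetical offline template: in the described workflow this would be
# produced once by an LLM that sees only the schema (sales: region, amount).
# string.Template is a stdlib stand-in for a Jinja2 template here.
OFFLINE_TEMPLATE = {
    "sql": (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY total DESC LIMIT 1"
    ),
    "text": Template("The top region is $region with total sales of $total."),
}


def make_table(rows):
    """Load rows into an in-memory table with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def summarize(conn, template):
    """Run the stored SQL locally and fill the text template with the result."""
    conn.row_factory = sqlite3.Row  # lets us read columns by name
    row = conn.execute(template["sql"]).fetchone()
    return template["text"].substitute(dict(row))


# Two different tables with the same schema reuse the same template.
table_q1 = make_table([("North", 100), ("South", 250), ("North", 200)])
table_q2 = make_table([("East", 50), ("West", 75)])

summary_q1 = summarize(table_q1, OFFLINE_TEMPLATE)
summary_q2 = summarize(table_q2, OFFLINE_TEMPLATE)
print(summary_q1)
print(summary_q2)
```

Because the figures in the summary come from executing the SQL rather than from text generation, the rendered numbers match the table contents by construction, which is the sense in which the abstract ties accuracy to executable queries.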