ESGBench: A Benchmark for Explainable ESG Question Answering in Corporate Sustainability Reports
This work addresses the need for transparent and accountable AI in ESG analysis and gives researchers and practitioners a tool for evaluating it. The contribution is incremental, since it builds on existing benchmarking methods.
The authors address the problem of evaluating explainable ESG question answering systems by introducing ESGBench, a benchmark dataset and evaluation framework built on corporate sustainability reports. They then analyze state-of-the-art LLMs on the benchmark and highlight challenges in factual consistency, traceability, and domain alignment.
We present ESGBench, a benchmark dataset and evaluation framework designed to assess explainable Environmental, Social, and Governance (ESG) question answering systems using corporate sustainability reports. The benchmark consists of domain-grounded questions across multiple ESG themes, each paired with a human-curated answer and supporting evidence to enable fine-grained evaluation of model reasoning. We analyze the performance of state-of-the-art large language models (LLMs) on ESGBench, highlighting key challenges in factual consistency, traceability, and domain alignment. ESGBench aims to accelerate research on transparent and accountable ESG-focused AI systems.
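
To make the question/answer/evidence pairing concrete, the following is a minimal sketch of how one benchmark record and two simple checks might look. It is not ESGBench's actual schema or scoring method: the class `ESGExample`, its field names, the functions `answer_match` and `evidence_recall`, and the sample record about "ExampleCorp" are all hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ESGExample:
    """One hypothetical benchmark record (field names are assumptions)."""
    question: str
    theme: str                 # e.g. "Environmental", "Social", "Governance"
    answer: str                # human-curated reference answer
    evidence: List[str] = field(default_factory=list)  # supporting passages


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def answer_match(example: ESGExample, predicted_answer: str) -> bool:
    """Assumed metric: normalized exact match against the reference answer."""
    return normalize(example.answer) == normalize(predicted_answer)


def evidence_recall(example: ESGExample, cited: List[str]) -> float:
    """Assumed metric: fraction of reference passages the model also cited."""
    gold = {normalize(p) for p in example.evidence}
    if not gold:
        return 0.0
    found = {normalize(p) for p in cited}
    return len(gold & found) / len(gold)


# Invented sample record; the company and figures are placeholders, not real data.
sample = ESGExample(
    question="What were ExampleCorp's Scope 1 emissions in the reporting year?",
    theme="Environmental",
    answer="1,200 tCO2e",
    evidence=["Scope 1 emissions totalled 1,200 tCO2e."],
)

print(answer_match(sample, "1,200 tCO2e"))
print(evidence_recall(sample, ["Scope 1 emissions totalled 1,200 tCO2e."]))
```

Scoring the answer and the cited evidence separately is one way to distinguish a model that is correct and traceable from one that is correct without support. A real evaluation would likely need softer matching than exact string comparison.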