CL · Mar 29

ESGLens: An LLM-Based RAG Framework for Interactive ESG Report Analysis and Score Prediction

arXiv:2604.19779 · 43.4

Predicted impact: top 98% in CL · last 90 days · Originality: Synthesis-oriented

AI Analysis

The work addresses ESG report analysis for investors and analysts, but the contribution is incremental: it applies existing RAG and LLM methods to a specific domain and reports modest improvements.

The paper targets ESG report analysis, which is costly and inconsistent when done manually. It presents ESGLens, a framework that automates information extraction, question-answering, and score prediction, and reports a Pearson correlation of 0.48 between predicted ESG scores and reference scores.

Environmental, Social, and Governance (ESG) reports are central to investment decision-making, yet their length, heterogeneous content, and lack of standardized structure make manual analysis costly and inconsistent. We present ESGLens, a proof-of-concept framework combining retrieval-augmented generation (RAG) with prompt-engineered extraction to automate three tasks: (1) structured information extraction guided by Global Reporting Initiative (GRI) standards, (2) interactive question-answering with source traceability, and (3) ESG score prediction via regression on LLM-generated embeddings. ESGLens is purpose-built for the domain: a report-processing module segments heterogeneous PDF content into typed chunks (text, tables, charts); a GRI-guided extraction module retrieves and synthesizes information aligned with specific standards; and a scoring module embeds extracted summaries and feeds them to a regression model trained against London Stock Exchange Group (LSEG) reference scores. We evaluate the framework on approximately 300 reports from companies in the QQQ, S&P 500, and Russell 1000 indices (fiscal year 2022). Among three embedding methods (ChatGPT, BERT, RoBERTa) and two regressors (Neural Network, LightGBM), ChatGPT embeddings with a Neural Network achieve a Pearson correlation of 0.48 ($R^{2} \approx 0.23$) against LSEG ground-truth scores, a modest but statistically meaningful signal given the roughly 300-report training set and restriction to the environmental pillar. A traceability audit shows that 8 of 10 extracted claims verify against the source document, with two failures attributable to few-shot example leakage. We discuss limitations including dataset size and restriction to environmental indicators, and release the code to support reproducibility.
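The two headline numbers in the abstract are linked: for a least-squares line with an intercept, the coefficient of determination equals the squared Pearson correlation, so 0.48 squared gives about 0.23. The sketch below illustrates that relationship, assuming the reported $R^{2}$ is the squared correlation; the abstract does not say how it was computed. The function names and the toy scores are hypothetical and are not data or code from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def r2_of_linear_fit(xs, ys):
    """R^2 of the least-squares line y = a + b*x fitted to (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or ss_tot == 0:
        raise ValueError("fit is undefined for a constant sequence")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot


# Hypothetical toy scores for illustration only (not taken from the paper).
predicted = [52.0, 61.0, 47.0, 70.0, 58.0, 66.0]
reference = [55.0, 50.0, 60.0, 68.0, 45.0, 72.0]

r = pearson(predicted, reference)
r2 = r2_of_linear_fit(predicted, reference)

# The squared correlation and the R^2 of the fitted line agree up to rounding.
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, R^2 of linear fit = {r2:.3f}")

# The abstract's figures under the same assumption: 0.48 squared.
reported_r = 0.48
print(f"reported r = {reported_r}, implied R^2 = {reported_r ** 2:.4f}")
```

Read this way, a correlation of 0.48 means the predicted scores account for a little under a quarter of the variance in the reference scores, which matches the abstract's description of the signal as modest.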
