GRAB: A Risk Taxonomy-Grounded Benchmark for Unsupervised Topic Discovery in Financial Disclosures
GRAB gives finance researchers and practitioners a standardized benchmark for evaluating unsupervised topic discovery methods on risk disclosures, addressing a gap relevant to oversight and investment analysis.
No public benchmark exists for evaluating unsupervised topic models on financial risk disclosures. The authors address this with GRAB, a finance-specific benchmark of 1.61M sentences from 8,247 filings that uses a risk taxonomy of 193 terms to produce labels without manual annotation, enabling standardized comparison across models.
Risk categorization in 10-K risk disclosures matters for oversight and investment, yet no public benchmark evaluates unsupervised topic models for this task. We present GRAB, a finance-specific benchmark with 1.61M sentences from 8,247 filings and span-grounded sentence labels produced without manual annotation by combining FinBERT token attention, YAKE keyphrase signals, and taxonomy-aware collocation matching. Labels are anchored in a risk taxonomy mapping 193 terms to 21 fine-grained types nested under five macro classes; the 21 types guide weak supervision, while evaluation is reported at the macro level. GRAB unifies evaluation with fixed dataset splits and robust metrics: Accuracy, Macro-F1, Topic BERTScore, and the entropy-based Effective Number of Topics. The dataset, labels, and code enable reproducible, standardized comparison across classical, embedding-based, neural, and hybrid topic models on financial disclosures.
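The abstract does not spell out how the entropy-based Effective Number of Topics is computed. A minimal sketch, assuming the common definition as the exponential of the Shannon entropy of the predicted topic distribution, is given below. The function name `effective_number_of_topics` and the placeholder labels are hypothetical and are not taken from the GRAB code.

```python
import math
from collections import Counter


def effective_number_of_topics(assignments):
    """Return exp(H), where H is the Shannon entropy (in nats) of the
    empirical distribution of topic assignments.

    Assumption: this is the usual perplexity-style definition; the
    benchmark's exact formula may differ.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("assignments must not be empty")
    # Only observed topics appear in `counts`, so every probability is > 0.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)


# Hypothetical placeholder labels standing in for the five macro classes.
balanced = ["A", "B", "C", "D", "E"] * 20
collapsed = ["A"] * 100

balanced_score = effective_number_of_topics(balanced)
collapsed_score = effective_number_of_topics(collapsed)
```

Under this definition, a model that spreads sentences evenly over the five macro classes scores close to 5, while a model that assigns every sentence to one class scores 1, so the metric flags topic collapse that accuracy alone could hide.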