Won: Establishing Best Practices for Korean Financial NLP
This work addresses the need for standardized evaluation and resources for Korean financial NLP. Its contribution is incremental, however, since it applies existing methods to a new domain and language.
The authors established the first open leaderboard for evaluating Korean large language models in finance, collecting 1,119 submissions over about eight weeks on a closed benchmark covering multiple financial tasks. They also released an open instruction dataset of 80k instances and a fully open LLM called Won, built using best practices derived from the evaluations.
In this work, we present the first open leaderboard for evaluating Korean large language models focused on finance. Operated for about eight weeks, the leaderboard evaluated 1,119 submissions on a closed benchmark covering five MCQA categories (finance and accounting, stock price prediction, domestic company analysis, financial markets, and financial agent tasks) and one open-ended QA task. Building on insights from these evaluations, we release an open instruction dataset of 80k instances and summarize widely used training strategies observed among top-performing models. Finally, we introduce Won, a fully open and transparent LLM built using these best practices. We hope our contributions help advance the development of better and safer financial LLMs for Korean and other languages.