Eka-Eval: A Comprehensive Evaluation Framework for Large Language Models in Indian Languages
This work addresses the problem of English-centric LLM evaluation for linguistically diverse regions like India, though the contribution is incremental because it builds on existing benchmarks and tools.
The authors address the lack of evaluation frameworks for large language models in Indian languages by introducing EKA-EVAL, a comprehensive suite that integrates over 35 benchmarks. In a comparison against existing baselines, it received the highest participant ratings in four out of five categories.
The rapid advancement of Large Language Models (LLMs) has intensified the need for evaluation frameworks that address the requirements of linguistically diverse regions, such as India, and go beyond English-centric benchmarks. We introduce EKA-EVAL, a unified evaluation framework that integrates over 35 benchmarks (including 10 Indic benchmarks) across nine major evaluation categories. The framework provides broader coverage than existing Indian language evaluation tools, offering 11 core capabilities through a modular architecture, seamless integration with Hugging Face and proprietary models, and plug-and-play usability. As the first end-to-end suite for scalable, multilingual LLM benchmarking, the framework combines extensive benchmarks, modular workflows, and dedicated support for low-resource Indian languages to enable inclusive assessment of LLM capabilities across diverse domains. We conducted extensive comparisons against five existing baselines, in which EKA-EVAL achieved the highest participant ratings in four out of five categories. The framework is open-source and publicly available at: https://github.com/lingo-iitgn/eka-eval.