CL · Apr 1

A Japanese Benchmark for Evaluating Social Bias in Reasoning Based on Attribution Theory

arXiv:2604.00568 · 64.8 · h-index: 4
AI Analysis

This work addresses the need for culturally specific bias evaluation in LLMs for Japanese users, though it is incremental as it builds on prior datasets and focuses on a specific domain.

The authors evaluate social biases in Large Language Models (LLMs) within Japanese cultural contexts by constructing a new dataset, JUBAKU-v2, grounded in attribution theory. In their experiments, the dataset detected performance differences across models more sensitively than existing benchmarks.

In enhancing the fairness of Large Language Models (LLMs), evaluating social biases rooted in the cultural contexts of specific linguistic regions is essential. However, most existing Japanese benchmarks heavily rely on translating English data, which does not necessarily provide an evaluation suitable for Japanese culture. Furthermore, they only evaluate bias in the conclusion, failing to capture biases lurking in the reasoning. In this study, based on attribution theory in social psychology, we constructed a new dataset, "JUBAKU-v2," which evaluates the bias in attributing behaviors to in-groups and out-groups within reasoning while fixing the conclusion. This dataset consists of 216 examples reflecting cultural biases specific to Japan. Experimental results verified that it can detect performance differences across models more sensitively than existing benchmarks.

