CL · Feb 2

Sinhala Physical Common Sense Reasoning Dataset for Global PIQA

arXiv:2602.02207v1
h-index: 14
Originality: Synthesis-oriented
AI Analysis

The dataset fills a data gap for Sinhala speakers in Sri Lanka, but the contribution is incremental: it extends an existing benchmark to a new language.

The paper addresses the lack of a Sinhala physical common sense reasoning dataset by creating the first one for Global PIQA. The result is 110 human-created and verified samples, each with a prompt, a correct answer, and a wrong answer.

This paper presents the first-ever Sinhala physical common sense reasoning dataset created as part of Global PIQA. It contains 110 human-created and verified data samples, where each sample consists of a prompt, the corresponding correct answer, and a wrong answer. Most of the questions refer to the Sri Lankan context, where Sinhala is an official language.
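
The sample layout described above (a prompt, a correct answer, and a wrong answer) can be sketched as a small record type with a two-choice accuracy check. This is a minimal illustration, not the dataset's actual schema or the Global PIQA evaluation procedure: the class name `PiqaSample`, its field names, the `accuracy` helper, and the English placeholder strings are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PiqaSample:
    """One sample: a prompt plus one correct and one wrong answer.

    Field names are hypothetical; the released dataset may name them differently.
    """
    prompt: str
    correct: str
    wrong: str


def accuracy(
    samples: Sequence[PiqaSample],
    choose: Callable[[str, str, str], int],
) -> float:
    """Fraction of samples where choose(prompt, option_0, option_1) picks the correct option.

    The correct answer alternates between position 0 and position 1 so that
    a chooser that always picks the same position scores about 50%.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    hits = 0
    for i, sample in enumerate(samples):
        if i % 2 == 0:
            options, gold = (sample.correct, sample.wrong), 0
        else:
            options, gold = (sample.wrong, sample.correct), 1
        if choose(sample.prompt, options[0], options[1]) == gold:
            hits += 1
    return hits / len(samples)


# Placeholder samples written for this sketch; they are NOT taken from the dataset.
demo = [
    PiqaSample("placeholder prompt 1", "placeholder correct 1", "placeholder wrong 1"),
    PiqaSample("placeholder prompt 2", "placeholder correct 2", "placeholder wrong 2"),
]


def always_first(prompt: str, option_0: str, option_1: str) -> int:
    """Baseline chooser that ignores its inputs and picks position 0."""
    return 0


baseline_score = accuracy(demo, always_first)
```

In practice `choose` would wrap a language model that scores the two candidate answers; the position-alternating baseline is only there to show what chance-level behaviour looks like on a two-choice format.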
