Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
The dataset addresses a data gap for Sinhala speakers in Sri Lanka, but the contribution is incremental: it extends an existing benchmark to a new language.
The paper tackles the lack of a Sinhala physical common sense reasoning dataset by creating the first one for Global PIQA: 110 human-created and verified samples, each with a prompt, a correct answer, and a wrong answer.
This paper presents the first-ever Sinhala physical common sense reasoning dataset created as part of Global PIQA. It contains 110 human-created and verified data samples, where each sample consists of a prompt, the corresponding correct answer, and a wrong answer. Most of the questions refer to the Sri Lankan context, where Sinhala is an official language.
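The sample structure described above (a prompt, a correct answer, and a wrong answer) can be illustrated with a minimal Python sketch. This is not the dataset's actual schema or the authors' evaluation code: the class name `Sample`, its field names, the helpers `to_two_choice` and `accuracy`, and the `choose` callback are all hypothetical, and the two-choice scoring shown here is only one plausible way such samples could be used.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # Hypothetical field names; the released dataset may use different ones.
    prompt: str
    correct: str
    wrong: str


def to_two_choice(sample, rng):
    """Return (prompt, options, label), where options[label] is the correct answer.

    The position of the correct answer is randomized so that a model cannot
    score well by always picking the same index.
    """
    label = rng.randrange(2)
    options = [sample.wrong, sample.wrong]
    options[label] = sample.correct
    return sample.prompt, options, label


def accuracy(samples, choose, seed=0):
    """Fraction of samples for which choose(prompt, options) returns the correct index.

    `choose` stands in for a model; it is an assumption of this sketch.
    """
    rng = random.Random(seed)
    hits = 0
    for sample in samples:
        prompt, options, label = to_two_choice(sample, rng)
        if choose(prompt, options) == label:
            hits += 1
    return hits / len(samples)
```

With this layout, a model wrapper only needs to return 0 or 1 for each prompt, and a chooser that guesses at random would be expected to score around 0.5 on average.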