CY | CL | IR | LG | Mar 3, 2024

SyllabusQA: A Course Logistics Question Answering Dataset

arXiv:2403.14666v2 | 29 citations | h-index: 10 | Has Code | ACL
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of repetitive course logistics questions, which burden both instructors and students. Its contribution is incremental, however, because it focuses on dataset creation and benchmarking rather than on a new method.

The authors tackled the lack of public datasets for course logistics question answering by introducing SyllabusQA, a dataset with 5,078 question-answer pairs from 63 real syllabi, and found that automated methods still lag behind humans in fact precision despite performing well on textual similarity metrics.

Automated teaching assistants and chatbots have significant potential to reduce the workload of human instructors, especially for logistics-related question answering, which is important to students yet repetitive for instructors. However, due to privacy concerns, there is a lack of publicly available datasets. We introduce SyllabusQA, an open-source dataset with 63 real course syllabi covering 36 majors, containing 5,078 open-ended course logistics-related question-answer pairs that are diverse in both question types and answer formats. Since many logistics-related questions contain critical information like the date of an exam, it is important to evaluate the factuality of answers. We benchmark several strong baselines on this task, from large language model prompting to retrieval-augmented generation. We introduce Fact-QA, an LLM-based (GPT-4) evaluation metric to evaluate the factuality of predicted answers. We find that despite performing close to humans on traditional metrics of textual similarity, there remains a significant gap between automated approaches and humans in terms of fact precision.
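The abstract does not spell out how fact precision is computed, so the following is only a minimal sketch under one common reading: the share of facts in a predicted answer that the reference answer supports. The names `fact_precision`, `is_supported`, and `substring_judge` are hypothetical, the example strings are made up, and the simple substring check merely stands in for the GPT-4 judgment that Fact-QA relies on.

```python
from typing import Callable, List


def fact_precision(
    predicted_facts: List[str],
    reference_answer: str,
    is_supported: Callable[[str, str], bool],
) -> float:
    """Fraction of predicted facts that the judge marks as supported.

    Assumption: the predicted answer has already been split into atomic
    facts. Returns 0.0 when there are no predicted facts.
    """
    if not predicted_facts:
        return 0.0
    supported = sum(
        1 for fact in predicted_facts if is_supported(fact, reference_answer)
    )
    return supported / len(predicted_facts)


def substring_judge(fact: str, reference: str) -> bool:
    """Toy stand-in for an LLM judge: case-insensitive substring match."""
    return fact.lower() in reference.lower()


# Hypothetical example, not taken from the dataset.
reference = "The midterm is on October 12 in Hall B."
predicted = ["the midterm is on october 12", "the midterm is in room 101"]
score = fact_precision(predicted, reference, substring_judge)
```

A metric of this shape penalizes an answer that reads fluently but states a wrong room or date, which surface-level textual similarity scores can miss.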

Code Implementations: 1 repo
