Interpreting Answers to Yes-No Questions in User-Generated Content
This work addresses a specific problem in natural language processing for social media analysis, but its contribution is incremental: it focuses on dataset creation and benchmarking and does not propose a new solution.
The paper tackles the challenge of interpreting answers to yes-no questions in social media, where yes and no keywords are rare and often misleading. It contributes a corpus of 4,442 yes-no question-answer pairs from Twitter, analyzes the linguistic patterns of the answers, and shows that large language models remain far from solving the task, even after fine-tuning.
Interpreting answers to yes-no questions in social media is difficult. Yes and no keywords are uncommon, and the few answers that include them are rarely to be interpreted as the keywords suggest. In this paper, we present a new corpus of 4,442 yes-no question-answer pairs from Twitter. We discuss linguistic characteristics of answers whose interpretation is yes or no, as well as answers whose interpretation is unknown. We show that large language models are far from solving this problem, even after fine-tuning and blending in other corpora that target the same problem outside social media.
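
To make the keyword problem concrete, the following is a minimal sketch of a naive keyword baseline, not the method used in the paper. The keyword lists, the function name keyword_baseline, and both question-answer pairs are assumptions invented for illustration; none of them come from the corpus.

```python
import re

# Hypothetical keyword lists (assumption: the source does not specify any).
YES_WORDS = {"yes", "yeah", "yep", "sure"}
NO_WORDS = {"no", "nope", "nah", "never"}


def keyword_baseline(answer: str) -> str:
    """Label an answer 'yes', 'no', or 'unknown' from surface keywords only."""
    tokens = set(re.findall(r"[a-z']+", answer.lower()))
    has_yes = bool(tokens & YES_WORDS)
    has_no = bool(tokens & NO_WORDS)
    if has_yes and not has_no:
        return "yes"
    if has_no and not has_yes:
        return "no"
    # No keyword, or conflicting keywords: the baseline cannot decide.
    return "unknown"


# Invented examples (assumption: not taken from the corpus).
examples = [
    # Contains a yes keyword, but a human reader would likely interpret it as no.
    ("Is the new update worth installing?",
     "Yeah, if you enjoy your phone crashing daily."),
    # Contains no keyword, but a human reader would likely interpret it as yes.
    ("Do you like pineapple on pizza?",
     "I order it every single week."),
]

predictions = [keyword_baseline(answer) for _, answer in examples]
for (question, answer), label in zip(examples, predictions):
    print(f"{label:8s} <- {answer}")
```

In both invented cases the keyword label disagrees with the reading a person would likely give, which is the kind of mismatch the abstract describes: keywords are either absent or point the wrong way.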