CL AI Jun 21, 2019

Identification of Tasks, Datasets, Evaluation Metrics, and Numeric Scores for Scientific Leaderboards Construction

arXiv:1906.09317v11115 citations
Originality: Incremental advance
AI Analysis

This work addresses the NLP community's difficulty in keeping track of an abundance of research results. It is rated incremental because it builds on existing extraction methods.

The paper tackles the problem of tracking research progress across many tasks and datasets. It develops a framework (TDMS-IE) that automatically extracts tasks, datasets, metrics, and scores from NLP papers for leaderboard construction, and reports that the model outperforms several baselines by a large margin.

While the fast-paced inception of novel tasks and new datasets helps foster active research in a community towards interesting directions, keeping track of the abundance of research activity in different areas on different datasets is likely to become increasingly difficult. The community could greatly benefit from an automatic system able to summarize scientific results, e.g., in the form of a leaderboard. In this paper we build two datasets and develop a framework (TDMS-IE) aimed at automatically extracting task, dataset, metric and score from NLP papers, towards the automatic construction of leaderboards. Experiments show that our model outperforms several baselines by a large margin. Our model is a first step towards automatic leaderboard construction, e.g., in the NLP domain.
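To make the target output concrete, here is a minimal sketch of how extracted (task, dataset, metric, score) tuples could be grouped into leaderboards. It is not the TDMS-IE model and does not perform any extraction. The names `TDMS` and `build_leaderboards`, the `higher_is_better` flag, and all records are hypothetical placeholders, not results from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class TDMS(NamedTuple):
    """One extracted result: task, dataset, metric, score, plus the source paper."""
    paper: str
    task: str
    dataset: str
    metric: str
    score: float


def build_leaderboards(records, higher_is_better=True):
    """Group records by (task, dataset, metric) and rank papers by score.

    Assumption: one ordering direction applies to every metric; a real
    system would need a per-metric direction (e.g. error rates rank ascending).
    """
    boards = defaultdict(list)
    for r in records:
        boards[(r.task, r.dataset, r.metric)].append((r.paper, r.score))
    return {
        key: sorted(rows, key=lambda row: row[1], reverse=higher_is_better)
        for key, rows in boards.items()
    }


# Placeholder records for illustration only (not real results).
records = [
    TDMS("paper-A", "task-1", "dataset-X", "accuracy", 71.0),
    TDMS("paper-B", "task-1", "dataset-X", "accuracy", 74.5),
    TDMS("paper-C", "task-1", "dataset-Y", "accuracy", 60.2),
]

boards = build_leaderboards(records)
for (task, dataset, metric), rows in boards.items():
    print(f"{task} / {dataset} / {metric}")
    for rank, (paper, score) in enumerate(rows, start=1):
        print(f"  {rank}. {paper}: {score}")
```

Each leaderboard is keyed by the (task, dataset, metric) triple, so scores reported on different datasets or with different metrics are never ranked against each other.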

Code Implementations: 1 repo
