Soeren Auer

CL
h-index8
4papers
16citations
Novelty31%
AI Score29

4 Papers

CLSep 26, 2025
NFDI4DS Shared Tasks for Scholarly Document Processing

Raia Abu Ahmad, Rana Abdulla, Tilahun Abedissa Taffa et al.

Shared tasks are powerful tools for advancing research through community-based standardised evaluation. As such, they play a key role in promoting findable, accessible, interoperable, and reusable (FAIR), as well as transparent and reproducible research practices. This paper presents an updated overview of twelve shared tasks developed and hosted under the German National Research Data Infrastructure for Data Science and Artificial Intelligence (NFDI4DS) consortium, covering a diverse set of challenges in scholarly document processing. Hosted at leading venues, the tasks foster methodological innovations and contribute open-access datasets, models, and tools for the broader research community, which are integrated into the consortium's research data infrastructure.

CLOct 18, 2021
Ranking Facts for Explaining Answers to Elementary Science Questions

Jennifer D'Souza, Isaiah Onando Mulang', Soeren Auer

In multiple-choice exams, students select one answer from among typically four choices and can explain why they made that particular choice. Students are good at understanding natural language questions and based on their domain knowledge can easily infer the question's answer by 'connecting the dots' across various pertinent facts. Considering automated reasoning for elementary science question answering, we address the novel task of generating explanations for answers from human-authored facts. For this, we examine the practically scalable framework of feature-rich support vector machines leveraging domain-targeted, hand-crafted features. Explanations are created from a human-annotated set of nearly 5,000 candidate facts in the WorldTree corpus. Our aim is to obtain better matches for valid facts of an explanation for the correct answer of a question over the available fact candidates. To this end, our features offer a comprehensive linguistic and semantic unification paradigm. The machine learning problem is the preference ordering of facts, for which we test pointwise regression versus pairwise learning-to-rank. Our contributions are: (1) a case study in which two preference ordering approaches are systematically compared; (2) it is a practically competent approach that can outperform some variants of BERT-based reranking models; and (3) the human-engineered features make it an interpretable machine learning model for the task.

IRSep 1, 2021
Pattern-based Acquisition of Scientific Entities from Scholarly Article Titles

Jennifer D'Souza, Soeren Auer

We describe a rule-based approach for the automatic acquisition of salient scientific entities from Computational Linguistics (CL) scholarly article titles. Two observations motivated the approach: (i) noting salient aspects of an article's contribution in its title; and (ii) pattern regularities capturing the salient terms that could be expressed in a set of rules. Only those lexico-syntactic patterns were selected that were easily recognizable, occurred frequently, and positionally indicated a scientific entity type. The rules were developed on a collection of 50,237 CL titles covering all articles in the ACL Anthology. In total, 19,799 research problems, 18,111 solutions, 20,033 resources, 1,059 languages, 6,878 tools, and 21,687 methods were extracted at an average precision of 75%.

CRSep 8, 2016
Disintermediation of Inter-Blockchain Transactions

S. Matthew English, Fabrizio Orlandi, Soeren Auer

Different versions of peer-to-peer electronic cash exist as data represented by separate blockchains. Payments between such systems cannot be sent directly from one party to another without going through a financial institution. Bitcoin provided part of the solution but its utility is limited to intra-blockchain transactions. The benefits are lost if a trusted third party is required to execute inter-blockchain transactions. We propose a solution to the inter-blockchain transaction problem using the same fundamental principles of Bitcoin. The protocol is described by the Uberledger framework, a hierarchical meta-blockchain layer that encapsulates information regarding the fidelity of peer-to-peer transaction facilitators.