Jurica Ševa

h-index9

3papers

1,039citations

Novelty15%

AI Score21

Ranked #181,301 of 194,257 authors (top 93%)#29,627 in CL (top 96%)

3 Papers

0.5CLOct 13, 2020Code

Annotationsaurus: A Searchable Directory of Annotation Tools

Mariana Neves, Jurica Seva

Manual annotation of textual documents is a necessary task when constructing benchmark corpora for training and evaluating machine learning algorithms. We created a comprehensive directory of annotation tools that currently includes 93 tools. We analyzed the tools over a set of 31 features and implemented simple scripts and a Web application that filters the tools based on chosen criteria. We present two use cases using the directory and propose ideas for its maintenance. The directory, source codes for scripts, and link to the Web application are available at: https://github.com/mariananeves/annotation-tools

8.6DLApr 25, 2020

QURATOR: Innovative Technologies for Content and Data Curation

Georg Rehm, Peter Bourgonje, Stefanie Hegele et al.

In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing. The availability of vast amounts of content and the pressure to publish new content quickly and in rapid succession requires faster, more efficient and smarter processing and generation methods. With a consortium of ten partners from research and industry and a broad range of expertise in AI, Machine Learning and Language Technologies, the QURATOR project, funded by the German Federal Ministry of Education and Research, develops a sustainable and innovative technology platform that provides services to support knowledge workers in various industries to address the challenges they face when curating digital content. The project's vision and ambition is to establish an ecosystem for content curation technologies that significantly pushes the current state of the art and transforms its region, the metropolitan area Berlin-Brandenburg, into a global centre of excellence for curation technologies.

31.1CLMar 29, 2020

Named Entities in Medical Case Reports: Corpus and Experiments

Sarah Schulz, Jurica Ševa, Samuel Rodriguez et al.

We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central's open access library. In the case reports, we annotate cases, conditions, findings, factors and negation modifiers. Moreover, where applicable, we annotate relations between these entities. As such, this is the first corpus of this kind made available to the scientific community in English. It enables the initial investigation of automatic information extraction from case reports through tasks like Named Entity Recognition, Relation Extraction and (sentence/paragraph) relevance detection. Additionally, we present four strong baseline systems for the detection of medical entities made available through the annotated dataset.