HC · IR · Apr 24, 2020

TeamTat: a collaborative text annotation tool

arXiv:2004.11894v1 · 71 citations
AI Analysis

This tool addresses the management of collaborative annotation projects for domain experts in biomedical text mining. The contribution is incremental: it builds on existing annotation tools by adding team and project management features.

The authors address the limited and inefficient support for team-based annotation of biomedical literature with TeamTat, a web-based tool for multi-user annotation. It adds project management, figure display, and quality assessment, supports the entire annotation life cycle, and outputs BioC.

Manually annotated data is key to developing text-mining and information-extraction algorithms. However, human annotation requires considerable time, effort and expertise. Given the rapid growth of biomedical literature, it is paramount to build tools that speed up annotation while maintaining expert quality. While existing text annotation tools may provide user-friendly interfaces to domain experts, limited support is available for image display, project management, and multi-user team annotation. In response, we developed TeamTat (teamtat.org), a web-based annotation tool (local setup available), equipped to manage team annotation projects engagingly and efficiently. TeamTat is a novel tool for managing multi-user, multi-label document annotation, reflecting the entire production life cycle. Project managers can specify the annotation schema for entities and relations, select annotators, and distribute documents anonymously to prevent bias. Document input format can be plain text, PDF or BioC (uploaded locally or automatically retrieved from PubMed or PMC), and output format is BioC with inline annotations. TeamTat displays figures from the full text for the annotators' convenience. Multiple users can work on the same document independently in their workspaces, and the team manager can track task completion. TeamTat provides corpus-quality assessment via inter-annotator agreement statistics, and a user-friendly interface convenient for annotation review and inter-annotator disagreement resolution to improve corpus quality.
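The abstract does not say which inter-annotator agreement statistic TeamTat computes. As a rough illustration only, the sketch below scores two annotators on one document with exact-match pairwise F1 over (start, end, label) spans, a measure often used for entity annotation. The `Span` type, the function names, and the sample offsets are hypothetical and are not taken from TeamTat.

```python
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    """Hypothetical entity annotation: character offsets plus a label."""
    start: int
    end: int
    label: str


def pairwise_agreement(a: Iterable[Span], b: Iterable[Span]) -> float:
    """Exact-match F1 between two annotators' span sets.

    A span counts as agreed only if start, end, and label all match.
    Assumption: this is one common choice, not necessarily TeamTat's statistic.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # both annotators marked nothing, so they agree trivially
    matched = len(set_a & set_b)
    return 2 * matched / (len(set_a) + len(set_b))


def disagreements(a: Iterable[Span], b: Iterable[Span]) -> List[Span]:
    """Spans marked by exactly one annotator, sorted for review."""
    return sorted(set(a) ^ set(b))


# Made-up example: the annotators disagree on the label of the second span.
annotator_1 = [Span(0, 5, "Gene"), Span(10, 18, "Disease"), Span(25, 30, "Chemical")]
annotator_2 = [Span(0, 5, "Gene"), Span(10, 18, "Chemical"), Span(25, 30, "Chemical")]

score = pairwise_agreement(annotator_1, annotator_2)
to_review = disagreements(annotator_1, annotator_2)
```

The list returned by `disagreements` corresponds loosely to what a reviewer would need to resolve in the disagreement-resolution step the abstract describes. A chance-corrected statistic such as Cohen's kappa would need token-level labels instead of spans.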

Code Implementations: 1 repo