CL · Mar 3, 2020

Seshat: A tool for managing and verifying annotation campaigns of audio data

arXiv:2003.01472v2998 citations · Has Code
AI Analysis

This tool addresses inefficiencies in annotation campaigns for audio data and is aimed at researchers and practitioners in speech processing. Its contribution is incremental, since it builds on existing annotation management concepts.

The authors tackle the problem of managing and verifying annotations of speech corpora by introducing Seshat, an open-source tool that lets users customise annotation campaigns and check annotation content. In its double-annotation mode, Seshat automatically computes inter-annotator agreement using the γ measure.

We introduce Seshat, a new, simple and open-source software tool for efficiently managing annotations of speech corpora. Seshat allows users to easily customise and manage annotations of large audio corpora while ensuring compliance with the formatting and naming conventions of the annotated output files. In addition, it includes procedures for checking the content of annotations against specific rules that can be implemented in personalised parsers. Finally, we propose a double-annotation mode, for which Seshat automatically computes an associated inter-annotator agreement with the $γ$ measure, taking into account categorisation and segmentation discrepancies.
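
To make the idea of rule-based annotation checking concrete, the following is a minimal sketch in plain Python. It does not use Seshat's API: the `Interval` class, the `check_filename` and `check_intervals` helpers, the file-naming pattern, and the label set are all hypothetical and chosen only for illustration. It shows the two kinds of checks the abstract mentions, namely compliance with a naming convention and validation of annotation content against explicit rules.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical naming convention: <speaker>_s<session>.TextGrid,
# e.g. "spk01_s03.TextGrid". Not taken from Seshat.
FILENAME_RE = re.compile(r"[a-z]+\d+_s\d+\.TextGrid")

# Hypothetical set of labels allowed on one annotation tier.
ALLOWED_LABELS = {"speech", "noise", "silence"}


@dataclass
class Interval:
    start: float  # seconds
    end: float    # seconds
    label: str


def check_filename(name: str) -> List[str]:
    """Return a list of error messages for a file name (empty if valid)."""
    if FILENAME_RE.fullmatch(name):
        return []
    return [f"{name}: does not match the naming convention"]


def check_intervals(intervals: List[Interval]) -> List[str]:
    """Return a list of error messages for a tier (empty if valid)."""
    errors = []
    previous_end = 0.0
    for i, iv in enumerate(intervals):
        if iv.end <= iv.start:
            errors.append(f"interval {i}: end must be after start")
        if iv.start < previous_end:
            errors.append(f"interval {i}: overlaps an earlier interval")
        if iv.label not in ALLOWED_LABELS:
            errors.append(f"interval {i}: unknown label {iv.label!r}")
        previous_end = max(previous_end, iv.end)
    return errors
```

A checker of this shape returns every violation rather than stopping at the first one, so an annotator can fix a file in a single pass. Computing the $γ$ agreement itself is considerably more involved and is not attempted here.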

Code Implementations: 1 repo
