SD CL AS · Jun 9, 2020

audino: A Modern Annotation Tool for Audio and Speech

Manraj Singh Grover, Pakhi Bamdev, Ratin Kumar Brala, Yaman Kumar, Mika Hama, Rajiv Ratn Shah

arXiv:2006.05236v28.415 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This tool addresses the need for efficient audio annotation in research and industry. Its contribution is incremental, since it builds on concepts from existing annotation tools.

The authors introduce audino, a collaborative annotation tool for audio and speech. It supports temporal segmentation, labeling, and transcription, and provides an admin dashboard and a key-based API. It is released under the MIT license and can be used for tasks such as speech recognition and emotion recognition.

In this paper, we introduce audino, a collaborative and modern annotation tool for audio and speech. The tool allows annotators to define and describe temporal segments in audio files. These segments can be labelled and transcribed easily using a dynamically generated form. An admin can centrally control user roles and project assignment through the admin dashboard. The dashboard also enables describing labels and their values. The annotations can easily be exported in JSON format for further analysis. The tool allows audio data and their corresponding annotations to be uploaded and assigned to a user through a key-based API. The flexibility available in the annotation tool enables annotation for Speech Scoring, Voice Activity Detection (VAD), Speaker Diarisation, Speaker Identification, Speech Recognition, and Emotion Recognition tasks, among others. The MIT open source license allows it to be used for academic and commercial projects.
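
The abstract describes temporal segments that carry labels and a transcription and can be exported as JSON. A minimal sketch of how such data might be modelled and analysed follows. The field names (start_time, end_time, transcription, labels, segmentations, filename) and the helper functions are hypothetical and chosen only for illustration; they are not audino's actual export schema or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Segment:
    """One labelled time span within an audio file (hypothetical schema)."""
    start_time: float  # seconds from the start of the audio
    end_time: float    # seconds from the start of the audio
    transcription: str = ""
    labels: dict = field(default_factory=dict)  # label name -> list of chosen values

    def __post_init__(self):
        # Reject negative, empty, or reversed time spans.
        if not 0 <= self.start_time < self.end_time:
            raise ValueError("segment needs 0 <= start_time < end_time")


def export_annotations(filename, segments):
    """Serialize one file's segments to a JSON string, ordered by start time."""
    ordered = sorted(segments, key=lambda s: s.start_time)
    payload = {"filename": filename, "segmentations": [asdict(s) for s in ordered]}
    return json.dumps(payload, indent=2)


def total_labelled_duration(json_text, label, value):
    """Sum the duration (seconds) of segments whose `label` includes `value`."""
    data = json.loads(json_text)
    return sum(
        seg["end_time"] - seg["start_time"]
        for seg in data["segmentations"]
        if value in seg["labels"].get(label, [])
    )


if __name__ == "__main__":
    segments = [
        Segment(2.5, 4.0, "how are you", {"speaker": ["B"], "emotion": ["neutral"]}),
        Segment(0.0, 2.0, "hello there", {"speaker": ["A"], "emotion": ["happy"]}),
    ]
    exported = export_annotations("call_001.wav", segments)
    print(exported)
    print(total_labelled_duration(exported, "speaker", "A"))  # prints 2.0
```

Storing label values as lists is one way to let a single segment carry several values for the same label, which could suit tasks such as emotion recognition where annotations may overlap; a real export may be structured differently.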

