SD CL MA AS
May 2, 2024

Sequence-to-sequence models in peer-to-peer learning: A practical application

arXiv:2406.02565v1
h-index: 4

Originality: Synthesis-oriented

AI Analysis

This is an incremental study exploring decentralized ASR for peer-to-peer learning scenarios.

This paper addresses the problem of applying sequence-to-sequence models to automatic speech recognition in peer-to-peer learning environments. It finds that decentralized training with 55 agents yields word error rates of 87-92% on UserLibri and 52-56% on LJ Speech, compared to 84% and 38% in centralized settings.

This paper explores the applicability of sequence-to-sequence (Seq2Seq) models based on LSTM units to the Automatic Speech Recognition (ASR) task within peer-to-peer learning environments. Leveraging two distinct peer-to-peer learning methods, the study simulates the learning process of agents and evaluates their performance on the ASR task using two different ASR datasets. In a centralized training setting, using a scaled-down variant of the Deep Speech 2 model, a single model achieved a Word Error Rate (WER) of 84% when trained on the UserLibri dataset and 38% when trained on the LJ Speech dataset. In a peer-to-peer learning scenario involving 55 agents, by contrast, the WER ranged from 87% to 92% for the UserLibri dataset and from 52% to 56% for the LJ Speech dataset. The findings demonstrate the feasibility of employing Seq2Seq models in decentralized settings, albeit with higher WER than centralized training: 3 to 8 percentage points higher on UserLibri and 14 to 18 percentage points higher on LJ Speech.
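For readers unfamiliar with the metric, the following is a minimal sketch of how a Word Error Rate is commonly computed: the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and a model hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the summary does not state how the paper normalizes text or which tool it uses to score transcripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split, which is
    an assumption and not necessarily what the paper uses.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition, a reported WER of 84% corresponds to a return value of 0.84, meaning the number of word-level errors is 84% of the number of reference words. Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.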

