CL | SD | AS | Sep 7, 2023

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

arXiv:2309.04031v2 | 4 citations | h-index: 40
Originality: Incremental advance
AI Analysis

This work aims to improve ASR accuracy by leveraging diverse linguistic cues from LLMs. The contribution appears incremental, since it builds on existing knowledge-transfer techniques.

The authors incorporate linguistic knowledge into end-to-end automatic speech recognition (ASR) systems by transferring multiple representations from large language models (LLMs), and show that this can be an effective alternative to transferring only a single representation.

Transferring the knowledge of large language models (LLMs) is a promising technique for incorporating linguistic knowledge into end-to-end automatic speech recognition (ASR) systems. However, existing works transfer only a single representation of an LLM (e.g., the last layer of pretrained BERT), while the representation of a text is inherently non-unique and can be obtained in various ways from different layers, contexts, and models. In this work, we explore a wide range of techniques to obtain and transfer multiple representations of LLMs into a transducer-based ASR system. Although the approach is conceptually simple, we show that transferring multiple representations of LLMs can be an effective alternative to transferring only a single representation.
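To make the idea of transferring several representations concrete, here is a minimal sketch. It is an assumption for illustration, not the authors' method: the abstract does not state the training objective, so the sketch uses a plain mean-squared-error regression averaged over targets. The names `project`, `mse`, `multi_representation_loss`, and the per-target projection `heads` are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows are tokens, columns are feature dimensions


def project(features: Matrix, weight: Matrix) -> Matrix:
    """Linear projection: (tokens, d_in) times (d_in, d_out)."""
    columns = list(zip(*weight))
    return [[sum(f * w for f, w in zip(row, col)) for col in columns]
            for row in features]


def mse(predicted: Matrix, target: Matrix) -> float:
    """Mean squared error over all entries of two same-shaped matrices."""
    diffs = [(p - t) ** 2
             for p_row, t_row in zip(predicted, target)
             for p, t in zip(p_row, t_row)]
    return sum(diffs) / len(diffs)


def multi_representation_loss(student: Matrix,
                              teachers: List[Matrix],
                              heads: List[Matrix]) -> float:
    """Average the regression loss over several teacher representations.

    Assumption: each teacher representation (e.g. a different LLM layer)
    gets its own projection head on top of the shared student features.
    """
    losses = [mse(project(student, head), teacher)
              for teacher, head in zip(teachers, heads)]
    return sum(losses) / len(losses)


# Toy example: two tokens, two-dimensional student features, two targets.
student = [[1.0, 0.0], [0.0, 1.0]]
heads = [[[1.0, 0.0], [0.0, 1.0]],   # identity head
         [[2.0, 0.0], [0.0, 2.0]]]   # scaling head
teachers = [[[1.0, 0.0], [0.0, 1.0]],  # matched exactly by the first head
            [[0.0, 0.0], [0.0, 0.0]]]  # missed by the second head
loss = multi_representation_loss(student, teachers, heads)
```

With a single teacher and a single head, this reduces to the single-representation case that the abstract contrasts against.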

