CL HC SD AS | Jun 29, 2022

Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models

Daniel Bermuth, Alexander Poeppel, Wolfgang Reif

arXiv:2206.14589v10.88 citations | h-index: 35

Originality: Incremental advance

AI Analysis

The method offers a simple, efficient solution for SLU tasks that benefits developers and systems needing quick deployment. It appears incremental, however, because it builds on existing transducer and Speech-to-Text methods.

The paper addresses the extraction of intents and entities from audio commands in Spoken Language Understanding (SLU). It embeds them into Finite State Transducers and combines these with a pretrained Speech-to-Text model, which allows fast, language-independent model building without additional training. On the benchmarks compared, the method can outperform several more resource-intensive approaches.

In Spoken Language Understanding (SLU), the task is to extract important information from audio commands, such as the intent of what a user wants the system to do and special entities like locations or numbers. This paper presents a simple method for embedding intents and entities into Finite State Transducers which, in combination with a pretrained general-purpose Speech-to-Text model, allows building SLU models without any additional training. Building those models is very fast and takes only a few seconds. The method is also completely language independent. A comparison on different benchmarks shows that this method can outperform multiple other, more resource-demanding SLU approaches.
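To make the idea of embedding intents and entities into a transducer concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Fst` class, `compile_templates`, the template format, and the example intents are all hypothetical names chosen for illustration. It works on an already-transcribed string and requires an exact match, whereas the abstract describes combining the transducer with a pretrained Speech-to-Text model, a step not shown here.

```python
from collections import defaultdict
from itertools import product


class Fst:
    """Tiny deterministic transducer (illustrative, not the paper's code).

    Each arc maps (state, input word) to (next state, output pairs).
    Final states carry an intent label.
    """

    def __init__(self):
        self.arcs = defaultdict(dict)  # state -> {word: (next_state, outputs)}
        self.finals = {}               # final state -> intent label
        self.n_states = 1              # state 0 is the start state

    def add_path(self, steps, intent):
        """Add one sentence as a path; steps is a list of (word, outputs)."""
        state = 0
        for word, outputs in steps:
            if word not in self.arcs[state]:
                self.arcs[state][word] = (self.n_states, outputs)
                self.n_states += 1
            state = self.arcs[state][word][0]
        self.finals[state] = intent

    def transduce(self, text):
        """Return intent and entities, or None if the text is not accepted."""
        state, emitted = 0, []
        for word in text.lower().split():
            arc = self.arcs[state].get(word)
            if arc is None:
                return None
            state, outputs = arc
            emitted.extend(outputs)
        if state not in self.finals:
            return None
        return {"intent": self.finals[state], "entities": dict(emitted)}


def compile_templates(templates):
    """Expand templates into paths. A token is either a plain word or a
    (slot_name, [single-word values]) pair (assumed format)."""
    fst = Fst()
    for intent, pattern in templates:
        options = [
            [(tok, ())] if isinstance(tok, str)
            else [(value, ((tok[0], value),)) for value in tok[1]]
            for tok in pattern
        ]
        for steps in product(*options):
            fst.add_path(list(steps), intent)
    return fst


# Hypothetical example intents; building the transducer needs no training.
rooms = ("room", ["kitchen", "bedroom"])
slu = compile_templates([
    ("light_on", ["turn", "on", "the", "light", "in", "the", rooms]),
    ("light_off", ["turn", "off", "the", "light", "in", "the", rooms]),
])
result = slu.transduce("turn on the light in the kitchen")
```

Because the model is only a compiled set of sentence templates, rebuilding it after adding an intent or an entity value means re-running `compile_templates`, which is consistent with the abstract's point that building takes seconds and needs no training data. The sketch supports only single-word entity values and gives no tolerance to transcription errors.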
