SD | AI | CL | AS | Aug 22, 2025

H-PRM: A Pluggable Hotword Pre-Retrieval Module for Various Speech Recognition Systems

arXiv:2508.18295v1 | h-index: 4 | CIKM
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of accurately recognizing domain-specific terms in ASR systems, though the contribution appears incremental: it adds a new module to existing models.

The paper tackles the problem of hotword customization in automatic speech recognition (ASR), where recognition rates drop with large-scale hotwords, by introducing a pluggable hotword pre-retrieval module (H-PRM) that improves post-recall rates when integrated into traditional models and Audio LLMs.

Hotword customization is crucial in ASR for improving the recognition accuracy of domain-specific terms. Progress has been driven primarily by advances in traditional models and Audio large language models (LLMs). However, existing models often struggle with large-scale hotword lists: the recognition rate drops dramatically as the number of hotwords increases. In this paper, we introduce a novel hotword customization system that uses a hotword pre-retrieval module (H-PRM) to identify the most relevant hotword candidate by measuring the acoustic similarity between the hotwords and the speech segment. This plug-and-play solution can be easily integrated into traditional models such as SeACo-Paraformer, significantly improving the hotword post-recall rate (PRR). Additionally, we incorporate H-PRM into Audio LLMs through a prompt-based approach, enabling seamless hotword customization. Extensive testing shows that H-PRM can outperform existing methods, pointing to a new direction for hotword customization in ASR.
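The pre-retrieval idea can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that the speech segment and each hotword have already been mapped to fixed-length embedding vectors (the toy vectors below are made up), that acoustic similarity is measured with cosine similarity, and that the module keeps the `top_k` best matches (set `top_k=1` for a single candidate). The function names `retrieve_hotwords` and `build_prompt`, the prompt wording, and the example hotword list are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_hotwords(segment_emb, hotword_embs, top_k=2):
    """Return the top_k hotwords whose embeddings are most similar to the segment.

    Assumption: cosine similarity over precomputed embeddings stands in for
    whatever acoustic similarity measure the paper actually uses.
    """
    scored = [(cosine(segment_emb, emb), word) for word, emb in hotword_embs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:top_k]]


def build_prompt(candidates):
    """Build a hypothetical text prompt that passes the shortlist to an Audio LLM."""
    return "Transcribe the audio. Possible hotwords: " + ", ".join(candidates)


# Toy 3-dimensional embeddings, invented purely for illustration.
hotword_embs = {
    "Paraformer": [0.9, 0.1, 0.0],
    "Whisper": [0.1, 0.9, 0.0],
    "Conformer": [0.7, 0.3, 0.1],
    "Kaldi": [0.0, 0.1, 0.9],
}
segment_emb = [0.8, 0.2, 0.0]

shortlist = retrieve_hotwords(segment_emb, hotword_embs, top_k=2)
print(shortlist)
print(build_prompt(shortlist))
```

The point of the sketch is the shape of the pipeline: the downstream recognizer only ever sees a short candidate list, so its input does not grow with the size of the full hotword vocabulary.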

