CL · HC · LG · SD · AS · Oct 26, 2022

End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English

arXiv:2211.07710v1289 citations · h-index: 135
Originality: Incremental advance
AI Analysis

This paper addresses the high latency and compounding errors of multi-component speech-to-intent systems for e-commerce customer support. The advance is incremental because it builds on pre-trained ASR models.

The paper tackles the automation of customer support voicebots by developing an end-to-end speech-to-intent model for Hindi and English, which outperforms a conventional pipeline by a relative ~27% on the F1 score.
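To make the "relative ~27%" figure concrete, the sketch below shows how a relative gain differs from an absolute one. The two F1 values are hypothetical numbers chosen only to illustrate the arithmetic; they are not scores reported in the paper, and `relative_improvement` is an illustrative helper name.

```python
def relative_improvement(baseline: float, candidate: float) -> float:
    """Return the gain of candidate over baseline as a fraction of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline


# Hypothetical F1 scores, NOT figures from the paper.
pipeline_f1 = 0.60   # assumed score of a conventional multi-component pipeline
e2e_f1 = 0.762       # assumed score of an end-to-end model

gain = relative_improvement(pipeline_f1, e2e_f1)
print(f"absolute gain: {e2e_f1 - pipeline_f1:.3f}")
print(f"relative gain: {gain:.1%}")
```

With these assumed values, an absolute gain of 0.162 F1 corresponds to a 27% relative gain, so the same relative figure can reflect quite different absolute improvements depending on the baseline.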

Automation of on-call customer support relies heavily on accurate and efficient speech-to-intent (S2I) systems. Building such systems using multi-component pipelines can pose various challenges because they require large annotated datasets, have higher latency, and are complex to deploy. These pipelines are also prone to compounding errors. To overcome these challenges, we discuss an end-to-end (E2E) S2I model for a customer support voicebot task in a bilingual setting. We show how we can solve E2E intent classification by leveraging a pre-trained automatic speech recognition (ASR) model with slight modification and fine-tuning on small annotated datasets. Experimental results show that our best E2E model outperforms a conventional pipeline by a relative ~27% on the F1 score.
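The abstract does not say what the "slight modification" to the pre-trained ASR model is. One common way to turn a speech encoder into an intent classifier, shown here purely as an assumption and not as the authors' method, is to mean-pool the encoder's per-frame outputs and attach a linear classification head. The sketch below uses plain Python with made-up frame vectors standing in for encoder outputs; the intent labels, weights, and function names are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical intent labels, for illustration only.
INTENTS = ["order_status", "refund", "cancel_order"]


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame encoder outputs into one utterance-level vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    frames: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Tuple[str, List[float]]:
    """Apply a linear intent head to the pooled encoder representation."""
    pooled = mean_pool(frames)
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return INTENTS[best], probs


# Stand-in for pre-trained ASR encoder outputs: 3 frames, 2 dimensions each.
frames = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]]
# Hypothetical head parameters; in practice these would be learned in fine-tuning.
weights = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
bias = [0.0, 0.0, 0.0]

intent, probs = classify(frames, weights, bias)
print(intent, [round(p, 3) for p in probs])
```

In this arrangement a single forward pass maps audio features directly to an intent, with no intermediate transcript, which is the property that lets an E2E model avoid the compounding errors of a pipeline.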

