LG SD AS | Jul 9, 2024

Knowledge boosting during low-latency inference

Vidya Srinivas, Malek Itani, Tuochao Chen, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota

arXiv:2407.11055v39.24 citations | h-index: 49 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of deploying AI models on resource-constrained edge devices for real-time applications such as speech processing, and offers an incremental improvement in collaboration between large and small models.

The paper tackles the problem of enabling low-latency, streaming applications on edge devices by transferring knowledge from a large remote model to a small on-device model, despite communication delays. It proposes knowledge boosting, which allows the large model to operate on time-delayed input, and shows gains in speech separation and enhancement tasks with delays up to 48 ms, particularly when the performance gap between models is wide.

Models for low-latency, streaming applications could benefit from the knowledge capacity of larger models, but edge devices cannot run these models due to resource constraints. A possible solution is to transfer hints during inference from a large model running remotely to a small model running on-device. However, this incurs a communication delay that breaks real-time requirements and does not guarantee that both models will operate on the same data at the same time. We propose knowledge boosting, a novel technique that allows a large model to operate on time-delayed input during inference, while still boosting small model performance. Using a streaming neural network that processes 8 ms chunks, we evaluate different speech separation and enhancement tasks with communication delays of up to six chunks or 48 ms. Our results show larger gains where the performance gap between the small and large models is wide, demonstrating a promising method for large-small model collaboration for low-latency applications. Code, dataset, and audio samples available at https://knowledgeboosting.cs.washington.edu/.
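The timing described in the abstract can be illustrated with a small sketch: at chunk t the on-device model only has access to a hint computed from chunk t minus the communication delay. The code below shows that scheduling only. The functions `large_model`, `small_model`, and `stream` are hypothetical stand-ins, not the authors' architecture, and the way the hint is combined with the chunk (a simple additive bias) is an assumption made for illustration. The chunk size (8 ms) and maximum delay (six chunks) are taken from the abstract.

```python
from collections import deque

CHUNK_MS = 8       # chunk duration stated in the abstract
DELAY_CHUNKS = 6   # largest communication delay evaluated in the abstract


def large_model(chunk):
    """Hypothetical stand-in for the remote model: returns a hint (here, the chunk mean)."""
    return sum(chunk) / len(chunk)


def small_model(chunk, hint):
    """Hypothetical stand-in for the on-device model: uses the current chunk plus a delayed hint."""
    # Until the first hint arrives, the small model processes the chunk on its own.
    bias = 0.0 if hint is None else hint
    return [x + bias for x in chunk]


def stream(chunks, delay_chunks=DELAY_CHUNKS):
    """Process chunks in order; the hint used at step t was computed from chunk t - delay_chunks."""
    in_flight = deque()  # hints computed remotely but not yet delivered to the device
    outputs = []
    for t, chunk in enumerate(chunks):
        in_flight.append(large_model(chunk))
        # A hint is delivered only once it is delay_chunks steps old.
        hint = in_flight.popleft() if len(in_flight) > delay_chunks else None
        outputs.append((t, hint, small_model(chunk, hint)))
    return outputs


if __name__ == "__main__":
    chunks = [[float(i)] * 4 for i in range(10)]
    for t, hint, _ in stream(chunks):
        print(f"chunk {t}: hint = {hint}")
    print(f"hint staleness: {DELAY_CHUNKS * CHUNK_MS} ms")
```

With a delay of six chunks, the first six steps run without any hint, and every later step uses a hint that is 6 × 8 ms = 48 ms stale, which matches the largest delay reported in the abstract.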

