CL, Feb 18, 2025

On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation

Rune Birkmose, Nathan Mørkeberg Reece, Esben Hofstedt Norvin, Johannes Bjerva, Mike Zhang

arXiv:2502.12923v2, Proceedings of the Tenth Workshop on Noisy and User-generated Text

Originality: Synthesis-oriented

AI Analysis

The paper addresses the need for unified, on-device AI in home automation without specialized hardware, though its contribution is incremental: it applies existing quantization techniques to this domain.

This paper tackles the use of fine-tuned Large Language Models (LLMs) for both slot/intent detection and response generation in smart home assistants on resource-limited edge hardware, reporting around 80-86% accuracy on noisy human prompts and out-of-domain intents and an average inference time of 5-6 seconds per query.

This paper investigates whether Large Language Models (LLMs), fine-tuned on synthetic but domain-representative data, can perform the twofold task of (i) slot and intent detection and (ii) natural language response generation for a smart home assistant, while running solely on resource-limited, CPU-only edge hardware. We fine-tune LLMs to produce both JSON action calls and text responses. Our experiments show that 16-bit and 8-bit quantized variants preserve high accuracy on slot and intent detection and maintain strong semantic coherence in generated text. The 4-bit model retains generative fluency but suffers a noticeable drop in device-service classification accuracy. Further evaluations on noisy human (non-synthetic) prompts and out-of-domain intents confirm the models' generalization ability, obtaining around 80-86% accuracy. The average inference time is 5-6 seconds per query, which is acceptable for one-shot commands but suboptimal for multi-turn dialogue. Even so, our results affirm that an on-device LLM can effectively unify command interpretation and flexible response generation for home automation without relying on specialized hardware.
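To make the dual-output idea concrete, here is a minimal sketch of how a client might split one model completion into a spoken reply and a JSON action call. The abstract does not specify the output layout, so everything here is an assumption for illustration: the layout (reply text followed by a single trailing JSON object), the field names `service` and `device`, and the helper names `AssistantTurn` and `parse_model_output` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssistantTurn:
    response: str            # natural-language reply for the user
    action: Optional[dict]   # parsed JSON action call, or None if absent/invalid


def parse_model_output(raw: str) -> AssistantTurn:
    """Split a raw completion into reply text and a trailing JSON action call.

    Assumed (hypothetical) layout: free text, then one JSON object at the end.
    """
    start = raw.find("{")
    if start == -1:
        # No JSON at all: treat the completion as a plain text reply.
        return AssistantTurn(response=raw.strip(), action=None)

    text, candidate = raw[:start].strip(), raw[start:].strip()
    try:
        action = json.loads(candidate)
    except json.JSONDecodeError:
        # Malformed or truncated JSON: keep the text, execute nothing.
        return AssistantTurn(response=raw.strip(), action=None)
    return AssistantTurn(response=text, action=action)


if __name__ == "__main__":
    completion = (
        'Turning on the kitchen light. '
        '{"service": "light.turn_on", "device": "light.kitchen"}'
    )
    turn = parse_model_output(completion)
    print(turn.response)
    print(turn.action)
```

Falling back to "no action" on malformed JSON is a deliberately conservative choice in this sketch: a smart home controller should probably do nothing rather than act on a partially decoded command.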
