CL | Feb 18, 2025

On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation

arXiv:2502.12923v2 | 14 citations | h-index: 20 | Proceedings of the Tenth Workshop on Noisy and User-generated Text
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for unified, on-device AI in home automation without specialized hardware, though its contribution is incremental: it applies existing quantization techniques to this domain.

The paper uses fine-tuned Large Language Models (LLMs) for both slot/intent detection and response generation in smart home assistants running on resource-limited, CPU-only edge hardware. It reports around 80-86% accuracy on noisy human prompts and out-of-domain intents, with an average inference time of 5-6 seconds per query.

This paper investigates whether Large Language Models (LLMs), fine-tuned on synthetic but domain-representative data, can perform the twofold task of (i) slot and intent detection and (ii) natural language response generation for a smart home assistant, while running solely on resource-limited, CPU-only edge hardware. We fine-tune LLMs to produce both JSON action calls and text responses. Our experiments show that 16-bit and 8-bit quantized variants preserve high accuracy on slot and intent detection and maintain strong semantic coherence in generated text, whereas the 4-bit model retains generative fluency but suffers a noticeable drop in device-service classification accuracy. Further evaluations on noisy human (non-synthetic) prompts and out-of-domain intents confirm the models' generalization ability, obtaining around 80-86% accuracy. The average inference time is 5-6 seconds per query, which is acceptable for one-shot commands but suboptimal for multi-turn dialogue. Even so, our results affirm that an on-device LLM can effectively unify command interpretation and flexible response generation for home automation without relying on specialized hardware.
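
As an illustration of the dual output described in the abstract (a text response plus a JSON action call), the sketch below separates the two parts of a single model reply. The reply layout and the field names `service` and `target_device` are assumptions made for this example only; the abstract does not specify the paper's actual output schema, and `split_reply` is a hypothetical helper, not the authors' code.

```python
import json


def split_reply(raw: str) -> tuple[str, list[dict]]:
    """Split a model reply into free text and any embedded JSON objects.

    Assumes (hypothetically) that action calls appear as JSON objects
    embedded somewhere in the generated text.
    """
    decoder = json.JSONDecoder()
    text_parts: list[str] = []
    actions: list[dict] = []
    i = 0
    while i < len(raw):
        start = raw.find("{", i)
        if start == -1:
            # No more candidate objects: the rest is plain text.
            text_parts.append(raw[i:])
            break
        text_parts.append(raw[i:start])
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where it ended.
            obj, end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            # Not valid JSON: keep the brace as ordinary text and move on.
            text_parts.append("{")
            i = start + 1
            continue
        actions.append(obj)
        i = end
    # Collapse runs of whitespace left behind by the removed JSON.
    text = " ".join("".join(text_parts).split())
    return text, actions


# Hypothetical reply; the schema is illustrative, not taken from the paper.
reply = (
    "Turning on the kitchen light. "
    '{"service": "light.turn_on", "target_device": "light.kitchen"}'
)
text, actions = split_reply(reply)
print(text)
print(actions)
```

Keeping the parser tolerant of malformed JSON matters in this setting because heavily quantized models may emit fluent text alongside a broken or incorrect action call; a caller can then still speak the text response and decline to execute anything.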

