Domain-Specific Query Understanding for Automotive Applications: A Modular and Scalable Approach
This work addresses the challenge of precise query interpretation for automotive systems. Its contribution is incremental: it builds on existing methods and optimizes them for domain-specific complexities.
The paper tackles query understanding in the automotive domain with a modular two-step system that separates classification from entity extraction, achieving substantial gains in efficiency and accuracy over a single-step approach.
Despite the growing prevalence of large language models (LLMs) in domain-specific applications, query understanding in the automotive sector remains underexplored. This domain presents unique complexities due to its specialized vocabulary and the diverse range of user intents it encompasses. Unlike general-purpose assistants, automotive systems must precisely interpret user queries and route each one to the appropriate underlying tool, each designed to fulfill a distinct task such as part recommendations, repair procedures, or regulatory lookups. Moreover, these systems must extract structured inputs that match the schema required by each tool. In this study, we present a novel two-step system for domain-specific query interpretation in the automotive context that achieves an effective balance between responsiveness, reliability, and scalability. Our initial single-step approach, which jointly performed classification and entity extraction, exhibited moderate performance and higher latency. By decomposing the task into a lightweight classification stage followed by targeted entity extraction using smaller, specialized prompts, our system achieves substantial gains in both efficiency and accuracy. Due to the niche nature of the automotive domain, we also curated a high-quality dataset by combining manually annotated and synthetically generated samples, all reviewed by domain experts. Overall, our findings demonstrate that decomposing query understanding into modular subtasks leads to a scalable, accurate, and latency-efficient solution. This approach establishes a strong foundation for practical deployment in real-world automotive query understanding systems.
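To make the two-step decomposition concrete, the following is a minimal Python sketch of a classify-then-extract pipeline. It is an illustration under stated assumptions, not the system described in the paper. The tool names echo the examples given in the abstract, but the schema fields, the prompt wording, the `ToolSpec` and `understand` names, and the `fake_llm` stand-in are all hypothetical. Any real model call would be substituted for `fake_llm`.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# A model call is abstracted as "prompt in, text out" (an assumption of this sketch).
LLM = Callable[[str], str]


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    fields: List[str]  # hypothetical schema fields, invented for illustration


# Hypothetical registry; only the task names come from the abstract.
TOOLS: Dict[str, ToolSpec] = {
    "part_recommendation": ToolSpec(
        "part_recommendation", "Recommend parts for a vehicle",
        ["make", "model", "year", "part"]),
    "repair_procedure": ToolSpec(
        "repair_procedure", "Look up a repair procedure",
        ["make", "model", "year", "component"]),
    "regulatory_lookup": ToolSpec(
        "regulatory_lookup", "Look up a regulation",
        ["region", "topic"]),
}


def classify(query: str, llm: LLM) -> str:
    """Step 1: a short prompt that asks only for a tool label."""
    options = "\n".join(f"- {t.name}: {t.description}" for t in TOOLS.values())
    prompt = f"Pick exactly one tool for the query.\n{options}\nQuery: {query}\nTool:"
    label = llm(prompt).strip()
    if label not in TOOLS:
        raise ValueError(f"unknown tool label: {label!r}")
    return label


def extract(query: str, tool: ToolSpec, llm: LLM) -> Dict[str, object]:
    """Step 2: a tool-specific prompt that mentions only that tool's fields."""
    prompt = (
        f"Extract the fields {tool.fields} from the query as a JSON object. "
        f"Use null for missing fields.\nQuery: {query}\nJSON:"
    )
    raw = json.loads(llm(prompt))
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object from the extraction step")
    # Keep only the fields in the tool's schema; absent fields become None.
    return {field: raw.get(field) for field in tool.fields}


def understand(query: str, llm: LLM) -> Dict[str, object]:
    """Run classification first, then extraction for the chosen tool only."""
    label = classify(query, llm)
    return {"tool": label, "arguments": extract(query, TOOLS[label], llm)}


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a model so the sketch runs offline."""
    if prompt.startswith("Pick exactly one tool"):
        return "part_recommendation"
    return ('{"make": "Toyota", "model": "Corolla", "year": 2015, '
            '"part": "brake pads", "extra": "ignored"}')


result = understand("Which brake pads fit a 2015 Toyota Corolla?", fake_llm)
```

In this sketch each extraction prompt lists only the fields of the selected tool, which is one plausible reading of "smaller, specialized prompts". Adding a new tool means adding one registry entry rather than enlarging a single joint prompt.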