Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies
This work addresses scalable and adaptable intent analysis for Web search services, particularly AI-driven chat. The contribution is incremental, since it builds on existing LLM and human-in-the-loop methods.
The paper tackles the analysis of user intents in log data from Web search services, which is difficult because existing labeling methods are either expensive or inflexible. It proposes using large language models (LLMs) with human-in-the-loop validation to generate, validate, and apply user intent taxonomies, and it demonstrates the approach by uncovering new insights from Microsoft Bing search and chat logs.
Log data can reveal valuable information about how users interact with Web search services, what they want, and how satisfied they are. However, analyzing user intents in log data is not easy, especially for emerging forms of Web search such as AI-driven chat. To understand user intents from log data, we need a way to label them with meaningful categories that capture their diversity and dynamics. Existing methods rely on manual or machine-learned labeling, which is either expensive or inflexible for large and dynamic datasets. We propose a novel solution using large language models (LLMs), which can generate rich and relevant concepts, descriptions, and examples for user intents. However, using LLMs to generate a user intent taxonomy and apply it for log analysis can be problematic for two main reasons: (1) such a taxonomy is not externally validated; and (2) there may be an undesirable feedback loop. To address these concerns, we propose a new methodology with human experts and assessors to verify the quality of the LLM-generated taxonomy. We also present an end-to-end pipeline that uses an LLM with human-in-the-loop to produce, refine, and apply labels for user intent analysis in log data. We demonstrate its effectiveness by uncovering new insights into user intents from search and chat logs from the Microsoft Bing commercial search engine. The novelty of the proposed work stems from its method for generating purpose-driven user intent taxonomies with strong validation. This method not only helps remove methodological and practical bottlenecks from intent-focused research, but also provides a new framework for generating, validating, and applying other kinds of taxonomies in a scalable and adaptable way with reasonable human effort.
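The generate, validate, and apply loop described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. The function names (`run_pipeline`, `cohen_kappa`), the three callables standing in for the LLM and the human expert, the choice of Cohen's kappa as the agreement measure, and the 0.6 acceptance threshold are all hypothetical. The sketch keeps the held-out logs and the human labels separate from the logs used to draft the taxonomy, which is one plausible way to guard against an LLM validating its own output.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical shape: category name -> short description.
Taxonomy = Dict[str, str]


def cohen_kappa(a: List[str], b: List[str]) -> float:
    """Chance-corrected agreement between two equal-length label lists."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def run_pipeline(
    generate: Callable[[List[str]], Taxonomy],   # assumed LLM call: logs -> draft taxonomy
    review: Callable[[Taxonomy], Taxonomy],      # assumed human expert edit step
    label: Callable[[Taxonomy, str], str],       # assumed LLM call: (taxonomy, log) -> category
    train_logs: List[str],
    heldout_logs: List[str],
    human_labels: List[str],                     # assessor labels for heldout_logs
    min_kappa: float = 0.6,                      # illustrative threshold, not from the paper
) -> dict:
    draft = generate(train_logs)                 # 1. generate
    taxonomy = review(draft)                     # 2. refine with a human in the loop
    llm_labels = [label(taxonomy, q) for q in heldout_logs]  # 3. apply
    kappa = cohen_kappa(llm_labels, human_labels)            # 4. validate against humans
    return {
        "taxonomy": taxonomy,
        "labels": llm_labels,
        "kappa": kappa,
        "accepted": kappa >= min_kappa,
    }
```

In practice, `generate` and `label` would wrap calls to whichever LLM is used, and `review` would be an interactive step. They are passed in as plain callables here so the control flow can be exercised with stubs.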