Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach
This work addresses the need for more naturalistic task representations in search systems to improve user experience, though it is incremental by building on existing task extraction methodologies.
The authors tackled the problem of representing search tasks as hierarchical structures rather than flat ones, proposing a Bayesian nonparametric model that extracts task and subtask hierarchies from query logs, and demonstrated its effectiveness through quantitative and crowdsourced experiments.
A significant amount of search queries originate from some real world information need or tasks. In order to improve the search experience of the end users, it is important to have accurate representations of tasks. As a result, significant amount of research has been devoted to extracting proper representations of tasks in order to enable search systems to help users complete their tasks, as well as providing the end user with better query suggestions, for better recommendations, for satisfaction prediction, and for improved personalization in terms of tasks. Most existing task extraction methodologies focus on representing tasks as flat structures. However, tasks often tend to have multiple subtasks associated with them and a more naturalistic representation of tasks would be in terms of a hierarchy, where each task can be composed of multiple (sub)tasks. To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks \& subtasks. We evaluate our method based on real world query log data both through quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies.