IR, Sep 2, 2016

Incorporating Clicks, Attention and Satisfaction into a Search Engine Result Page Evaluation Model

arXiv:1609.00552v1, 26 citations
Originality: Incremental advance
AI Analysis

The paper addresses inaccurate evaluation metrics for modern SERPs, a problem that matters to search engine developers and researchers. It is an incremental improvement over existing models, achieved by incorporating additional user behavior factors.

The paper tackles the challenge of evaluating search engine result pages (SERPs) with non-linear layouts and features like one-box answers. Such features can make click-based metrics misleading: users may leave satisfied without clicking ('good abandonments'), and their attention patterns go unaccounted for. The authors propose the CAS model, which jointly captures clicks, attention, and satisfaction, and show that it predicts user actions and self-reported satisfaction more accurately than models based on clicks alone. A metric built on the CAS model agrees better with user-reported satisfaction than conventional evaluation metrics.

Modern search engine result pages often provide immediate value to users and organize information in such a way that it is easy to navigate. The core ranking function contributes to this and so do result snippets, smart organization of result blocks and extensive use of one-box answers or side panels. While they are useful to the user and help search engines to stand out, such features present two big challenges for evaluation. First, the presence of such elements on a search engine result page (SERP) may lead to the absence of clicks, which is, however, not related to dissatisfaction, so-called "good abandonments." Second, the non-linear layout and visual difference of SERP items may lead to non-trivial patterns of user attention, which is not captured by existing evaluation metrics. In this paper we propose a model of user behavior on a SERP that jointly captures click behavior, user attention and satisfaction, the CAS model, and demonstrate that it gives more accurate predictions of user actions and self-reported satisfaction than existing models based on clicks alone. We use the CAS model to build a novel evaluation metric that can be applied to non-linear SERP layouts and that can account for the utility that users obtain directly on a SERP. We demonstrate that this metric shows better agreement with user-reported satisfaction than conventional evaluation metrics.
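To make the idea of a metric that credits both attention and on-SERP utility more concrete, here is a minimal sketch in Python. It is not the CAS model or the paper's metric: the abstract gives no formulas, so the class `SerpItem`, its fields, and the function `expected_serp_utility` are all hypothetical names, and the additive form (utility gained directly from an examined item plus utility gained after a click) is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SerpItem:
    """One block on a SERP. All fields are illustrative assumptions."""
    p_attention: float     # assumed probability that the user examines the item
    p_click: float         # assumed probability of a click, given examination
    direct_utility: float  # assumed utility gained from the item on the SERP itself
    click_utility: float   # assumed utility gained from the page behind the item


def expected_serp_utility(items: Iterable[SerpItem]) -> float:
    """Sum expected utility over items, in any layout order.

    Each item contributes utility obtained directly on the SERP (weighted by
    attention) plus utility obtained after a click (weighted by attention and
    click probability). No ranked-list position discount is assumed.
    """
    total = 0.0
    for item in items:
        total += item.p_attention * item.direct_utility
        total += item.p_attention * item.p_click * item.click_utility
    return total


# A one-box answer that satisfies without a click, plus one ordinary result.
example_serp = [
    SerpItem(p_attention=1.0, p_click=0.0, direct_utility=1.0, click_utility=0.0),
    SerpItem(p_attention=0.5, p_click=0.5, direct_utility=0.0, click_utility=2.0),
]
example_score = expected_serp_utility(example_serp)
```

In this toy SERP the one-box answer contributes utility even though it is never clicked, which is the "good abandonment" case that a purely click-based score would miss. Because the sum runs over items rather than rank positions, the same function applies to side panels or other non-linear layouts, provided attention probabilities are available for each block.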
