Evaluating search engines and defining a consensus implementation
This work addresses bias and inconsistency in search results, which affect both users and content providers. The contribution is incremental, since it builds on existing evaluation methods.
The paper tackles the problem of varying search engine outputs by proposing a method to define consensual relevance based on page visibility across multiple engines, resulting in an implementation and analysis of search engine scores and a consensus engine.
Different search engines provide different outputs for the same keyword. This may be due to different definitions of relevance, and/or to different knowledge or anticipation of users' preferences, but rankings are also suspected to be biased towards the engines' own content, which may be prejudicial to other content providers. In this paper, we take some initial steps toward a rigorous comparison and analysis of search engines by proposing a definition of the consensual relevance of a page with respect to a keyword, computed from a set of search engines. More specifically, we look at the results of several search engines for a sample of keywords, and define for each keyword the visibility of a page based on its ranking over all search engines. This allows us to define the score of a search engine for a keyword, and then its average score over all keywords. Based on the pages' visibility, we can also define the consensus search engine as the one showing the most visible results for each keyword. We have implemented this model and present an analysis of the results.
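The pipeline described above (visibility, per-keyword score, average score, consensus engine) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the position weights in `WEIGHTS`, the choice to average a page's position weight over engines to obtain its visibility, the weighted sum used as an engine's score, and all function names.

```python
from collections import defaultdict

# Hypothetical position weights for ranks 1, 2, 3 (an assumption; the
# abstract does not specify how ranking positions are weighted).
WEIGHTS = [0.5, 0.3, 0.2]


def visibility(rankings):
    """Visibility of each page for one keyword.

    rankings maps engine name -> ranked list of pages. Assumed definition:
    the average, over all engines, of the weight of the position at which
    the page is shown (0 if the engine does not show it).
    """
    vis = defaultdict(float)
    for pages in rankings.values():
        for pos, page in enumerate(pages[:len(WEIGHTS)]):
            vis[page] += WEIGHTS[pos] / len(rankings)
    return dict(vis)


def engine_score(pages, vis):
    """Assumed score of a ranked list: position-weighted sum of visibilities."""
    return sum(
        WEIGHTS[pos] * vis.get(page, 0.0)
        for pos, page in enumerate(pages[:len(WEIGHTS)])
    )


def consensus_ranking(vis):
    """Consensus engine output: pages by decreasing visibility (ties by name)."""
    return sorted(vis, key=lambda p: (-vis[p], p))[:len(WEIGHTS)]


def average_scores(results):
    """Average score per engine; results maps keyword -> engine -> ranked pages.

    Assumes every engine returns results for every keyword.
    """
    totals = defaultdict(float)
    for rankings in results.values():
        vis = visibility(rankings)
        for engine, pages in rankings.items():
            totals[engine] += engine_score(pages, vis)
    return {engine: total / len(results) for engine, total in totals.items()}


# Toy data: two hypothetical engines, one keyword.
results = {
    "example keyword": {
        "engine_a": ["p1", "p2", "p3"],
        "engine_b": ["p2", "p1", "p4"],
    }
}
vis = visibility(results["example keyword"])
consensus = consensus_ranking(vis)
scores = average_scores(results)
```

Under these assumptions, placing the most visible pages in the highest-weighted positions maximizes the score, so the consensus ranking scores at least as well as any individual engine for each keyword.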