The Generalized Cascade Click Model: A Unified Framework for Estimating Click Models
This work simplifies click model estimation for researchers and practitioners in information retrieval. The contribution is incremental in that it builds on the existing IO-HMM framework.
The paper tackles the complexity of deriving expectation-maximization (EM) procedures for click models of search engine behavior. It proposes the Generalized Cascade Model (GCM), under which many existing click models can be optimized as Input-Output Hidden Markov Models (IO-HMMs), reducing the need to derive the E-step by hand. The approach is implemented in the gecasmo Python package.
Given the vital importance of search engines for finding digital information, much scientific attention has gone to how users interact with search engines and how such behavior can be modeled. Many models of the interaction between users and search engines, known in the literature as click models, take the form of Dynamic Bayesian Networks. Although many authors have exploited the resemblance between different click models to derive estimation procedures, in particular expectation maximization (EM), this still commonly requires considerable work, especially when deriving the E-step. We propose in this paper that this derivation is commonly unnecessary: many existing click models can in fact, under certain assumptions, be optimized as if they were Input-Output Hidden Markov Models (IO-HMMs), for which the forward-backward equations immediately provide the E-step. To arrive at that conclusion, we present the Generalized Cascade Model (GCM), show how this model can be estimated using the IO-HMM EM framework, and provide two examples of how existing click models can be mapped to GCM. Our GCM approach to estimating click models has also been implemented in the gecasmo Python package.
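To make the role of the forward-backward equations concrete, the following is a minimal sketch of a scaled forward-backward pass in which the transition matrix may differ at every step, as it would when transitions depend on inputs. It is not the gecasmo API: the function name `forward_backward`, its arguments, and the two-state toy session (state 1 = user still examining, state 0 = user has stopped) are illustrative assumptions, as are all the probabilities used.

```python
import math


def forward_backward(init, trans, emit):
    """Scaled forward-backward pass with per-step transition matrices.

    init[k]        : P(z_0 = k)
    trans[t][i][j] : P(z_{t+1} = j | z_t = i), allowed to vary with t (input-dependent)
    emit[t][k]     : P(y_t | z_t = k) for the observation actually seen at step t
    Returns (gamma, xi, loglik): state posteriors, pairwise posteriors, log-likelihood.
    """
    T, K = len(emit), len(init)

    # Forward pass: alpha[t][k] = P(z_t = k | y_0..y_t); scale[t] = P(y_t | y_0..y_{t-1}).
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            pred = list(init)
        else:
            pred = [sum(alpha[t - 1][i] * trans[t - 1][i][j] for i in range(K))
                    for j in range(K)]
        unnorm = [pred[k] * emit[t][k] for k in range(K)]
        c = sum(unnorm)
        scale.append(c)
        alpha.append([u / c for u in unnorm])

    # Backward pass, using the same scaling factors as the forward pass.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(trans[t][i][j] * emit[t + 1][j] * beta[t + 1][j]
                             for j in range(K)) / scale[t + 1]

    # E-step quantities: gamma[t][k] = P(z_t = k | all y),
    # xi[t][i][j] = P(z_t = i, z_{t+1} = j | all y).
    gamma = [[alpha[t][k] * beta[t][k] for k in range(K)] for t in range(T)]
    xi = [[[alpha[t][i] * trans[t][i][j] * emit[t + 1][j] * beta[t + 1][j] / scale[t + 1]
            for j in range(K)] for i in range(K)] for t in range(T - 1)]
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik


# Toy session (all numbers are made up for illustration).
clicks = [0, 1, 0]               # observed clicks at positions 1..3
attraction = [0.5, 0.4, 0.2]     # assumed P(click | examining) per position
continuation = 0.7               # assumed P(keep examining | was examining)

init = [0.0, 1.0]                # cascade-style assumption: position 1 is always examined
step = [[1.0, 0.0],              # a user who stopped never resumes
        [1.0 - continuation, continuation]]
trans = [step for _ in range(len(clicks) - 1)]
# A click is impossible when not examining; otherwise it follows the attraction.
emit = [[1.0 if c == 0 else 0.0, a if c == 1 else 1.0 - a]
        for c, a in zip(clicks, attraction)]

gamma, xi, loglik = forward_backward(init, trans, emit)
# The click at position 2 forces the examining state at positions 1 and 2,
# so only the posterior at position 3 is genuinely uncertain.
```

In an EM loop, `gamma` and `xi` would serve as the expected sufficient statistics passed to the M-step; how the M-step is parameterized for a specific click model is outside the scope of this sketch.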