Kishor Patil

IR
h-index11
3papers
21citations
Novelty52%
AI Score31

3 Papers

IRNov 5, 2024
Bridging Personalization and Control in Scientific Personalized Search

Sheshera Mysore, Garima Dhanania, Kishor Patil et al.

Personalized search is a problem where models benefit from learning user preferences from per-user historical interaction data. The inferred preferences enable personalized ranking models to improve the relevance of documents for users. However, personalization is also seen as opaque in its use of historical interactions and is not amenable to users' control. Further, personalization limits the diversity of information users are exposed to. While search results may be automatically diversified this does little to address the lack of control over personalization. In response, we introduce a model for personalized search that enables users to control personalized rankings proactively. Our model, CtrlCE, is a novel cross-encoder model augmented with an editable memory built from users' historical interactions. The editable memory allows cross-encoders to be personalized efficiently and enables users to control personalized ranking. Next, because all queries do not require personalization, we introduce a calibrated mixing model which determines when personalization is necessary. This enables users to control personalization via their editable memory only when necessary. To thoroughly evaluate CtrlCE, we demonstrate its empirical performance in four domains of science, its ability to selectively request user control in a calibration evaluation of the mixing model, and the control provided by its editable memory in a user study.

IRSep 17, 2020
Online Algorithms for Estimating Change Rates of Web Pages

Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

A search engine maintains local copies of different web pages to provide quick search results. This local cache is kept up-to-date by a web crawler that frequently visits these different pages to track changes in them. Ideally, the local copy should be updated as soon as a page changes on the web. However, finite bandwidth availability and server restrictions limit how frequently different pages can be crawled. This brings forth the following optimization problem: maximize the freshness of the local cache subject to the crawling frequencies being within prescribed bounds. While tractable algorithms do exist to solve this problem, these either assume the knowledge of exact page change rates or use inefficient methods such as MLE for estimating the same. We address this issue here. We provide three novel schemes for online estimation of page change rates, all of which have extremely low running times per iteration. The first is based on the law of large numbers and the second on stochastic approximation. The third is an extension of the second and includes a heavy-ball momentum term. All these schemes only need partial information about the page change process, i.e., they only need to know if the page has changed or not since the last crawled instance. Our main theoretical results concern asymptotic convergence and convergence rates of these three schemes. In fact, our work is the first to show convergence of the original stochastic heavy-ball method when neither the gradient nor the noise variance is uniformly bounded. We also provide some numerical experiments (based on real and synthetic data) to demonstrate the superiority of our proposed estimators over existing ones such as MLE. We emphasize that our algorithms are also readily applicable to the synchronization of databases and network inventory management.

IRApr 5, 2020
Change Rate Estimation and Optimal Freshness in Web Page Crawling

Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

For providing quick and accurate results, a search engine maintains a local snapshot of the entire web. And, to keep this local cache fresh, it employs a crawler for tracking changes across various web pages. However, finite bandwidth availability and server restrictions impose some constraints on the crawling frequency. Consequently, the ideal crawling rates are the ones that maximise the freshness of the local cache and also respect the above constraints. Azar et al. 2018 recently proposed a tractable algorithm to solve this optimisation problem. However, they assume the knowledge of the exact page change rates, which is unrealistic in practice. We address this issue here. Specifically, we provide two novel schemes for online estimation of page change rates. Both schemes only need partial information about the page change process, i.e., they only need to know if the page has changed or not since the last crawled instance. For both these schemes, we prove convergence and, also, derive their convergence rates. Finally, we provide some numerical experiments to compare the performance of our proposed estimators with the existing ones (e.g., MLE).