Chittaranjan Tripathy

h-index: 6

3 papers

138 citations

3 Papers

11.8 | IR | Feb 21, 2025 | Code

Automated Query-Product Relevance Labeling using Large Language Models for E-commerce Search

Jayant Sachdev, Sean D Rosario, Abhijeet Phatak et al.

Accurate query-product relevance labeling is indispensable for generating ground-truth datasets for search ranking in e-commerce. Traditional approaches for annotating query-product pairs rely on human-based labeling services, which are expensive, time-consuming, and prone to errors. In this work, we explore the application of Large Language Models (LLMs) to automate query-product relevance labeling for large-scale e-commerce search. We use several publicly available and proprietary LLMs for this task and conduct experiments on two open-source datasets and an in-house e-commerce search dataset. Using prompt engineering techniques such as Chain-of-Thought (CoT) prompting, In-context Learning (ICL), and Retrieval Augmented Generation (RAG) with Maximum Marginal Relevance (MMR), we show that LLMs' performance has the potential to approach human-level accuracy on this task in a fraction of the time and cost required by human labelers, suggesting that our approach is more efficient than conventional methods. We have generated query-product relevance labels using LLMs at scale and are using them to evaluate improvements to our search algorithms. Our work demonstrates the potential of LLMs to improve query-product relevance, thus enhancing the e-commerce search user experience. More importantly, this scalable alternative to human annotation has significant implications for information retrieval domains, including search and recommendation systems, where relevance scoring is crucial for optimizing the ranking of products and content to improve customer engagement and other conversion metrics.

IR | Jun 9

The Voronoi Bottleneck: Capacity-Aware Dense Retrieval for Product Search

Charith Chandra Sai Balne, Rithwik Maramraju, Siddharth Pratap Singh et al.

Dense embedding retrieval compresses all relevance information into a single inner product, imposing a fundamental geometric limit -- the Voronoi Bottleneck -- on the number of query-document relevance patterns expressible at fixed embedding dimension $d$. We make three contributions. (1) Unified capacity theory. We prove that Voronoi complexity and sign-rank are equivalent for top-1 retrieval, yielding tight dimension bounds and a computable diagnostic, the Capacity Utilization Score (CUS), that predicts per-query retrieval failure with AUC $> 0.8$ without relevance labels. (2) Diagnosis. CUS identifies two capacity regimes -- moderate ($\delta \gtrsim 1$), where density-aware training yields measurable gains, and vacuous ($\delta \ll 1$), where it does not -- giving practitioners an a priori check before investing in retraining. (3) DART training. We introduce AT-DW-InfoNCE, an Adaptive-Temperature Density-Weighted contrastive objective with formally derived optimal weighting $\alpha^* = 2.0$. On a 100K-query synthetic product-search corpus with controlled relevance structure, DART improves +1.9 Recall@100 over a same-data InfoNCE baseline ($84.9 \pm 0.0$ vs. $83.0 \pm 0.3$; 8 seeds, $p < 0.001$), outperforming focal loss and temperature-schedule alternatives. DART requires zero inference-time overhead -- it is a drop-in training objective that improves any dual-encoder system.
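For orientation, the InfoNCE baseline named in the abstract can be sketched as follows. This is a minimal illustration of the standard in-batch-negative InfoNCE loss only; it is not the AT-DW-InfoNCE objective, whose density weighting and adaptive temperature are not specified in the abstract. The function name `info_nce`, the fixed `temperature` value, and the use of raw inner products as scores are assumptions made for the example.

```python
import math


def info_nce(query_vecs, doc_vecs, temperature=0.05):
    """Mean InfoNCE loss over a batch (illustrative sketch).

    doc_vecs[i] is treated as the positive for query_vecs[i]; every other
    document in the batch serves as a negative. Scores are inner products.
    """
    losses = []
    for i, q in enumerate(query_vecs):
        # Temperature-scaled inner-product score against every document.
        logits = [sum(a * b for a, b in zip(q, d)) / temperature
                  for d in doc_vecs]
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the positive document.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)
```

The loss is near zero when each query scores its own positive far above the in-batch negatives, and it equals $\log B$ for a batch of $B$ documents that all receive the same score.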

3.6 | IR | Feb 5, 2025

Inducing Diversity in Differentiable Search Indexing

Abhijeet Phatak, Jayant Sachdev, Sean D Rosario et al.

Differentiable Search Indexing (DSI) is a recent paradigm for information retrieval that uses a transformer-based neural network architecture as the document index to simplify the retrieval process. A differentiable index has many advantages, such as enabling modifications, updates, or extensions to the index. In this work, we explore balancing relevance and novel information content (diversity) when training DSI systems, inspired by Maximal Marginal Relevance (MMR), and show the benefits of our approach over naive DSI training. We present quantitative and qualitative evaluations of relevance and diversity measures obtained using our method on the NQ320K and MSMARCO datasets in comparison to naive DSI. With our approach, it is possible to achieve diversity without any significant impact on relevance. Since we induce diversity while training DSI, the trained model learns to diversify while remaining relevant. This obviates the need for a post-processing step to induce diversity in the recall set, as is typically performed using MMR. Our approach will be useful for Information Retrieval problems where both relevance and diversity are important, such as sub-topic retrieval. Our work can also easily be extended to incremental DSI settings, which would enable fast updates to the index while retrieving a diverse recall set.
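The MMR post-processing step that the abstract says is made unnecessary can be sketched as a greedy re-ranker. This is a minimal illustration of classic Maximal Marginal Relevance over embedding vectors, not the paper's training-time method; the names `cosine` and `mmr_rerank`, the choice of cosine similarity, and the trade-off parameter `lam` are assumptions for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mmr_rerank(query_vec, doc_vecs, k, lam=0.5):
    """Greedily pick k document indices by Maximal Marginal Relevance.

    Each step selects the candidate maximizing
        lam * sim(doc, query) - (1 - lam) * max sim(doc, already selected),
    so lam = 1.0 ranks purely by relevance and smaller lam favors diversity.
    """
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With two near-duplicate documents and one distinct but less relevant document, `lam=1.0` returns the two near-duplicates, while `lam=0.5` replaces the second near-duplicate with the distinct document.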