IR LG Feb 22, 2024

GenSERP: Large Language Models for Whole Page Presentation

Zhenning Zhang, Yunan Zhang, Suyu Ge, Guangwei Weng, Mridu Narang, Xia Song, Saurabh Tiwary

arXiv:2402.14301v24.03 citations h-index: 24

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of minimizing the effort of SERP organization for search engine users; the contribution appears incremental, as it builds on existing LLM capabilities.

The paper tackles the problem of organizing search engine result pages (SERPs) by proposing GenSERP, a framework that uses large language models (LLMs) with vision to dynamically arrange heterogeneous search results like chat answers and multimedia into coherent layouts based on user queries, demonstrating promising user experience in offline experiments on real-world data.

The advent of large language models (LLMs) brings an opportunity to minimize the effort in search engine result page (SERP) organization. In this paper, we propose GenSERP, a framework that leverages LLMs with vision in a few-shot setting to dynamically organize intermediate search results, including generated chat answers, website snippets, multimedia data, and knowledge panels, into a coherent SERP layout based on a user's query. Our approach has three main stages: (1) An information gathering phase where the LLM continuously orchestrates API tools to retrieve different types of items and proposes candidate layouts based on the retrieved items, until it is confident enough to generate the final result. (2) An answer generation phase where the LLM populates the layouts with the retrieved content. In this phase, the LLM adaptively optimizes the ranking of items and the UX configurations of the SERP. Consequently, it assigns a location on the page to each item, along with the UX display details. (3) A scoring phase where an LLM with vision scores all the generated SERPs based on how likely each is to satisfy the user. It then sends the one with the highest score to rendering. GenSERP features two generation paradigms: (1) coarse-to-fine, which allows it to approach the optimal layout in a more manageable way, and (2) beam search, which gives it a better chance of reaching the optimal solution than greedy decoding. Offline experimental results on real-world data demonstrate how LLMs can contextually organize heterogeneous search results on the fly and provide a promising user experience.
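The abstract's contrast between beam search and greedy decoding can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's implementation: the item names, the `RELEVANCE` values, `toy_score`, and `beam_search_layout` are all hypothetical, and the hand-written scorer merely stands in for the LLM-with-vision scoring phase. A layout here is just an ordering of items from the top of the page down.

```python
from typing import Callable, List, Sequence, Tuple

Layout = Tuple[str, ...]

# Hypothetical per-item relevance values; in GenSERP the scoring is
# done by an LLM with vision, not by a fixed table like this one.
RELEVANCE = {"web_snippet": 0.9, "chat_answer": 0.8, "knowledge_panel": 0.7}


def toy_score(layout: Layout) -> float:
    """Toy stand-in scorer: position-discounted relevance, plus a bonus
    when a chat answer followed by a knowledge panel opens the page."""
    score = sum(RELEVANCE[item] / (pos + 1) for pos, item in enumerate(layout))
    if layout[:2] == ("chat_answer", "knowledge_panel"):
        score += 0.5
    return score


def beam_search_layout(
    items: Sequence[str],
    score: Callable[[Layout], float],
    beam_width: int,
) -> Layout:
    """Build a layout one slot at a time, keeping the `beam_width`
    best partial layouts after each step."""
    beams: List[Layout] = [()]
    for _ in range(len(items)):
        # Extend every surviving partial layout with each unused item.
        candidates = [
            layout + (item,)
            for layout in beams
            for item in items
            if item not in layout
        ]
        candidates.sort(key=score, reverse=True)
        beams = candidates[:beam_width]
    return max(beams, key=score)


items = list(RELEVANCE)
greedy = beam_search_layout(items, toy_score, beam_width=1)  # greedy decoding
beam = beam_search_layout(items, toy_score, beam_width=2)

print(greedy, round(toy_score(greedy), 3))
print(beam, round(toy_score(beam), 3))
```

With `beam_width=1` the search commits to the single best item at each slot and never sees the pair bonus; with `beam_width=2` the second-best opening survives long enough to reach the higher-scoring layout. The toy scorer was constructed to show this gap, so it says nothing about how often beam search helps on real SERPs.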
