Beyond Pipelines: A Fundamental Study on the Rise of Generative-Retrieval Architectures in Web Research
The survey addresses the impact of LLMs on web research and industry, but its contribution is incremental: it synthesizes existing advances rather than introducing new methods.
This survey examines how large language models (LLMs) are transforming web research by shifting it from traditional pipelines to generative solutions, such as retrieval-augmented generation (RAG), for tasks like information retrieval and question answering. It does not report specific numerical results.
Web research and practice have evolved significantly over time, offering users diverse and accessible solutions across a wide range of tasks. While advanced concepts such as Web 4.0 have emerged from mature technologies, the introduction of large language models (LLMs) has profoundly influenced both the field and its applications. LLMs have permeated science and technology so deeply that virtually no area remains untouched. Consequently, they are reshaping web research and development, transforming traditional pipelines into generative solutions for tasks such as information retrieval, question answering, recommendation, and web analytics. They have also enabled new applications such as web-based summarization and educational tools. This survey explores recent advances in how LLMs, particularly through retrieval-augmented generation (RAG), are affecting web research and industry. It discusses key developments, open challenges, and future directions for enhancing web solutions with LLMs.