The Unreasonable Effectiveness of LLMs for Query Optimization
This work addresses query optimization for database systems and points to potential gains in both performance and simplicity, though the contribution is incremental in that it builds on existing LLM and optimization methods.
The paper tackles query optimization by demonstrating that LLM embeddings of query text contain useful semantic information, enabling a simple binary classifier trained on a small number of labeled embedded query vectors to outperform existing heuristic systems.
Recent work in database query optimization has used complex machine learning strategies, such as customized reinforcement learning schemes. Surprisingly, we show that LLM embeddings of query text contain useful semantic information for query optimization. Specifically, we show that a simple binary classifier deciding between alternative query plans, trained only on a small number of labeled embedded query vectors, can outperform existing heuristic systems. Although we present only preliminary results, an LLM-powered query optimizer could provide significant benefits in terms of both performance and simplicity.
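To make the idea of a binary classifier over embedded query vectors concrete, the following is a minimal sketch and not the authors' implementation. Several things here are assumptions introduced for illustration: the `embed` function is a toy stand-in (normalized token counts) for a real LLM embedding call, the six labeled queries in `TRAIN` are invented, the classifier is plain logistic regression trained by stochastic gradient descent, and the plan names "default" and "alternative" are hypothetical.

```python
import math

# Toy labeled workload, invented for illustration only.
# Label 1: the alternative plan was faster; label 0: the default plan was faster.
TRAIN = [
    ("SELECT * FROM orders JOIN customers ON orders.cid = customers.id", 1),
    ("SELECT name FROM customers WHERE id = 7", 0),
    ("SELECT * FROM orders JOIN items ON orders.id = items.oid", 1),
    ("SELECT total FROM orders WHERE id = 42", 0),
    ("SELECT * FROM items JOIN suppliers ON items.sid = suppliers.id", 1),
    ("SELECT price FROM items WHERE id = 3", 0),
]


def build_vocab(queries):
    """Assign each distinct lowercase token its own vector index."""
    vocab = {}
    for query in queries:
        for tok in query.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def embed(query, vocab):
    """Stand-in for an LLM embedding (assumption): L2-normalized token counts."""
    vec = [0.0] * len(vocab)
    for tok in query.lower().split():
        if tok in vocab:  # tokens unseen during training are ignored
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with per-example gradient steps."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def choose_plan(query, vocab, w, b):
    """Return the plan the classifier predicts to be faster."""
    x = embed(query, vocab)
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return "alternative" if p >= 0.5 else "default"


vocab = build_vocab(q for q, _ in TRAIN)
X = [embed(q, vocab) for q, _ in TRAIN]
y = [label for _, label in TRAIN]
weights, bias = train(X, y)

new_query = "SELECT * FROM orders JOIN suppliers ON orders.sid = suppliers.id"
print(choose_plan(new_query, vocab, weights, bias))
```

In a real setting, `embed` would be replaced by a call to an embedding model, and the labels would come from timing each query under both candidate plans. The classifier itself can stay this simple, which is the point the abstract makes about simplicity.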