Optimization of Latent-Space Compression using Game-Theoretic Techniques for Transformer-Based Vector Search
This work addresses scalability and efficiency issues in information retrieval systems that rely on transformer-based embeddings. It is an incremental improvement, building on existing compression methods.
The paper tackles high dimensionality in transformer-based vector search by proposing a game-theoretic framework for latent-space compression. It reports higher average similarity (0.9981 vs. 0.5517) and utility (0.8873 vs. 0.5194) than FAISS, at the cost of a modest increase in query time.
Vector similarity search plays a pivotal role in modern information retrieval systems, especially when powered by transformer-based embeddings. However, the scalability and efficiency of such systems are often hindered by the high dimensionality of latent representations. In this paper, we propose a novel game-theoretic framework for optimizing latent-space compression to enhance both the efficiency and semantic utility of vector search. By modeling the compression strategy as a zero-sum game between retrieval accuracy and storage efficiency, we derive a latent transformation that preserves semantic similarity while reducing redundancy. We benchmark our method against FAISS, a widely used vector search library, and demonstrate that our approach achieves a significantly higher average similarity (0.9981 vs. 0.5517) and utility (0.8873 vs. 0.5194), albeit with a modest increase in query time. This trade-off highlights the practical value of game-theoretic latent compression in high-utility, transformer-based search applications. The proposed system can be seamlessly integrated into existing LLM pipelines to yield more semantically accurate and computationally efficient retrieval.
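The abstract does not spell out the game itself, so the following is only a toy illustration of the general idea of trading retrieval accuracy against storage efficiency, not the paper's method. It assumes plain PCA as the compressor, a payoff of the form lam * accuracy + (1 - lam) * saving, and a pure-strategy maximin choice of the compressed dimension. The function names (`pca_compress`, `similarity_preservation`, `choose_dimension`), the payoff form, and the grid of weights are all hypothetical.

```python
import numpy as np

def pca_compress(X, k):
    # Assumption: plain PCA stands in for the learned latent transformation.
    # Center the data, then project onto the top-k right singular vectors.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def cosine_matrix(A):
    # Pairwise cosine similarity between the rows of A.
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    An = A / np.clip(norms, 1e-12, None)
    return An @ An.T

def similarity_preservation(X, Z):
    # Hypothetical accuracy proxy in [0, 1]: 1 means the pairwise cosines
    # of the centered originals and of the compressed vectors coincide.
    Xc = X - X.mean(axis=0)
    gap = np.abs(cosine_matrix(Xc) - cosine_matrix(Z)).mean()
    return 1.0 - gap / 2.0

def choose_dimension(X, ks, lams):
    # Rows: the compressor's choice of dimension k.
    # Columns: an adversary's choice of the accuracy/storage weight lam.
    d = X.shape[1]
    payoff = np.zeros((len(ks), len(lams)))
    for i, k in enumerate(ks):
        accuracy = similarity_preservation(X, pca_compress(X, k))
        saving = 1.0 - k / d  # fraction of storage saved
        for j, lam in enumerate(lams):
            payoff[i, j] = lam * accuracy + (1.0 - lam) * saving
    # Pure-strategy maximin: pick the k whose worst-case payoff is largest.
    worst_case = payoff.min(axis=1)
    best = int(np.argmax(worst_case))
    return ks[best], payoff

# Synthetic stand-in for transformer embeddings (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
best_k, payoff = choose_dimension(X, ks=[4, 8, 16, 32], lams=[0.25, 0.5, 0.75])
print("maximin dimension:", best_k)
```

This sketch restricts both players to pure strategies. A full treatment of a zero-sum game would typically solve for mixed strategies, for example with a linear program, and would use the paper's own definitions of similarity and utility in place of the proxy above.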