LG · Oct 6, 2025

Gamma Mixture Modeling for Cosine Similarity in Small Language Models

arXiv:2510.05309v1
Originality: Synthesis-oriented
AI Analysis

This work provides a practical tool for analyzing similarity distributions in small language models, but it is incremental, building on existing embedding and clustering methods.

The authors model the cosine similarity distributions of sentence transformer embeddings and show that these are well captured by gamma mixtures. They propose a heuristic model based on hierarchical clustering of topics, along with an expectation-maximization algorithm for fitting.

We study the cosine similarities of sentence transformer embeddings and observe that they are well modeled by gamma mixtures. From a fixed corpus, we measure similarities between all document embeddings and a reference query embedding. Empirically, we find that these distributions are often well captured by a gamma distribution shifted and truncated to [-1,1], and in many cases, by a gamma mixture. We propose a heuristic model in which a hierarchical clustering of topics naturally leads to a gamma-mixture structure in the similarity scores. Finally, we outline an expectation-maximization algorithm for fitting shifted gamma mixtures, which provides a practical tool for modeling similarity distributions.
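To make the fitting step concrete, here is a minimal sketch of an EM-style fit for a shifted gamma mixture. It is not the authors' algorithm, which the abstract only outlines. It rests on several assumptions: the shift is fixed at -1, so that y = similarity + 1 is positive; the truncation to [-1,1] is ignored; and the shape update uses a standard closed-form approximation to the gamma maximum-likelihood estimate rather than an exact digamma solve. The function names (`fit_shifted_gamma_mixture`, `approx_shape`, `gamma_logpdf`) are hypothetical.

```python
import math
import random


def gamma_logpdf(y, shape, scale):
    """Log-density of Gamma(shape, scale) at y > 0."""
    return ((shape - 1.0) * math.log(y) - y / scale
            - math.lgamma(shape) - shape * math.log(scale))


def approx_shape(mean, mean_log):
    """Closed-form approximation to the gamma shape MLE.

    Uses s = log(mean) - mean(log y). The exact MLE needs a digamma solve,
    so this is an assumption made to keep the sketch dependency-free.
    """
    s = max(math.log(mean) - mean_log, 1e-9)
    return (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)


def fit_shifted_gamma_mixture(sims, n_components=2, shift=-1.0, n_iter=50):
    """EM-style fit of a gamma mixture to y = sim - shift.

    Assumptions: the shift is known and fixed, and truncation is ignored.
    Returns a list of [weight, shape, scale], one entry per component.
    """
    y = sorted(max(s - shift, 1e-9) for s in sims)
    n = len(y)
    # Initialise each component from a contiguous quantile block of the data.
    params = []
    for k in range(n_components):
        block = y[k * n // n_components:(k + 1) * n // n_components]
        m = sum(block) / len(block)
        ml = sum(math.log(v) for v in block) / len(block)
        shape = approx_shape(m, ml)
        params.append([1.0 / n_components, shape, m / shape])
    for _ in range(n_iter):
        # E-step: responsibilities, computed with log-sum-exp for stability.
        resp = []
        for v in y:
            logs = [math.log(w) + gamma_logpdf(v, a, b) for w, a, b in params]
            top = max(logs)
            unnorm = [math.exp(l - top) for l in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: weighted mean and weighted mean-log for each component.
        for k in range(n_components):
            rk = max(sum(r[k] for r in resp), 1e-12)
            m = sum(r[k] * v for r, v in zip(resp, y)) / rk
            ml = sum(r[k] * math.log(v) for r, v in zip(resp, y)) / rk
            shape = approx_shape(m, ml)
            params[k] = [max(rk / n, 1e-12), shape, m / shape]
    return params


if __name__ == "__main__":
    # Synthetic similarities drawn from two gamma components, shifted by -1.
    random.seed(0)
    sims = [random.gammavariate(20.0, 0.02) - 1.0 for _ in range(600)]
    sims += [random.gammavariate(100.0, 0.014) - 1.0 for _ in range(600)]
    for weight, shape, scale in fit_shifted_gamma_mixture(sims):
        print(f"weight={weight:.3f} shape={shape:.2f} scale={scale:.4f}")
```

In practice the input would be the cosine similarities between each document embedding and the query embedding. A fuller implementation would likely also estimate the shift and renormalize each component for the truncation, both of which this sketch leaves out.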

