LG · Jul 28, 2025

Uncertainty-driven Embedding Convolution

Sungjun Lim, Kangjun Noh, Youngjun Choi, Heeyoung Lee, Kyungwoo Song

arXiv:2507.20718v2 · h-index: 2

Originality: Incremental advance

AI Analysis

The work addresses the need for more reliable and robust ensemble techniques in NLP pipelines. It is an incremental advance: it builds on existing ensemble methods by incorporating uncertainty modeling.

The paper tackles a robustness limitation of ensemble methods for text embeddings: they fail to account for model-specific uncertainty. It proposes Uncertainty-driven Embedding Convolution (UEC), which the authors report improves both performance and robustness across diverse benchmarks.

Text embeddings are essential components in modern NLP pipelines. While numerous embedding models have been proposed, their performance varies across domains. This variability motivates the use of ensemble techniques to combine complementary strengths. However, most existing ensemble methods operate on deterministic embeddings and fail to account for model-specific uncertainty, limiting their robustness and reliability in downstream applications. To address these limitations, we propose Uncertainty-driven Embedding Convolution (UEC). UEC first transforms deterministic embeddings into probabilistic ones in a post-hoc manner. It then computes adaptive ensemble weights based on embedding uncertainty, grounded in a Bayes-optimal solution under a surrogate loss. Additionally, UEC employs an uncertainty-aware similarity function that directly incorporates uncertainty into the similarity scoring, providing a theoretically grounded and efficient surrogate to distributional distances. Extensive experiments on diverse benchmarks demonstrate that UEC consistently improves both performance and robustness by leveraging principled uncertainty modeling.
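
The abstract does not give UEC's formulas, so the following is only a rough sketch of the general idea of uncertainty-weighted ensembling, not the authors' method. It assumes each model's embedding has already been turned into a diagonal Gaussian (a mean vector and a per-dimension variance), that all models share the same embedding dimension, and it swaps in standard inverse-variance weighting in place of the paper's Bayes-optimal weights. The similarity discount is likewise an illustrative choice, and the names `fuse_embeddings` and `uncertainty_aware_similarity` are hypothetical.

```python
import math


def fuse_embeddings(means, variances):
    """Inverse-variance weighted fusion of K diagonal-Gaussian embeddings.

    means, variances: lists of K vectors, each of length D (variances > 0).
    Returns (fused_mean, fused_var, weights), with weights indexed [k][d].
    Assumption: a stand-in for UEC's adaptive weights, not the paper's rule.
    """
    num_models, dim = len(means), len(means[0])
    fused_mean, fused_var = [], []
    weights = [[0.0] * dim for _ in range(num_models)]
    for d in range(dim):
        # Precision = 1 / variance; more certain models get larger weights.
        precisions = [1.0 / variances[k][d] for k in range(num_models)]
        total = sum(precisions)
        for k in range(num_models):
            weights[k][d] = precisions[k] / total
        fused_mean.append(
            sum(weights[k][d] * means[k][d] for k in range(num_models))
        )
        fused_var.append(1.0 / total)
    return fused_mean, fused_var, weights


def uncertainty_aware_similarity(mu_a, var_a, mu_b, var_b):
    """Dot product of the means, discounted as the combined variance grows.

    Assumption: an illustrative discount, not the similarity defined in the paper.
    """
    dot = sum(a * b for a, b in zip(mu_a, mu_b))
    avg_var = sum(va + vb for va, vb in zip(var_a, var_b)) / len(mu_a)
    return dot / math.sqrt(1.0 + avg_var)


# Two hypothetical models embedding the same text in 2 dimensions.
means = [[1.0, 0.0], [0.0, 1.0]]
variances = [[1.0, 4.0], [3.0, 4.0]]
fused_mean, fused_var, weights = fuse_embeddings(means, variances)
```

In this toy input the first model is more certain in dimension 0, so it dominates the fused value there, while both models are equally uncertain in dimension 1 and are averaged evenly.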
