CL, LG | Feb 28, 2025

Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition

arXiv:2502.20726v2 | h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of enhancing contextual information encoding in language models for zero-shot learning, though it appears incremental because it builds on existing pre-trained models without additional training.

The paper tackles the problem of improving the zero-shot performance of pre-trained language models by proposing a novel backward attention mechanism to enhance embeddings, and reports significant improvements on the Chinese Massive Text Embedding Benchmark (C-MTEB).

Language models can be viewed as functions that embed text into Euclidean space, where the quality of the embedding vectors directly determines model performance. Training such neural networks, however, involves various uncertainties. This paper focuses on improving the performance of pre-trained language models in zero-shot settings through a simple and easily implementable method. We propose a novel backward attention mechanism to enhance contextual information encoding. Evaluated on the Chinese Massive Text Embedding Benchmark (C-MTEB), our approach achieves significant improvements across multiple tasks, providing valuable insights for advancing zero-shot learning capabilities.
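
The abstract does not spell out the mechanism, so the following is only a toy illustration of why repeating the input can matter under causal attention. It is not the paper's algorithm. It assumes a single attention head with identity projections, random vectors in place of real token embeddings, and mean pooling over the second copy; the function name `causal_attention` and all variable names are hypothetical.

```python
import numpy as np

def causal_attention(x):
    """Toy single-head causal self-attention with identity Q/K/V projections."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    # True above the diagonal marks future positions, which a causal model cannot see.
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Row-wise softmax; every row keeps its (finite) diagonal entry.
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 stand-in token vectors of dimension 8
n = len(tokens)

# Single pass: the first token can attend only to itself.
_, w_single = causal_attention(tokens)

# Repeated pass: feed the sequence twice, back to back.
repeated = np.concatenate([tokens, tokens], axis=0)
out_rep, w_rep = causal_attention(repeated)

# In the second copy, even the first token has the whole first copy as context.
first_token_second_copy = w_rep[n]

# One possible pooling choice (an assumption): average the second copy only.
embedding = out_rep[n:].mean(axis=0)
```

In the single pass, `w_single[0]` puts all its weight on position 0. In the repeated pass, `first_token_second_copy` spreads weight over every position of the first copy, which is the intuition behind letting later context flow "backward" to earlier tokens without any retraining.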

Code Implementations: 1 repo
