Refine Thought: A Test-Time Inference Method for Embedding Model Reasoning
This work addresses the need for stronger semantic reasoning in text embedding models, particularly on domain-specific tasks. Its contribution is incremental, however, because it builds on existing pretrained models such as Qwen3-Embedding-8B.
The paper proposes RT (Refine Thought), a test-time inference method that runs multiple forward passes of a text embedding model to strengthen its semantic reasoning. The authors report significant improvements on the reasoning tasks in BRIGHT and on PJBenchmark1, with consistent performance on general-purpose tasks such as C-MTEB.
We propose RT (Refine Thought), a method that can enhance the semantic reasoning ability of text embedding models. The method obtains the final semantic representation by running multiple forward passes of the text embedding model. Experiments show that RT achieves significant improvements on semantic reasoning tasks in BRIGHT and the person job matching benchmark PJBenchmark1, while maintaining consistent performance on general-purpose semantic understanding tasks such as C-MTEB. Our results indicate that RT is effective because it further activates the semantic reasoning ability learned during pretraining by decoder-only text embedding models (e.g., Qwen3-Embedding-8B). RT can be seen as a test-time inference method.
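The abstract states only that the final representation comes from multiple forward passes; it does not say how the passes are chained. The sketch below is therefore an illustration under assumptions, not the authors' implementation. It assumes one plausible scheme in which each pass re-encodes the original input together with the embedding produced by the previous pass. The names `toy_encoder`, `refine_embedding`, and `num_passes` are hypothetical, and `toy_encoder` is a stand-in for a real model such as Qwen3-Embedding-8B.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def toy_encoder(token_vectors: Sequence[Sequence[float]]) -> Vector:
    """Stand-in for ONE forward pass of an embedding model (hypothetical).

    Mean-pools the input vectors, applies tanh, and L2-normalizes.
    """
    dim = len(token_vectors[0])
    count = len(token_vectors)
    pooled = [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]
    return l2_normalize([math.tanh(x) for x in pooled])


def refine_embedding(
    encode: Callable[[Sequence[Sequence[float]]], Vector],
    token_vectors: Sequence[Sequence[float]],
    num_passes: int = 3,
) -> Vector:
    """Run `num_passes` forward passes and return the last embedding.

    ASSUMPTION: passes are chained by appending the previous embedding to
    the input as one extra vector. The source does not specify this.
    """
    if num_passes < 1:
        raise ValueError("num_passes must be at least 1")
    embedding = encode(token_vectors)  # pass 1: plain encoding
    for _ in range(num_passes - 1):
        # Later passes see the original input plus the previous embedding.
        embedding = encode(list(token_vectors) + [embedding])
    return embedding


if __name__ == "__main__":
    tokens = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
    print(refine_embedding(toy_encoder, tokens, num_passes=3))
```

With `num_passes=1` the function reduces to a single ordinary encoding, which makes it easy to compare the refined embedding against the baseline on a retrieval benchmark.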