Space Decomposition for Sentence Embedding
This work addresses sentence similarity tasks in NLP, but it appears incremental as it builds on existing STS evaluation methods.
The paper tackles sentence pair similarity by introducing MixSP, a new embedding space decomposition method that reduces the overlap between the representations of upper-range and lower-range classes and outperforms competitors on STS and zero-shot benchmarks.
Determining sentence pair similarity is crucial for various NLP tasks. Techniques for this task are typically evaluated on a continuous semantic textual similarity (STS) scale from 0 to 5. However, based on a linguistic observation in the STS annotation guidelines, we found that a score in the range [4, 5] indicates an upper-range sample, while all other scores indicate lower-range samples. This calls for a new approach that treats the upper-range and lower-range classes separately. In this paper, we introduce a novel embedding space decomposition method called MixSP, which uses a Mixture of Specialized Projectors designed to distinguish and rank upper-range and lower-range samples accurately. The experimental results demonstrate that MixSP significantly decreases the overlap between the representations of the upper-range and lower-range classes while outperforming competitors on STS and zero-shot benchmarks.
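The following is a minimal Python sketch of the two ideas named in the abstract: splitting STS scores into upper-range and lower-range classes, and combining the outputs of several specialized projectors. It is an illustration under stated assumptions, not the MixSP implementation. The only detail taken from the abstract is the [4, 5] boundary; the function names (range_class, mix_projectors), the use of plain linear projectors, and the softmax gate are all hypothetical choices made for this example.

```python
import math

# From the abstract: scores in [4, 5] are upper-range, the rest are lower-range.
UPPER_THRESHOLD = 4.0


def range_class(score):
    """Map an STS score in [0, 5] to 'upper' or 'lower' (hypothetical helper)."""
    if not 0.0 <= score <= 5.0:
        raise ValueError("STS scores lie in [0, 5]")
    return "upper" if score >= UPPER_THRESHOLD else "lower"


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_projectors(embedding, projectors, gate_logits):
    """Weighted sum of projector outputs.

    Assumption: each projector is a linear map, and the mixing weights come
    from a softmax over gate logits. The abstract does not specify either.
    """
    weights = softmax(gate_logits)
    outputs = [matvec(p, embedding) for p in projectors]
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity, a common way to score a pair of sentence embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    upper_projector = [[1.0, 0.0], [0.0, 1.0]]  # toy weights, not learned
    lower_projector = [[2.0, 0.0], [0.0, 2.0]]  # toy weights, not learned
    sentence_a = [1.0, 2.0]
    sentence_b = [2.0, 1.0]
    gate = [0.0, 0.0]  # equal logits give equal mixing weights
    proj_a = mix_projectors(sentence_a, [upper_projector, lower_projector], gate)
    proj_b = mix_projectors(sentence_b, [upper_projector, lower_projector], gate)
    print(range_class(4.2), range_class(2.5))
    print(cosine(proj_a, proj_b))
```

In a trained system the gate logits would presumably come from a learned component that looks at the sentence pair, so that pairs likely to be upper-range lean on one projector and the rest on the other; here they are fixed constants purely to keep the sketch self-contained.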