Advances and Challenges in Semantic Textual Similarity: A Comprehensive Survey
This survey gives researchers and practitioners a comprehensive overview of advances and challenges in STS, although as a survey its contribution is incremental.
This survey reviews progress in Semantic Textual Similarity (STS) since 2021, covering transformer-based models, contrastive learning, and domain-specific techniques. It highlights recent models such as FarSSiBERT and DeBERTa-v3, which have achieved high accuracy.
Semantic Textual Similarity (STS) research has expanded rapidly since 2021, driven by advances in transformer architectures, contrastive learning, and domain-specific techniques. This survey reviews progress across six key areas: transformer-based models, contrastive learning, domain-focused solutions, multi-modal methods, graph-based approaches, and knowledge-enhanced techniques. Recent transformer models such as FarSSiBERT and DeBERTa-v3 have achieved remarkable accuracy, while contrastive methods like AspectCSE have established new benchmarks. Domain-adapted models, including CXR-BERT for medical texts and Financial-STS for finance, demonstrate how STS can be effectively customized for specialized fields. Moreover, multi-modal, graph-based, and knowledge-integrated models further enhance semantic understanding and representation. By organizing and analyzing these developments, the survey provides valuable insights into current methods, practical applications, and remaining challenges. It aims to guide researchers and practitioners alike in navigating rapid advancements, highlighting emerging trends and future opportunities in the evolving field of STS.