CL Feb 16, 2025

Improving Similar Case Retrieval Ranking Performance By Revisiting RankSVM

arXiv:2502.11131v22.7 h-index: 1 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses ranking issues in legal AI for case retrieval, but it is incremental as it applies an existing method (RankSVM) to a specific domain.

The paper aims to improve ranking performance in similar case retrieval for Legal AI by using RankSVM, rather than a fully connected layer, as the classifier on top of language models. It reports improved retrieval performance on the LeCaRDv1 and LeCaRDv2 datasets, along with less overfitting caused by class imbalance.

Given the rapid development of Legal AI, much attention has been paid to one of its most important tasks, similar case retrieval, especially with the use of language models. In our paper, however, we try to improve the ranking performance of current models from the perspective of learning to rank rather than language models. Specifically, we conduct experiments using a pairwise method, RankSVM, as the classifier in place of a fully connected layer, combined with commonly used language models, on the similar case retrieval datasets LeCaRDv1 and LeCaRDv2. We conclude that RankSVM can generally help improve retrieval performance on the LeCaRDv1 and LeCaRDv2 datasets compared with the original classifiers by optimizing the precise ranking. It can also help mitigate overfitting owing to class imbalance. Our code is available at https://github.com/liuyuqi123study/RankSVM_for_SLR
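The pairwise idea behind RankSVM can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes each candidate case is already represented by a fixed feature vector (in the paper these would come from a language model; here they are made-up toy values), and it trains a linear scorer with a hinge loss on within-query pairs using plain subgradient descent. All function names, hyperparameters, and data below are hypothetical.

```python
import random

def pairwise_differences(features, labels, query_ids):
    """Return x_i - x_j for every same-query pair where case i is more relevant than case j."""
    pairs = []
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if query_ids[i] == query_ids[j] and labels[i] > labels[j]:
                pairs.append([a - b for a, b in zip(features[i], features[j])])
    return pairs

def train_ranksvm(features, labels, query_ids, c=1.0, lr=0.01, epochs=200, seed=0):
    """Minimise 0.5*||w||^2 + c * sum(max(0, 1 - w.d)) over pair differences d."""
    rng = random.Random(seed)
    pairs = pairwise_differences(features, labels, query_ids)
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wi * di for wi, di in zip(w, d))
            for k in range(dim):
                # Regulariser gradient is split evenly across the pairs of one epoch.
                grad = w[k] / len(pairs)
                if margin < 1.0:
                    # Hinge term is active: push the score gap of this pair upward.
                    grad -= c * d[k]
                w[k] -= lr * grad
    return w

def score(w, x):
    """Linear relevance score used to sort candidate cases."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data (assumed): two queries, three candidates each, relevance labels 0-2.
features = [[1.0, 0.2], [0.5, 0.9], [0.1, 0.4],
            [0.9, 0.1], [0.4, 0.8], [0.0, 0.3]]
labels = [2, 1, 0, 2, 1, 0]
query_ids = [0, 0, 0, 1, 1, 1]

w = train_ranksvm(features, labels, query_ids)
# Candidates of query 0, sorted from highest to lowest score.
ranked = sorted(range(3), key=lambda i: score(w, features[i]), reverse=True)
```

Because the loss is defined on ordered pairs rather than on individual labels, every relevant case contributes one constraint per less relevant case in the same query. This is one plausible reading of why a pairwise objective could be less sensitive to class imbalance than a pointwise classifier, though the sketch itself does not demonstrate that effect.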

