IR · CL · Mar 31, 2024

CuSINeS: Curriculum-driven Structure Induced Negative Sampling for Statutory Article Retrieval

arXiv:2404.00590v183 citations · h-index: 13 · LREC
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of retrieving statutory articles for legal professionals. It represents an incremental improvement in negative sampling methods for a specific domain.

The paper tackles Statutory Article Retrieval (SAR) by introducing CuSINeS, a negative sampling approach that uses curriculum-based strategies and structural information to improve retrieval performance. Experiments on a real-world dataset show that it is effective across multiple baselines.

In this paper, we introduce CuSINeS, a negative sampling approach to enhance the performance of Statutory Article Retrieval (SAR). CuSINeS offers three key contributions. First, it employs a curriculum-based negative sampling strategy that guides the model to focus on easier negatives initially and progressively tackle more difficult ones. Second, it leverages the hierarchical and sequential information derived from the structural organization of statutes to evaluate the difficulty of samples. Lastly, it introduces a dynamic semantic difficulty assessment that uses the model being trained itself, surpassing conventional static methods like BM25 and adapting the negatives to the model's evolving competence. Experimental results on a real-world expert-annotated SAR dataset validate the effectiveness of CuSINeS across four different baselines, demonstrating its versatility.
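
The abstract gives no formulas, so the following is only a minimal sketch of how curriculum-driven negative sampling of this kind could look, not the authors' implementation. Every name here (`structural_difficulty`, `competence`, `sample_negatives`), the linear schedule, the mixing weight `alpha`, and the use of a shared hierarchy-path prefix as the structural signal are assumptions made for illustration. Sequential information is not modelled, and the semantic scores are passed in as a plain dictionary, whereas in training they would presumably be recomputed from the current model.

```python
import random
from typing import Dict, List, Optional, Sequence


def structural_difficulty(pos_path: Sequence[str], neg_path: Sequence[str]) -> float:
    """Hypothetical structural difficulty in [0, 1]: the share of the statute
    hierarchy path (e.g. book > title > chapter) that a negative article has
    in common with the positive article. Closer in the tree = harder."""
    shared = 0
    for a, b in zip(pos_path, neg_path):
        if a != b:
            break
        shared += 1
    return shared / max(len(pos_path), len(neg_path), 1)


def competence(step: int, total_steps: int, start: float = 0.2) -> float:
    """Assumed linear competence schedule rising from `start` to 1.0."""
    frac = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (1.0 - start) * frac


def sample_negatives(
    candidates: Dict[str, Sequence[str]],   # article id -> hierarchy path
    pos_path: Sequence[str],                # hierarchy path of the positive
    semantic_scores: Dict[str, float],      # article id -> similarity in [0, 1]
    step: int,
    total_steps: int,
    k: int,
    alpha: float = 0.5,                     # assumed structural/semantic mix
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Sample k negatives from the easiest fraction of candidates, where the
    fraction grows with training progress (easy first, hard later)."""
    rng = rng or random.Random(0)
    scored = []
    for art_id, path in candidates.items():
        difficulty = (alpha * structural_difficulty(pos_path, path)
                      + (1.0 - alpha) * semantic_scores[art_id])
        scored.append((difficulty, art_id))
    scored.sort()  # easiest candidates first
    cutoff = max(k, int(round(competence(step, total_steps) * len(scored))))
    pool = [art_id for _, art_id in scored[:cutoff]]
    return rng.sample(pool, min(k, len(pool)))
```

Early in training the pool contains only the easiest candidates (structurally distant, low similarity); by the final step every candidate is eligible, including siblings of the positive article within the same chapter.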

