Annotation and Classification of Relevant Clauses in Terms-and-Conditions Contracts
This work addresses legal experts' need to quickly assess issues in Terms-and-Conditions contracts, but its contribution is incremental: it applies existing NLP methods to a new domain-specific dataset.
The paper tackles the problem of identifying problematic clauses in Terms-and-Conditions contracts by developing an annotation scheme with 14 categories, for which it reports an inter-annotator agreement of 0.92. It then demonstrates the feasibility of automatic classification for 11 of those categories, with accuracies ranging from 0.79 to 0.95 on validation tasks, using few-shot prompting with a multilingual T5 and fine-tuned BERT-based models for Italian.
In this paper, we propose a new annotation scheme for classifying different types of clauses in Terms-and-Conditions contracts, with the ultimate goal of helping legal experts quickly identify and assess problematic issues in this type of legal document. To this end, we built a small corpus of Terms-and-Conditions contracts and finalized an annotation scheme of 14 categories, eventually reaching an inter-annotator agreement of 0.92. Then, for 11 of these categories, we experimented with binary classification tasks using few-shot prompting with a multilingual T5 and fine-tuned versions of two BERT-based LLMs for Italian. Our experiments showed the feasibility of automatically classifying our categories, reaching accuracies ranging from 0.79 to 0.95 on validation tasks.
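To make the two reported quantities concrete, the following is a minimal sketch of how an agreement score and a per-category binary accuracy could be computed. It rests on several assumptions that the abstract does not confirm: that agreement is measured with Cohen's kappa between two annotators, that each clause receives exactly one category, and that each binary task is a one-vs-rest reading of the annotation. The category names and the labels in the example are hypothetical, not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same clauses.

    Assumption: the abstract does not say which agreement metric was used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set")
    n = len(labels_a)
    # Observed agreement: share of clauses given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def binarize(labels, target):
    """One-vs-rest view of the annotation: 1 if the clause has `target`."""
    return [1 if label == target else 0 for label in labels]


def accuracy(gold, predicted):
    """Share of clauses where the prediction matches the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be the same non-empty length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical category names and labels, for illustration only.
annotator_a = ["liability", "termination", "liability", "other"]
annotator_b = ["liability", "termination", "other", "other"]

kappa = cohen_kappa(annotator_a, annotator_b)

# One binary task per category; here the hypothetical "liability" category.
gold = binarize(annotator_a, "liability")
predicted = [1, 0, 0, 0]  # hypothetical classifier output
acc = accuracy(gold, predicted)
```

Under this one-vs-rest reading, each of the 11 categories would get its own `binarize` call and its own accuracy figure, which is one way a range of accuracies rather than a single number could arise.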