Towards Unsupervised Question Answering System with Multi-level Summarization for Legal Text
This work addresses the challenge of classifying complex legal text for legal AI applications, though the contribution appears incremental because it builds on existing methods such as BERT and T5.
The paper tackles legal argument reasoning classification by proposing an unsupervised similarity-based approach with multi-level feature fusion and T5-based summarization, and reports a 20-point increase in macro F1-score on the development set and a 10-point increase on the test set.
This paper summarizes Team SCaLAR's work on SemEval-2024 Task 5: Legal Argument Reasoning in Civil Procedure. To address this binary classification task, which is daunting because of the complexity of the legal texts involved, we propose a simple yet novel similarity- and distance-based unsupervised approach to generate labels. Further, we explore multi-level fusion of Legal-BERT embeddings using ensemble features, including CNN, GRU, and LSTM. To address the lengthy nature of the legal explanations in the dataset, we introduce T5-based segment-wise summarization, which retains crucial information and improves the model's performance. Our unsupervised system achieved a 20-point increase in macro F1-score on the development set and a 10-point increase on the test set, which is promising given its uncomplicated architecture.
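The similarity-based labeling idea can be illustrated with a minimal sketch. It is not the system's implementation: a bag-of-words vector stands in for the Legal-BERT embeddings, and the function names, the example texts, and the 0.3 threshold are all hypothetical choices made for illustration. The sketch only shows the general shape of the technique, in which a candidate answer is labeled correct when it is sufficiently similar to the explanation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lower-cased word counts.

    Assumption: a real system would use dense embeddings from a language
    model here; word counts keep the sketch dependency-free.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_label(explanation, answer, threshold=0.3):
    """Return (label, score); label is 1 when the score reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    score = cosine_similarity(embed(explanation), embed(answer))
    return int(score >= threshold), score


explanation = (
    "The court lacks personal jurisdiction because the defendant "
    "has no contacts with the forum state."
)
answer_a = "The court lacks personal jurisdiction over the defendant."
answer_b = "Venue is proper wherever any plaintiff resides."

label_a, score_a = predict_label(explanation, answer_a)
label_b, score_b = predict_label(explanation, answer_b)
print(label_a, round(score_a, 3))
print(label_b, round(score_b, 3))
```

Because no labeled data is used to fit anything, the only tunable quantity in this sketch is the threshold, which is consistent with the uncomplicated architecture the abstract describes.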