ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks
This paper addresses temporal generalization challenges in legal text classification, offering incremental improvements in handling evolving legal concepts.
This study tackles legal multi-label text classification in which concepts evolve over time by introducing ChronosLex, an incremental training paradigm that trains on chronological splits. Results show that continual learning methods enhance temporal generalizability, while temporal invariant methods struggle with temporal shifts.
This study investigates the challenges posed by the dynamic nature of legal multi-label text classification tasks, where legal concepts evolve over time. Existing models often overlook the temporal dimension during training: they treat the training data as a single homogeneous block, which leads to suboptimal performance over time. To address this, we introduce ChronosLex, an incremental training paradigm that trains models on chronological splits, preserving the temporal order of the data. However, this incremental approach raises concerns about overfitting to recent data, prompting an assessment of mitigation strategies using continual learning and temporal invariant methods. Our experimental results on six legal multi-label text classification datasets reveal that continual learning methods are effective in preventing overfitting, thereby enhancing temporal generalizability, while temporal invariant methods struggle to capture the dynamics of temporal shifts.
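The idea of training on chronological splits can be illustrated with a minimal sketch. It is not the authors' implementation. The record schema `(year, text, labels)`, the function names `chronological_splits` and `incremental_train`, the choice of near-equal split sizes, and the `train_step` callback are all assumptions made for illustration; the abstract does not specify how splits are formed. The continual learning and temporal invariant mitigation strategies are not shown.

```python
from typing import Callable, List, Sequence, Tuple

# Each record is (year, text, labels); this schema is an assumption for illustration.
Record = Tuple[int, str, List[str]]


def chronological_splits(records: Sequence[Record], n_splits: int) -> List[List[Record]]:
    """Sort records by year and cut them into n_splits contiguous, near-equal chunks."""
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    ordered = sorted(records, key=lambda r: r[0])
    base, extra = divmod(len(ordered), n_splits)
    splits, start = [], 0
    for i in range(n_splits):
        # The first `extra` splits take one additional record each.
        size = base + (1 if i < extra else 0)
        splits.append(ordered[start:start + size])
        start += size
    return splits


def incremental_train(
    splits: List[List[Record]],
    train_step: Callable[[List[Record]], None],
) -> List[int]:
    """Call train_step once per split, oldest first, and return the visit order."""
    visited = []
    for index, split in enumerate(splits):
        train_step(split)  # any model state inside train_step carries over between calls
        visited.append(index)
    return visited


# Toy usage: the "training step" only records which years it has seen.
records = [
    (2012, "doc c", ["tax"]),
    (2001, "doc a", ["trade"]),
    (2019, "doc e", ["privacy"]),
    (2007, "doc b", ["trade", "tax"]),
    (2015, "doc d", ["privacy"]),
]
seen_years: List[int] = []
splits = chronological_splits(records, 3)
order = incremental_train(splits, lambda split: seen_years.extend(r[0] for r in split))
```

In a real setting, `train_step` would fine-tune the same classifier on each split in turn, so the last update always comes from the most recent data. That ordering is what raises the concern about overfitting to recent data described in the abstract.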