An Evaluation Framework for Legal Document Summarization
This work addresses a practical problem for law practitioners, who need efficient summaries of lengthy legal documents, though the contribution is incremental because it builds on existing summarization evaluation methods.
The authors tackle the lack of intent-based evaluation metrics for legal document summarization and propose an automated intent-based metric that agrees better with human satisfaction than existing metrics such as BLEU and ROUGE-L.
A law practitioner has to go through numerous lengthy legal case proceedings across various categories of practice, such as land disputes and corruption. Hence, it is important to summarize these documents and to ensure that the summaries contain phrases whose intent matches the category of the case. To the best of our knowledge, no existing evaluation metric assesses a summary based on its intent. We propose an automated intent-based summarization metric that agrees better with human evaluation, in terms of human satisfaction, than other automated metrics such as BLEU and ROUGE-L. We also curate a dataset by annotating intent phrases in legal documents and show a proof of concept of how this system can be automated. Additionally, all the code and data needed to reproduce the results are available on GitHub.
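To make the idea of scoring a summary by its intent more concrete, the following is a minimal illustrative sketch, not the metric proposed in the paper. It assumes that a case comes with a list of annotated intent phrases and scores a summary by the fraction of those phrases it contains verbatim. The function name `intent_coverage`, the exact-substring matching rule, and the example phrases are all hypothetical choices made for illustration.

```python
def intent_coverage(summary: str, intent_phrases: list[str]) -> float:
    """Toy score: fraction of annotated intent phrases found verbatim in the summary.

    Assumption for illustration only: the actual metric may match phrases
    semantically rather than by exact, case-insensitive substring.
    """
    if not intent_phrases:
        return 0.0

    def normalize(text: str) -> str:
        # Lowercase and collapse whitespace so formatting does not affect matching.
        return " ".join(text.lower().split())

    normalized_summary = normalize(summary)
    hits = sum(1 for phrase in intent_phrases if normalize(phrase) in normalized_summary)
    return hits / len(intent_phrases)


# Hypothetical intent phrases for a land dispute case.
phrases = [
    "illegally occupied the land",
    "forged sale deed",
    "demanded a bribe",
]
summary = "The respondent illegally occupied the land using a forged sale deed."
score = intent_coverage(summary, phrases)  # two of the three phrases are present
```

A coverage-style score such as this rewards summaries that retain intent-bearing phrases regardless of overall n-gram overlap, which is the kind of behavior that overlap metrics like BLEU and ROUGE-L do not directly measure.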