Hierarchical RNN for Information Extraction from Lawsuit Documents
This work addresses the challenge of automating legal document analysis for better case understanding and prediction, but it is incremental as it applies an existing hierarchical RNN method to a new dataset.
The paper tackles the extraction of key information from civil lawsuit documents in China, which is difficult because the language is complicated and sentence lengths vary widely. It frames the task as sequence labeling with a hierarchical RNN framework and presents itself as the first such research in this domain.
Every lawsuit document contains information about the parties' claims, the court's analysis, the decision, and other matters, and all of this information helps in understanding the case and in predicting the judge's decision on similar cases in the future. However, extracting this information from the document is difficult because the language is complicated and sentences vary widely in length. We treat this problem as a sequence labeling task, and this paper presents the first research on extracting relevant information from civil lawsuit documents in China with a hierarchical RNN framework.
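To make the sequence labeling formulation concrete, the following is a minimal sketch of a two-level (hierarchical) RNN tagger, not the paper's implementation. Several details are assumptions, since the abstract does not state them: labels are assigned per sentence, the label set `LABELS` is hypothetical, both levels use plain tanh (Elman) cells, and the weights are random and untrained. The class name `HierarchicalTagger` and the English placeholder tokens are illustrative only.

```python
import math
import random

# Hypothetical label set, loosely following the document parts named above.
LABELS = ["claim", "analysis", "decision", "other"]


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """Elman update: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(p + q) for p, q in zip(a, b)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class HierarchicalTagger:
    """Word-level RNN encodes each sentence; sentence-level RNN labels them."""

    def __init__(self, vocab, emb_dim=8, hid_dim=6, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        self.emb = {w: [rng.uniform(-0.1, 0.1) for _ in range(emb_dim)]
                    for w in vocab}
        self.unk = [0.0] * emb_dim  # embedding for out-of-vocabulary tokens
        self.word_in = make_matrix(hid_dim, emb_dim, rng)
        self.word_rec = make_matrix(hid_dim, hid_dim, rng)
        self.sent_in = make_matrix(hid_dim, hid_dim, rng)
        self.sent_rec = make_matrix(hid_dim, hid_dim, rng)
        self.out = make_matrix(len(LABELS), hid_dim, rng)

    def encode_sentence(self, tokens):
        """Lower level: the final hidden state summarizes the sentence."""
        h = [0.0] * self.hid_dim
        for token in tokens:
            h = rnn_step(self.word_in, self.word_rec,
                         self.emb.get(token, self.unk), h)
        return h

    def label_distributions(self, document):
        """Upper level: one label distribution per sentence, in order."""
        h = [0.0] * self.hid_dim
        dists = []
        for sentence in document:
            h = rnn_step(self.sent_in, self.sent_rec,
                         self.encode_sentence(sentence), h)
            dists.append(softmax(matvec(self.out, h)))
        return dists

    def predict(self, document):
        return [LABELS[max(range(len(LABELS)), key=d.__getitem__)]
                for d in self.label_distributions(document)]


# Toy document: a list of tokenized sentences (placeholder tokens).
doc = [
    ["plaintiff", "requests", "repayment"],
    ["the", "court", "finds", "the", "contract", "valid"],
    ["defendant", "shall", "repay"],
]
vocab = sorted({token for sentence in doc for token in sentence})
tagger = HierarchicalTagger(vocab)
predictions = tagger.predict(doc)  # untrained weights, so labels are arbitrary
print(predictions)
```

Because the sentence-level state is carried across the document, each label can depend on the preceding sentences, and the word-level pass handles sentences of any length. A practical system would replace the random weights with trained parameters and would likely use gated cells such as LSTM or GRU.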