A Multi-Source Heterogeneous Knowledge Injected Prompt Learning Method for Legal Charge Prediction
This paper addresses the problem of accurate legal charge prediction for legal AI applications and represents an incremental improvement over existing neural network methods.
The paper tackles legal charge prediction by proposing a prompt learning method that incorporates multi-source external knowledge from legal knowledge bases, conversational LLMs, and legal articles, achieving state-of-the-art results on the CAIL-2018 dataset with lower data dependency.
Legal charge prediction, an essential task in legal AI, seeks to assign accurate charge labels to case descriptions and has attracted significant recent interest. Existing methods primarily use diverse neural network structures to model case descriptions directly, and fail to leverage multi-source external knowledge effectively. We propose a method based on the prompt learning framework that simultaneously leverages multi-source heterogeneous external knowledge from a legal knowledge base, a conversational LLM, and related legal articles. Specifically, we use the legal knowledge base to match knowledge snippets in case descriptions and encapsulate them into the input through a hard prompt template. Additionally, we retrieve legal articles related to a given case description through contrastive learning, and then obtain factual elements within the case description through a conversational LLM. We fuse the embedding vectors of soft prompt tokens with the encoding vector of the factual elements to achieve knowledge-enhanced forward inference. Experimental results show that our method achieves state-of-the-art results on CAIL-2018, the largest legal charge prediction dataset, and has lower data dependency. Case studies also demonstrate our method's strong interpretability.
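
To make the hard-prompt step concrete, the following is a minimal sketch of how matched knowledge snippets might be encapsulated into the model input. It is an illustration under stated assumptions, not the paper's implementation: the toy knowledge base `TOY_KB`, the substring matching rule, the template wording, and the function names `match_snippets` and `build_hard_prompt` are all hypothetical, since the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical toy knowledge base (assumption): trigger phrase -> knowledge snippet.
# A real legal knowledge base would be far larger and matched more robustly.
TOY_KB: Dict[str, str] = {
    "broke into": "unlawful entry",
    "took cash": "taking property of another",
    "knife": "use of a weapon",
}


def match_snippets(case: str, kb: Dict[str, str]) -> List[str]:
    """Return snippets whose trigger phrase occurs in the case description.

    Plain case-insensitive substring matching is an assumption made for brevity.
    """
    text = case.lower()
    return [snippet for phrase, snippet in kb.items() if phrase in text]


def build_hard_prompt(case: str, kb: Dict[str, str], mask_token: str = "[MASK]") -> str:
    """Wrap the case and its matched snippets in a fixed (hard) template.

    The template wording is hypothetical; the mask slot is where a masked
    language model would predict the charge label.
    """
    snippets = match_snippets(case, kb)
    knowledge = "; ".join(snippets) if snippets else "none"
    return f"Case: {case} Related knowledge: {knowledge}. The charge is {mask_token}."


if __name__ == "__main__":
    print(build_hard_prompt("The defendant broke into a shop and took cash.", TOY_KB))
```

In a full pipeline, the resulting string would be tokenized and passed to a pretrained language model, with the soft prompt tokens and the factual-element encoding added at the embedding level; that fusion step is not sketched here because the abstract does not state how it is performed.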