StateCensusLaws.org: A Web Application for Consuming and Annotating Legal Discourse Learning
This tool helps journalists and legal interpreters analyze census-related laws, but its contribution is incremental: it applies existing NLP methods to a specific domain.
The authors developed a web application that displays the output of NLP models that parse and label discourse in state-level U.S. laws related to the census. They also release a corpus of 6,000 laws and a flexible annotation framework for user contributions.
In this work, we create a web application that highlights the output of NLP models trained to parse and label discourse segments in law text. Our system is built primarily with journalists and legal interpreters in mind, and we focus on state-level law that uses U.S. Census population numbers to allocate resources and organize government. Our system exposes a corpus of 6,000 state-level laws that pertain to the U.S. census, which we collected using 25 scrapers we built to crawl state law websites, and which we release. We also build a novel, flexible annotation framework that can handle span tagging and relation tagging on an arbitrary input text document and can be embedded simply into any webpage. This framework allows journalists and researchers to add to our annotation database by correcting and tagging new data.
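To make the span-tagging and relation-tagging idea concrete, the following is a minimal sketch of how such annotations might be represented over an arbitrary text document. It is not the authors' implementation: the class names (Span, Relation, AnnotatedDocument), the labels ("TEST", "CONSEQUENCE", "triggers"), and the example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A labeled character range in the document (hypothetical schema)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


@dataclass(frozen=True)
class Relation:
    """A labeled, directed link between two spans, referenced by index."""
    head: int
    tail: int
    label: str


@dataclass
class AnnotatedDocument:
    text: str
    spans: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def add_span(self, start: int, end: int, label: str) -> int:
        # Reject empty or out-of-range spans so offsets always index the text.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError(f"invalid span offsets: {start}-{end}")
        self.spans.append(Span(start, end, label))
        return len(self.spans) - 1

    def add_relation(self, head: int, tail: int, label: str) -> None:
        # A relation may only connect spans that already exist.
        for index in (head, tail):
            if not 0 <= index < len(self.spans):
                raise ValueError(f"unknown span index: {index}")
        self.relations.append(Relation(head, tail, label))

    def span_text(self, index: int) -> str:
        span = self.spans[index]
        return self.text[span.start:span.end]


# Illustrative sentence and labels; none of these come from the released corpus.
law = "Any county with a population over 100,000 shall establish a board."
condition = "a population over 100,000"
consequence = "shall establish a board"

doc = AnnotatedDocument(law)
cond_id = doc.add_span(law.index(condition),
                       law.index(condition) + len(condition), "TEST")
cons_id = doc.add_span(law.index(consequence),
                       law.index(consequence) + len(consequence), "CONSEQUENCE")
doc.add_relation(cond_id, cons_id, "triggers")
```

Storing spans as character offsets rather than copied strings keeps annotations valid for any input text, and referencing spans by index lets a relation connect any two tagged segments. A correction from a user would, under this sketch, amount to replacing a span or relation entry.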