BiographyNet: Extracting Relations Between People and Events
This project aims to improve the search and presentation of biographical data for professional historians. Its contribution is incremental, since it builds on existing datasets and methods.
The paper describes BiographyNet, a digital humanities project that set out to support historical research by extracting relations between people and events from Dutch biographical datasets. The result is a user-evaluated demonstrator with an NLP pipeline applied to approximately 125,000 biographies covering around 76,000 individuals.
This paper describes BiographyNet, a digital humanities project (2012-2016) that brings together researchers from history, computational linguistics, and computer science. The project uses data from the Biography Portal of the Netherlands (BPN), which contains approximately 125,000 biographies of around 76,000 individuals, drawn from a variety of Dutch biographical dictionaries dating from the eighteenth century to the present. BiographyNet's aim is to strengthen the value of the portal and comparable biographical datasets for historical research by improving the search options and the presentation of search results, using a historically justified NLP pipeline that works through a user-evaluated demonstrator. The project's main target group is professional historians. The project therefore worked with two key concepts: "provenance", understood as a term allowing for both historical source criticism and references to data-management and programming interventions in digitized sources; and "perspective", interpreted as the inherent uncertainty concerning the interpretation of historical results.