CL · Jan 25, 2021

Process-Level Representation of Scientific Protocols with Interactive Annotation

arXiv:2101.10244v2807 citations
Originality: Incremental advance
AI Analysis

This work addresses automated understanding of scientific protocols, a problem of interest to researchers in biochemistry and computational linguistics. It is rated incremental because it builds on existing annotation and modeling approaches.

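The paper tackles the challenge of representing complex biochemistry protocols by introducing Process Execution Graphs (PEGs), which are designed to handle cross-sentence relations, long-range coreference, grounding, and implicit arguments. It finds that graph-prediction models perform well on entity identification and local relation extraction, while long-range relations remain a harder problem that the corpus is meant to help explore.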

We develop Process Execution Graphs (PEG), a document-level representation of real-world wet lab biochemistry protocols, addressing challenges such as cross-sentence relations, long-range coreference, grounding, and implicit arguments. We manually annotate PEGs in a corpus of complex lab protocols with a novel interactive textual simulator that keeps track of entity traits and semantic constraints during annotation. We use this data to develop graph-prediction models, finding them to be good at entity identification and local relation extraction, while our corpus facilitates further exploration of challenging long-range relations.
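
To make the idea of a document-level graph concrete, here is a minimal sketch of how operations and entities from different sentences might be linked in one structure. It is not the paper's schema or code: the class names (`Node`, `ProtocolGraph`), the node kinds, and the relation labels (`acts_on`, `uses`, `acts_on_output_of`) are all hypothetical, chosen only to illustrate a cross-sentence relation with an implicit argument.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str       # "operation" or "entity" (hypothetical labels)
    text: str       # surface text of the mention
    sentence: int   # index of the sentence containing the mention


@dataclass
class ProtocolGraph:
    """Toy document-level graph; an assumption, not the paper's PEG format."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_edge(self, source, label, target):
        # Refuse edges whose endpoints are unknown, so the graph stays consistent.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source, label, target))

    def cross_sentence_edges(self):
        # Edges whose endpoints were mentioned in different sentences.
        return [
            (s, label, t)
            for s, label, t in self.edges
            if self.nodes[s].sentence != self.nodes[t].sentence
        ]


def build_example():
    # Invented two-sentence protocol:
    #   "Centrifuge the cells. Resuspend in buffer."
    g = ProtocolGraph()
    g.add_node(Node("e1", "entity", "the cells", 0))
    g.add_node(Node("o1", "operation", "Centrifuge", 0))
    g.add_node(Node("o2", "operation", "Resuspend", 1))
    g.add_node(Node("e2", "entity", "buffer", 1))
    g.add_edge("o1", "acts_on", "e1")
    g.add_edge("o2", "uses", "e2")
    # "Resuspend" never names its object; it implicitly acts on the
    # result of the previous step, which sits in another sentence.
    g.add_edge("o2", "acts_on_output_of", "o1")
    return g
```

Calling `build_example().cross_sentence_edges()` returns only the edge between the two operations, which is the kind of relation a sentence-level representation would miss.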

Code Implementations: 2 repos
