An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols
This work addresses the need for automated processing of instructional texts in biological research. Its contribution is incremental, since it centers on corpus creation rather than on the development of a new method.
The authors address the problem of converting natural language instructions in wet lab protocols into a machine-readable format by annotating a corpus of 622 protocols, and they show experimentally that the corpus is useful for developing machine learning approaches to shallow semantic parsing.
We describe an effort to annotate a corpus of natural language instructions consisting of 622 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a machine-readable format and benefit biological research. Experimental results demonstrate the utility of our corpus for developing machine learning approaches to shallow semantic parsing of instructional texts. We make our annotated Wet Lab Protocol Corpus available to the research community.
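To make "shallow semantic parsing" and "machine-readable format" concrete, the following minimal sketch groups token-level tags into labeled spans for one instruction. It is an illustration only: the BIO tagging scheme, the label names (Action, Amount, Reagent), the example sentence, and the names Span and bio_to_spans are all assumptions made here, not the corpus's actual annotation schema or file format.

```python
from dataclasses import dataclass


@dataclass
class Span:
    label: str  # hypothetical label name, e.g. "Action" or "Reagent"
    text: str   # surface text covered by the span


def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into labeled spans (assumed tagging scheme)."""
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        # An I- tag with the same label extends the open span.
        if tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_label is not None:
            spans.append(Span(current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
        # B- starts a new span; a stray I- is treated as a span start too.
        if tag.startswith("B-") or tag.startswith("I-"):
            current_label, current_tokens = tag[2:], [token]
    # Flush a span that runs to the end of the sentence.
    if current_label is not None:
        spans.append(Span(current_label, " ".join(current_tokens)))
    return spans


# Invented example instruction with invented tags.
tokens = ["Add", "5", "ml", "of", "buffer", "."]
tags = ["B-Action", "B-Amount", "I-Amount", "O", "B-Reagent", "O"]
spans = bio_to_spans(tokens, tags)
```

In this sketch, a sequence tagger trained on an annotated corpus would supply the tags, and the resulting list of spans is one possible structured view of the instruction that downstream software could consume.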