Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data
This addresses the problem of shallow procedure modeling for AI systems needing hierarchical reasoning, though it is incremental as it builds on existing semi-structured data.
The paper tackled constructing an open-domain hierarchical knowledge-base of procedures from wikiHow data to model parent-child relations, achieving significant outperformance over baselines in automatic evaluation, human judgment, and downstream tasks like instructional video retrieval.
Procedures are inherently hierarchical. To "make videos", one may need to "purchase a camera", which in turn may require one to "set a budget". While such hierarchical knowledge is critical for reasoning about complex procedures, most existing work has treated procedures as shallow structures without modeling the parent-child relation. In this work, we attempt to construct an open-domain hierarchical knowledge-base (KB) of procedures based on wikiHow, a website containing more than 110k instructional articles, each documenting the steps to carry out a complex procedure. To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB. Our method significantly outperforms several strong baselines according to automatic evaluation, human judgment, and application to downstream tasks such as instructional video retrieval. A demo with partial data can be found at https://wikihow-hierarchy.github.io. The code and the data are at https://github.com/shuyanzhou/wikihow_hierarchy.