When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications
This work helps NLP practitioners choose appropriate OpenIE systems for downstream applications. Its contribution is incremental, however, because it offers a comparative analysis rather than new methods.
The paper addresses the lack of consensus on which OpenIE models to use for different NLP tasks through an empirical survey of neural models, training sets, and benchmarks. It finds that the assumptions made by models and datasets have a statistically significant effect on performance, and it demonstrates its recommendations on a downstream Complex QA application.
Open Information Extraction (OpenIE) has been used in the pipelines of various NLP tasks. Unfortunately, there is no clear consensus on which models to use in which tasks. Muddying things further is the lack of comparisons that take differing training sets into account. In this paper, we present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications. We find that the different assumptions made by different models and datasets have a statistically significant effect on performance, making it important to choose the most appropriate model for one's applications. We demonstrate the applicability of our recommendations on a downstream Complex QA application.