CL · Oct 31, 2018

Extracting Linguistic Resources from the Web for Concept-to-Text Generation

arXiv:1810.13414v1 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the tedious and costly manual construction of linguistic resources for concept-to-text generation systems. It offers a semi-automatic alternative, an incremental advance that reduces the authoring effort required of developers of natural language generation systems.

The paper tackles the problem of manually constructing domain-specific linguistic resources for concept-to-text generation by proposing methods to extract sentence plans and natural language names from the Web, specifically for the NaturalOWL generator. Experiments show that texts generated with these semi-automatically extracted resources are perceived as almost as good as texts generated with manually authored resources, and much better than texts generated with resources derived from the ontology's relation and entity identifiers.

Many concept-to-text generation systems require domain-specific linguistic resources to produce high quality texts, but manually constructing these resources can be tedious and costly. Focusing on NaturalOWL, a publicly available state of the art natural language generator for OWL ontologies, we propose methods to extract from the Web sentence plans and natural language names, two of the most important types of domain-specific linguistic resources used by the generator. Experiments show that texts generated using linguistic resources extracted by our methods in a semi-automatic manner, with minimal human involvement, are perceived as being almost as good as texts generated using manually authored linguistic resources, and much better than texts produced by using linguistic resources extracted from the relation and entity identifiers of the ontology.

