Suo Feng

2papers

2 Papers

CLOct 19, 2020
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning

Ye Liu, Sheng Zhang, Rui Song et al.

Open attribute value extraction for emerging entities is an important but challenging task. A lot of previous works formulate the problem as a \textit{question-answering} (QA) task. While the collections of articles from web corpus provide updated information about the emerging entities, the retrieved texts can be noisy, irrelevant, thus leading to inaccurate answers. Effectively filtering out noisy articles as well as bad answers is the key to improving extraction accuracy. Knowledge graph (KG), which contains rich, well organized information about entities, provides a good resource to address the challenge. In this work, we propose a knowledge-guided reinforcement learning (RL) framework for open attribute value extraction. Informed by relevant knowledge in KG, we trained a deep Q-network to sequentially compare extracted answers to improve extraction accuracy. The proposed framework is applicable to different information extraction system. Our experimental results show that our method outperforms the baselines by 16.5 - 27.8\%.

CLMay 29, 2019
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation

Jiangjie Chen, Ao Wang, Haiyun Jiang et al.

A type description is a succinct noun compound which helps human and machines to quickly grasp the informative and distinctive information of an entity. Entities in most knowledge graphs (KGs) still lack such descriptions, thus calling for automatic methods to supplement such information. However, existing generative methods either overlook the grammatical structure or make factual mistakes in generated texts. To solve these problems, we propose a head-modifier template-based method to ensure the readability and data fidelity of generated type descriptions. We also propose a new dataset and two automatic metrics for this task. Experiments show that our method improves substantially compared with baselines and achieves state-of-the-art performance on both datasets.