CL · Jun 15, 2020

Extracting N-ary Cross-sentence Relations using Constrained Subsequence Kernel

Sachin Pawar, Pushpak Bhattacharyya, Girish K. Palshikar

arXiv:2006.08185v1 0.2

Originality: Incremental advance

AI Analysis

This work addresses a limitation of relation extraction in tasks that require multi-sentence, multi-argument analysis, such as those in biomedical literature, but it is incremental because it builds on existing kernel methods.

The paper tackles the extraction of n-ary cross-sentence relations, which may span multiple sentences and involve more than two arguments. It proposes a novel sequence representation and several classifiers built on it, including an SVM with a Constrained Subsequence Kernel. It evaluates the approach on three datasets across the biomedical and general domains; the results are described as competitive, though the abstract provides no specific numbers.

Most past work in relation extraction deals with relations that occur within a sentence and have only two entity arguments. We propose a new formulation of the relation extraction task in which the relations are more general than intra-sentence relations, in the sense that they may span multiple sentences and may have more than two arguments. Moreover, the relations are more specific than corpus-level relations, in the sense that their scope is limited to a single document and they are not valid globally throughout the corpus. We propose a novel sequence representation to characterize instances of such relations. We then explore various classifiers whose features are derived from this sequence representation. For the SVM classifier, we design a Constrained Subsequence Kernel, which is a variant of the Generalized Subsequence Kernel. We evaluate our approach on three datasets across two domains: biomedical and general.
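
The abstract does not spell out the Constrained Subsequence Kernel or its constraints, so the following is only a rough illustration of the family it belongs to. It sketches the classic gap-weighted subsequence kernel over token sequences, which is neither the paper's kernel nor the Generalized Subsequence Kernel. The function names, the decay parameter `lam`, and the example tokens are all assumptions made for illustration.

```python
import math


def subsequence_kernel(s, t, n, lam=0.5):
    """Gap-weighted subsequence kernel of order n between token lists s and t.

    Each common subsequence of n tokens contributes lam ** (span in s + span in t),
    so matches spread over wide gaps count less. This is the classic formulation,
    NOT the paper's Constrained Subsequence Kernel.
    """
    ls, lt = len(s), len(t)
    if n < 1 or min(ls, lt) < n:
        return 0.0
    # kp[i][a][b] holds the auxiliary value K'_i for prefixes s[:a] and t[:b].
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running K''_i for the current prefix of s
            for b in range(i, lt + 1):
                kpp *= lam
                if s[a - 1] == t[b - 1]:
                    kpp += lam * lam * kp[i - 1][a - 1][b - 1]
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    total = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_kernel(s, t, n, lam=0.5):
    """Cosine-style normalization so that a sequence scores 1.0 against itself."""
    denom = math.sqrt(subsequence_kernel(s, s, n, lam) * subsequence_kernel(t, t, n, lam))
    return subsequence_kernel(s, t, n, lam) / denom if denom > 0 else 0.0


# Hypothetical token sequences standing in for two relation instances.
seq_a = ["DRUG", "inhibits", "GENE", "in", "MUTATION"]
seq_b = ["DRUG", "strongly", "inhibits", "GENE"]
similarity = normalized_kernel(seq_a, seq_b, n=2)
```

A matrix of such pairwise values could in principle be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; whether the authors trained their classifier this way is not stated in the abstract.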

