CL | Feb 20, 2013

Probabilistic Frame Induction

arXiv:1302.4813v1 | 117 citations
Originality: Highly original
AI Analysis

This paper addresses the need for automated frame induction in natural language processing, which benefits tasks such as information extraction and natural language generation. It builds on prior frame-induction methods and contributes a novel probabilistic formulation.

The paper tackles the problem of automatically identifying frames (sets of related events with prototypical participants and event transitions) in natural-language discourse, a task that is usually done manually. It proposes the first probabilistic approach to frame induction, which models frames, events, and participants as latent topics and infers the number of frames with a split-merge method adapted from syntactic parsing. In end-to-end evaluations, the method produced state-of-the-art results while substantially reducing engineering effort.

In natural-language discourse, related events tend to appear near each other to describe a larger scenario. Such structures can be formalized by the notion of a frame (a.k.a. template), which comprises a set of related events, prototypical participants, and event transitions. Identifying frames is a prerequisite for information extraction and natural language generation, and is usually done manually. Methods for inducing frames have been proposed recently, but they typically use ad hoc procedures and are difficult to diagnose or extend. In this paper, we propose the first probabilistic approach to frame induction, which incorporates frames, events, and participants as latent topics and learns those frame and event transitions that best explain the text. The number of frames is inferred by a novel application of a split-merge method from syntactic parsing. In end-to-end evaluations from text to induced frames and extracted facts, our method produced state-of-the-art results while substantially reducing engineering effort.
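
As a rough illustration of the latent-variable view described in the abstract, the toy sketch below generates a sequence of clauses from a hidden chain of frames: each clause has a latent frame, the frame selects a latent event, and the event selects an observed word. All frame names, event names, words, and probabilities are invented for this example. The sketch leaves out participants, parameter learning, and the split-merge step, and it should not be read as the paper's actual model.

```python
import random

# Hypothetical toy parameters, invented for illustration only.
# P(next frame | current frame): frames tend to persist across clauses.
FRAME_TRANSITIONS = {
    "attack": {"attack": 0.8, "arrest": 0.2},
    "arrest": {"attack": 0.1, "arrest": 0.9},
}
# P(event | frame): each frame prefers its own set of events.
EVENTS_BY_FRAME = {
    "attack": {"bombing": 0.6, "damage": 0.4},
    "arrest": {"detain": 0.7, "charge": 0.3},
}
# P(word | event): each event emits an observed head word.
WORDS_BY_EVENT = {
    "bombing": {"bomb": 0.5, "explode": 0.5},
    "damage": {"destroy": 0.6, "damage": 0.4},
    "detain": {"arrest": 0.7, "capture": 0.3},
    "charge": {"accuse": 0.5, "charge": 0.5},
}


def sample(dist, rng):
    """Draw one key from a {label: probability} dictionary."""
    labels = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(labels, weights=weights, k=1)[0]


def generate_document(num_clauses, rng, start_frame="attack"):
    """Generate (frame, event, word) triples from the toy latent chain."""
    frame = start_frame
    clauses = []
    for _ in range(num_clauses):
        event = sample(EVENTS_BY_FRAME[frame], rng)  # latent event
        word = sample(WORDS_BY_EVENT[event], rng)    # observed word
        clauses.append((frame, event, word))
        frame = sample(FRAME_TRANSITIONS[frame], rng)  # move to next frame
    return clauses


doc = generate_document(5, random.Random(0))
for clause in doc:
    print(clause)
```

In an induction setting only the words would be observed, and the frame and event assignments, together with the probability tables, would have to be inferred from text rather than written by hand as they are here.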

