CL · Jul 28, 2013

Learning Frames from Text with an Unsupervised Latent Variable Model

arXiv:1307.7382v1 · 3 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automatically extracting event structures from corpora for natural language processing applications, but it is incremental as it builds on existing probabilistic methods.

The authors tackle the problem of discovering semantic frames from text with an unsupervised latent variable model. Compared against FrameNet, the resulting model learns some frames that FrameNet does not contain.

We develop a probabilistic latent-variable model to discover semantic frames (types of events and their participants) from corpora. We present a Dirichlet-multinomial model in which frames are latent categories that explain the linking of verb-subject-object triples, given document-level sparsity. We analyze what the model learns and compare it to FrameNet, noting that it learns some novel and interesting frames. This document also contains a discussion of inference issues, including concentration parameter learning, and a small-scale error analysis of syntactic parsing accuracy.
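
To make the abstract's description concrete, here is a minimal sketch of one generative story consistent with it. It is an illustration under stated assumptions, not the paper's exact model. It assumes each document has its own distribution over frames drawn from a sparse symmetric Dirichlet, and that each frame has one multinomial per role (verb, subject, object). The function names, the `alpha` and `beta` values, and the toy vocabulary are all hypothetical.

```python
import random

ROLES = ("verb", "subject", "object")

def sample_dirichlet(rng, concentration, size):
    """Draw from a symmetric Dirichlet by normalizing Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow at tiny concentrations
        return [1.0 / size] * size
    return [d / total for d in draws]

def sample_index(rng, probs):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(num_docs, triples_per_doc, num_frames, vocab,
                    alpha=0.1, beta=0.1, seed=0):
    """Generate (frame, (verb, subject, object)) pairs per document.

    Assumption: a small `alpha` makes each document favor few frames,
    which is one way to express document-level sparsity.
    """
    rng = random.Random(seed)
    # One word distribution per (role, frame); this layout is hypothetical.
    phi = {
        role: [sample_dirichlet(rng, beta, len(vocab[role]))
               for _ in range(num_frames)]
        for role in ROLES
    }
    corpus = []
    for _ in range(num_docs):
        theta = sample_dirichlet(rng, alpha, num_frames)  # document's frame mix
        doc = []
        for _ in range(triples_per_doc):
            frame = sample_index(rng, theta)  # latent frame for this triple
            triple = tuple(
                vocab[role][sample_index(rng, phi[role][frame])]
                for role in ROLES
            )
            doc.append((frame, triple))
        corpus.append(doc)
    return corpus

# Toy vocabulary, invented for illustration only.
TOY_VOCAB = {
    "verb": ["arrest", "elect", "buy"],
    "subject": ["police", "voters", "company"],
    "object": ["suspect", "president", "shares"],
}
```

Calling `generate_corpus(3, 4, 2, TOY_VOCAB)` returns three documents of four triples each, with every triple tagged by the frame that generated it. Unsupervised learning would run this story in reverse: the frame assignments are hidden, and inference (for example, Gibbs sampling) recovers them from the observed triples alone.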
