CL · LG · ML · Sep 28, 2017

Structured Embedding Models for Grouped Data

arXiv:1709.10367v1 · 37 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for group-specific analysis in embedding models, a need shared by researchers and practitioners in fields such as political science, text mining, and retail analytics. It is an incremental advance over prior exponential family embeddings.

The authors tackle the problem of discovering embeddings that vary across related groups of data, such as word usage in political speeches or shopping patterns across seasons. They develop structured exponential family embeddings (S-EFE), in which groups share statistical information through one of two strategies: hierarchical modeling or amortization. The result is group-specific interpretation of usage patterns and better held-out prediction than EFE.

Word embeddings are a powerful approach for analyzing language, and exponential family embeddings (EFE) extend them to other types of data. Here we develop structured exponential family embeddings (S-EFE), a method for discovering embeddings that vary across related groups of data. We study how the word usage of U.S. Congressional speeches varies across states and party affiliation, how words are used differently across sections of the ArXiv, and how the co-purchase patterns of groceries can vary across seasons. Key to the success of our method is that the groups share statistical information. We develop two sharing strategies: hierarchical modeling and amortization. We demonstrate the benefits of this approach in empirical studies of speeches, abstracts, and shopping baskets. We show how S-EFE enables group-specific interpretation of word usage, and outperforms EFE in predicting held-out data.
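
The two sharing strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that hierarchical modeling means each group's embeddings are drawn from a Gaussian centred on a shared global embedding, and that amortization means each group's embeddings are produced by passing the global embedding through a small group-specific network. The function names, the toy dimensions, the `tanh` nonlinearity, and the weight shapes are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, K, S, H = 5, 3, 2, 4  # toy sizes: items, embedding dim, groups, hidden units

# Shared global embedding, one row per item (assumed to be shared by all groups).
rho_global = rng.normal(size=(V, K))


def hierarchical_embeddings(rho_global, num_groups, sigma, rng):
    """Assumed hierarchical sharing: group embeddings are Gaussian
    perturbations of the global embeddings with standard deviation sigma."""
    noise = rng.normal(scale=sigma, size=(num_groups,) + rho_global.shape)
    return rho_global[None, :, :] + noise  # shape (S, V, K)


def amortized_embeddings(rho_global, weights_in, weights_out):
    """Assumed amortized sharing: each group maps the global embeddings
    through its own one-hidden-layer network.

    weights_in has shape (S, K, H); weights_out has shape (S, H, K).
    """
    hidden = np.tanh(np.einsum("vk,skh->svh", rho_global, weights_in))
    return np.einsum("svh,shk->svk", hidden, weights_out)  # shape (S, V, K)


# Usage: build group-specific embeddings under each strategy.
rho_hier = hierarchical_embeddings(rho_global, S, sigma=0.1, rng=rng)
w_in = rng.normal(size=(S, K, H))
w_out = rng.normal(size=(S, H, K))
rho_amort = amortized_embeddings(rho_global, w_in, w_out)
```

In this sketch the hierarchical variant stores a separate embedding table per group, while the amortized variant stores only the per-group network weights, so its parameter count does not grow with the number of items per group.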
