CL · Apr 9, 2019

Mixing syntagmatic and paradigmatic information for concept detection

arXiv:1904.04461v20.22 citations

Originality: Incremental advance

AI Analysis

This paper addresses a bottleneck in empirical philosophy by improving concept detection methods, though the advance is incremental because it builds on existing topic models.

The paper tackles the problem of automatically detecting concepts in textual data for corpus-based conceptual analysis by combining syntagmatic and paradigmatic information, resulting in significant performance increases and more flexible concept expression using word vectors.

In the last decades, philosophers have begun using empirical data for conceptual analysis, but corpus-based conceptual analysis has so far failed to develop, in part because of the absence of reliable methods to automatically detect concepts in textual data. Previous attempts have shown that topic models can constitute efficient concept detection heuristics, but while they leverage the syntagmatic relations in a corpus, they fail to exploit paradigmatic relations, and thus probably fail to model concepts accurately. In this article, we show that using a topic model that models concepts on a space of word embeddings (Hu and Tsujii, 2016) can lead to significant increases in concept detection performance, as well as enable the target concept to be expressed in more flexible ways using word vectors.
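The idea of expressing a target concept "using word vectors" can be illustrated with a small sketch. This is not the model of Hu and Tsujii (2016) and not the authors' pipeline. It only assumes that a concept query and each topic can be summarized as vectors in the same embedding space, and it ranks topics by cosine similarity to the query. The embeddings, topic vectors, and all names (`EMBEDDINGS`, `TOPICS`, `rank_topics`) are hypothetical values invented for illustration.

```python
import math

# Toy 3-dimensional word embeddings (invented values, illustration only).
EMBEDDINGS = {
    "mind": [0.9, 0.1, 0.0],
    "thought": [0.8, 0.2, 0.1],
    "river": [0.0, 0.9, 0.2],
    "water": [0.1, 0.8, 0.3],
}

# Hypothetical topic vectors assumed to live in the same space as the words.
TOPICS = {
    "topic_cognition": [0.85, 0.15, 0.05],
    "topic_hydrology": [0.05, 0.85, 0.25],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def concept_vector(words):
    """Express a target concept as the mean of its word vectors."""
    vectors = [EMBEDDINGS[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rank_topics(words):
    """Rank topics by similarity to the concept query, most similar first."""
    query = concept_vector(words)
    scores = {name: cosine(query, vec) for name, vec in TOPICS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# The concept is given by several words rather than one fixed keyword.
ranking = rank_topics(["mind", "thought"])
```

Because the query is an average of word vectors, the same concept could be expressed with different or additional words without changing the ranking code, which is the kind of flexibility the abstract points to.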
