CL · CO · Jun 2, 2012

Automated Word Puzzle Generation via Topic Dictionaries

arXiv:1206.0377v1 · 6 citations
Originality: Incremental advance
AI Analysis

This paper aims to reduce the human annotation effort needed to generate word puzzles for educators and learners. The advance is incremental: it builds on existing topic modeling and semantic similarity techniques.

The paper proposes a method for automated word puzzle generation that needs only an unstructured corpus, a topic model, and a semantic similarity measure. It generates several puzzle types with parameterizable difficulty, can produce domain-specific puzzles, and yields a large number of proper puzzles automatically.

We propose a general method for automated word puzzle generation. Contrary to previous approaches in this novel field, the presented method does not rely on highly structured datasets obtained with serious human annotation effort: it only needs an unstructured and unannotated corpus (i.e., a document collection) as input. The method builds upon two additional pillars: (i) a topic model, which induces a topic dictionary from the input corpus (examples include latent semantic analysis, group-structured dictionaries, and latent Dirichlet allocation), and (ii) a semantic similarity measure of word pairs. Our method can (i) automatically generate a large number of proper word puzzles of different types, including the odd one out, choose the related word, and separate the topics puzzles; (ii) easily create domain-specific puzzles by replacing the corpus component; and (iii) automatically generate puzzles with parameterizable levels of difficulty suitable for, e.g., beginners or intermediate learners.
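To make the abstract's ingredients concrete, here is a minimal Python sketch of an odd one out generator. It is an illustration under stated assumptions, not the paper's algorithm: the topic dictionary `TOPICS`, the word vectors `VECTORS`, cosine similarity as the word-pair measure, and the thresholds `min_inner` and `max_outer` are all hypothetical stand-ins. In a real pipeline the dictionary would be induced from a corpus by a topic model.

```python
import math
from itertools import combinations

# Hypothetical toy inputs. A real topic dictionary would come from a topic
# model run on a corpus; these hand-made vectors stand in for a similarity model.
TOPICS = {
    "animals": ["cat", "dog", "horse", "wolf"],
    "vehicles": ["car", "truck", "bus", "wagon"],
}
VECTORS = {
    "cat": (1.0, 0.1), "dog": (0.9, 0.2), "horse": (0.8, 0.5), "wolf": (1.0, 0.0),
    "car": (0.1, 1.0), "truck": (0.0, 1.0), "bus": (0.2, 0.9), "wagon": (0.5, 0.8),
}


def similarity(a, b):
    """Cosine similarity of two words (an assumed choice of measure)."""
    va, vb = VECTORS[a], VECTORS[b]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(y * y for y in vb))
    return dot / (norm_a * norm_b)


def odd_one_out(topics, topic, size=3, min_inner=0.8, max_outer=0.5):
    """Return (consistent_words, intruder), or None if no puzzle qualifies.

    The consistent words come from one topic and are pairwise similar
    (>= min_inner). The intruder comes from another topic and is dissimilar
    to every consistent word (<= max_outer). Raising max_outer admits
    intruders closer to the group, which makes the puzzle harder.
    """
    for group in combinations(topics[topic], size):
        if not all(similarity(a, b) >= min_inner for a, b in combinations(group, 2)):
            continue
        for other, words in topics.items():
            if other == topic:
                continue
            for word in words:
                if all(similarity(word, g) <= max_outer for g in group):
                    return list(group), word
    return None


puzzle = odd_one_out(TOPICS, "animals")
if puzzle is not None:
    group, intruder = puzzle
    print("Which word does not belong?", sorted(group + [intruder]))
```

Swapping `TOPICS` and `VECTORS` for data derived from a different corpus is what would make the puzzles domain-specific in this sketch, and the two thresholds play the role of the difficulty parameters.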
