CL · Dec 31, 2024

"Dialogue" vs "Dialog" in NLP and AI research: Statistics from a Confused Discourse

arXiv:2501.00598v1
Originality: Synthesis-oriented
AI Analysis

This paper addresses a minor terminological confusion among NLP/AI researchers, but the contribution is incremental: it analyzes an existing phenomenon without resolving it.

The paper examines the inconsistent spelling of "dialogue" vs. "dialog" in NLP and AI research by analyzing thousands of papers. It finds that 72% use "dialogue", 24% use "dialog", and 5% use both, with no clear shift over time.

Within computing research, there are two spellings for an increasingly important term: dialogue and dialog. We analyze thousands of research papers to understand this "dialog(ue) debacle". Among publications in top venues that use "dialog(ue)" in the title or abstract, 72% use "dialogue", 24% use "dialog", and 5% use both in the same title and abstract. This split distribution is more common in Computing than in any other academic discipline. We investigate trends over ~20 years of NLP/AI research and find no clear evidence of a shift over time. Author nationality is weakly correlated with spelling choice, but is far from explaining the mixed use. Many prolific authors publish papers with both spellings. We use several methods (such as syntactic parses and LM embeddings) to study how dialog(ue) context influences spelling, finding limited influence. Combining these results, we discuss different theories that might explain the dialog(ue) divergence.
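
The headline percentages come from labeling each paper by which spelling appears in its title and abstract. The sketch below illustrates one way such a count could be made; it is not the authors' pipeline. The function names `classify_spelling` and `spelling_shares` and the regular expression are assumptions introduced here, and the matching rule (whole words only, plurals included, case-insensitive) is a guess at a reasonable definition.

```python
import re
from collections import Counter

# Hypothetical matching rule: whole-word "dialog" or "dialogue", optional plural "s".
# Group 1 captures the "ue" suffix when present.
_PATTERN = re.compile(r"\bdialog(ue)?s?\b", re.IGNORECASE)


def classify_spelling(text):
    """Label a title+abstract string as 'dialogue', 'dialog', 'both', or 'neither'."""
    found = set()
    for match in _PATTERN.finditer(text):
        found.add("dialogue" if match.group(1) else "dialog")
    if not found:
        return "neither"
    if len(found) == 2:
        return "both"
    return next(iter(found))


def spelling_shares(texts):
    """Return the share of each label among texts that use either spelling."""
    counts = Counter(classify_spelling(t) for t in texts)
    counts.pop("neither", None)  # only papers that mention the term are counted
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("dialogue", "dialog", "both")}
```

Calling `spelling_shares` on a list of concatenated title and abstract strings returns three fractions comparable in form to the 72% / 24% / 5% split reported above. The result depends on the matching rule chosen here.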
