CL · Mar 11, 2018

Generating Bilingual Pragmatic Color References

Will Monroe, Jennifer Hu, Andrew Jong, Christopher Potts

arXiv:1803.03917 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses cross-lingual pragmatic language generation for AI systems. It is an incremental advance: it builds on existing methods and contributes a new dataset.

The study first confirms that Mandarin Chinese and English color references show the same sensitivity to contextual difficulty. It then shows that a neural speaker trained on bilingual data with multitask learning displays more human-like patterns of context dependence and is more pragmatically informative than a monolingual Chinese model.

Contextual influences on language often exhibit substantial cross-lingual regularities; for example, we are more verbose in situations that require finer distinctions. However, these regularities are sometimes obscured by semantic and syntactic differences. Using a newly-collected dataset of color reference games in Mandarin Chinese (which we release to the public), we confirm that a variety of constructions display the same sensitivity to contextual difficulty in Chinese and English. We then show that a neural speaker agent trained on bilingual data with a simple multitask learning approach displays more human-like patterns of context dependence and is more pragmatically informative than its monolingual Chinese counterpart. Moreover, this is not at the expense of language-specific semantic understanding: the resulting speaker model learns the different basic color term systems of English and Chinese (with noteworthy cross-lingual influences), and it can identify synonyms between the two languages using vector analogy operations on its output layer, despite having no exposure to parallel data.
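The vector analogy operation mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the four-dimensional toy vectors, the `TOY` table, and the `analogy` helper are hypothetical and are not taken from the paper's model or released code. The sketch only shows the general technique of finding the word whose vector is closest, by cosine similarity, to `b - a + c`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(vectors, a, b, c):
    """Return the word d maximizing cos(d, b - a + c), excluding a, b, and c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy vectors: three "color" dimensions plus one "language" dimension.
# A trained speaker's output-layer rows would take the place of these values.
TOY = {
    "red":   [1.0, 0.0, 0.0, 1.0],
    "green": [0.0, 1.0, 0.0, 1.0],
    "blue":  [0.0, 0.0, 1.0, 1.0],
    "红":    [1.0, 0.0, 0.0, -1.0],
    "绿":    [0.0, 1.0, 0.0, -1.0],
    "蓝":    [0.0, 0.0, 1.0, -1.0],
}

# "blue" is to "蓝" as "green" is to ...?
print(analogy(TOY, "blue", "蓝", "green"))  # prints 绿
```

In this toy setup the language difference is a single shared offset, so the analogy recovers the translation exactly. With learned vectors the match would only be approximate, and how well it works is an empirical question that the paper addresses.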
