CL · LO · Aug 11, 2022

Language-independence of DisCoCirc's Text Circuits: English and Urdu

arXiv:2208.10281v1 · 5 citations · h-index: 48
Originality: Synthesis-oriented
AI Analysis

This work addresses language independence in computational linguistics. Its contribution is incremental: it builds on prior DisCoCirc developments and covers only restricted fragments of English and Urdu.

The paper tackles grammatical differences between languages by applying the DisCoCirc framework to English and Urdu, showing that differences in word and phrase ordering vanish in the resulting circuits.

DisCoCirc is a newly proposed framework for representing the grammar and semantics of texts using compositional, generative circuits. While it constitutes a development of the Categorical Distributional Compositional (DisCoCat) framework, it exposes radically new features. In particular, [14] suggested that DisCoCirc goes some way toward eliminating grammatical differences between languages. In this paper we sketch an argument that this is indeed the case for restricted fragments of English and Urdu. We first develop DisCoCirc for a fragment of Urdu, as was done for English in [14]. There is a simple translation from English grammar to Urdu grammar, and vice versa. We then show that differences in grammatical structure between English and Urdu, primarily relating to the ordering of words and phrases, vanish when passing to DisCoCirc circuits.
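The claim that ordering differences vanish can be illustrated with a toy sketch. This is not the paper's construction. It assumes a hypothetical one-rule grammar for three-token transitive sentences and a made-up `to_circuit` helper, and it uses English glosses in verb-final order as a stand-in for Urdu's subject-object-verb ordering. A "circuit" here is just a list of gates, each recording a label and the noun wires it acts on in role order.

```python
from typing import List, Tuple

# Hypothetical toy types, not taken from the paper or any library.
Token = Tuple[str, str]             # (word, category): "N" noun, "TV" transitive verb
Gate = Tuple[str, Tuple[str, ...]]  # (gate label, wires in role order: subject, object)


def to_circuit(sentence: List[Token], verb_position: str) -> List[Gate]:
    """Map a three-token transitive sentence to a one-gate circuit.

    verb_position is "medial" for subject-verb-object order and
    "final" for subject-object-verb order.
    """
    cats = [cat for _, cat in sentence]
    if verb_position == "medial" and cats == ["N", "TV", "N"]:
        (subj, _), (verb, _), (obj, _) = sentence
    elif verb_position == "final" and cats == ["N", "N", "TV"]:
        (subj, _), (obj, _), (verb, _) = sentence
    else:
        raise ValueError("sentence does not match the toy grammar")
    # The gate stores roles, not surface positions, so word order is forgotten.
    return [(verb, (subj, obj))]


english = [("Alice", "N"), ("likes", "TV"), ("Bob", "N")]
# Assumption: English glosses arranged verb-finally, standing in for Urdu order.
sov_gloss = [("Alice", "N"), ("Bob", "N"), ("likes", "TV")]

circuit_en = to_circuit(english, "medial")
circuit_sov = to_circuit(sov_gloss, "final")
print(circuit_en == circuit_sov)  # True
```

Both inputs yield the same single gate acting on the same two wires. In this toy setting that is the sense in which the surface ordering is discarded once the roles are fixed.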

