CL | May 26, 2021

TexRel: a Green Family of Datasets for Emergent Communications on Relations

arXiv:2105.12804v1 | 2 citations
Originality: Synthesis-oriented
AI Analysis

TexRel gives researchers in emergent communications a more efficient and realistic playground, though the contribution is incremental: it builds on existing datasets and methods.

The authors introduce TexRel, a new dataset for studying emergent communications on relations. Compared with other relations datasets it offers rapid training and experimentation, and compared with symbolic inputs it is more realistic. The paper also compares TexRel with Shapeworld and reports baseline results, including the effect of multitask learning on accuracy and compositionality metrics.

We propose a new dataset TexRel as a playground for the study of emergent communications, in particular for relations. By comparison with other relations datasets, TexRel provides rapid training and experimentation, whilst being sufficiently large to avoid overfitting in the context of emergent communications. By comparison with using symbolic inputs, TexRel provides a more realistic alternative whilst remaining efficient and fast to learn. We compare the performance of TexRel with a related relations dataset Shapeworld. We provide baseline performance results on TexRel for sender architectures, receiver architectures and end-to-end architectures. We examine the effect of multitask learning in the context of shapes, colors and relations on accuracy, topological similarity and clustering precision. We investigate whether increasing the size of the latent meaning space improves metrics of compositionality. We carry out a case-study on using TexRel to reproduce the results of an experiment in a recent paper that used symbolic inputs, but using our own non-symbolic inputs, from TexRel, instead.
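Topological similarity, one of the compositionality metrics named in the abstract, is commonly computed as the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. The sketch below illustrates that common definition only; the choice of Hamming distance for meanings and Levenshtein distance for messages, and all function names, are assumptions for illustration and are not taken from the TexRel paper or its repository.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def topological_similarity(meanings, messages):
    """Spearman correlation between meaning and message pairwise distances.

    Assumption: meanings are equal-length attribute tuples and messages
    are symbol sequences; other distance functions are equally plausible.
    """
    pairs = list(combinations(range(len(meanings)), 2))
    meaning_d = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    message_d = [edit_distance(messages[i], messages[j]) for i, j in pairs]
    return pearson(average_ranks(meaning_d), average_ranks(message_d))


# Toy example: a perfectly compositional language over (shape, color).
toy_meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
toy_messages = ["aa", "ab", "ba", "bb"]
toy_rho = topological_similarity(toy_meanings, toy_messages)
```

In the toy language each attribute maps to its own symbol position, so the two sets of distances coincide and the correlation is at its maximum of 1 (up to floating-point rounding). A language that assigns messages arbitrarily would typically score much lower.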

Code Implementations: 1 repo
