Learning Functional Distributional Semantics with Visual Data
This work grounds distributional semantics in visual data to improve interpretability in natural language processing; it is an incremental advance within that specific domain.
The authors address the problem of learning distributional semantics with linguistic interpretability by training a Functional Distributional Semantics model on visual data from Visual Genome. On four external evaluation datasets, the model outperforms previous work on learning semantics from Visual Genome.
Functional Distributional Semantics is a recently proposed framework for learning distributional semantics that provides linguistic interpretability. It models the meaning of a word as a binary classifier rather than a numerical vector. In this work, we propose a method to train a Functional Distributional Semantics model with grounded visual data. We train it on the Visual Genome dataset, which is closer to the kind of data encountered in human language acquisition than a large text corpus. On four external evaluation datasets, our model outperforms previous work on learning semantics from Visual Genome.
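To make the contrast between "a word as a binary classifier" and "a word as a numerical vector" concrete, the following is a minimal illustrative sketch, not the model described in this work. It assumes a logistic-linear classifier over a small hand-written feature vector that stands in for a visually grounded entity representation. The class name `WordClassifier`, the toy weights, and the feature dimensions are all hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


class WordClassifier:
    """Hypothetical stand-in for a word meaning: a binary classifier over entities.

    Assumption: a logistic-linear form is used only for illustration; the
    classifier architecture used in the paper is not specified in this section.
    """

    def __init__(self, weights, bias):
        self.weights = list(weights)
        self.bias = bias

    def truth_probability(self, entity):
        """Probability that the word is true of the given entity representation."""
        if len(entity) != len(self.weights):
            raise ValueError("entity and weights must have the same dimension")
        z = sum(w * x for w, x in zip(self.weights, entity)) + self.bias
        return sigmoid(z)

    def is_true_of(self, entity, threshold=0.5):
        """Binary decision: does the word apply to this entity?"""
        return self.truth_probability(entity) >= threshold


# Toy parameters (invented for illustration, not learned from Visual Genome).
dog = WordClassifier(weights=[2.0, -1.0, 0.5], bias=-0.5)
animal = WordClassifier(weights=[1.0, 1.0, 0.0], bias=-0.5)

# Two toy entity representations, standing in for features of image regions.
entity_a = [1.0, 0.0, 1.0]
entity_b = [0.0, 1.0, 0.0]

for name, word in [("dog", dog), ("animal", animal)]:
    print(name, word.is_true_of(entity_a), word.is_true_of(entity_b))
```

In this sketch a word is a function applied to an entity and returning a truth value, rather than a point in a vector space. In the toy setup, both words are true of `entity_a`, while only the broader word is true of `entity_b`.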