Martha Lewis

CL
h-index47
34papers
3,950citations
Novelty34%
AI Score54

34 Papers

CLJun 9, 2022
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao et al. · allen-ai, amazon-science

Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.

CVDec 20, 2022
Does CLIP Bind Concepts? Probing Compositionality in Large Image Models

Martha Lewis, Nihal V. Nayak, Peilin Yu et al.

Large-scale neural network models combining text and images have made incredible progress in recent years. However, it remains an open question to what extent such models encode compositional representations of the concepts over which they operate, such as correctly identifying "red cube" by reasoning over the constituents "red" and "cube". In this work, we focus on the ability of a large pretrained vision and language model (CLIP) to encode compositional concepts and to bind variables in a structure-sensitive way (e.g., differentiating "cube behind sphere" from "sphere behind cube"). To inspect the performance of CLIP, we compare several architectures from research on compositional distributional semantics models (CDSMs), a line of research that attempts to implement traditional compositional linguistic structures within embedding spaces. We benchmark them on three synthetic datasets - single-object, two-object, and relational - designed to test concept binding. We find that CLIP can compose concepts in a single-object setting, but in situations where concept binding is needed, performance drops dramatically. At the same time, CDSMs also perform poorly, with best performance at chance level.

LGOct 31, 2023
EXTRACT: Explainable Transparent Control of Bias in Embeddings

Zhijin Guo, Zhaozhen Xu, Martha Lewis et al.

Knowledge Graphs are a widely used method to represent relations between entities in various AI applications, and Graph Embedding has rapidly become a standard technique to represent Knowledge Graphs in such a way as to facilitate inferences and decisions. As this representation is obtained from behavioural data, and is not in a form readable by humans, there is a concern that it might incorporate unintended information that could lead to biases. We propose EXTRACT: a suite of Explainable and Transparent methods to ConTrol bias in knowledge graph embeddings, so as to assess and decrease the implicit presence of protected information. Our method uses Canonical Correlation Analysis (CCA) to investigate the presence, extent and origins of information leaks during training, then decomposes embeddings into a sum of their private attributes by solving a linear system. Our experiments, performed on the MovieLens1M dataset, show that a range of personal attributes can be inferred from a user's viewing behaviour and preferences, including gender, age, and occupation. Further experiments, performed on the KG20C citation dataset, show that the information about the conference in which a paper was published can be inferred from the citation network of that article. We propose four transparent methods to maintain the capability of the embedding to make the intended predictions without retaining unwanted information. A trade-off between these two goals is observed.

LGNov 18, 2023
Compositional Fusion of Signals in Data Embedding

Zhijin Guo, Zhaozhen Xu, Martha Lewis et al.

Embeddings in AI convert symbolic structures into fixed-dimensional vectors, effectively fusing multiple signals. However, the nature of this fusion in real-world data is often unclear. To address this, we introduce two methods: (1) Correlation-based Fusion Detection, measuring correlation between known attributes and embeddings, and (2) Additive Fusion Detection, viewing embeddings as sums of individual vectors representing attributes. Applying these methods, word embeddings were found to combine semantic and morphological signals. BERT sentence embeddings were decomposed into individual word vectors of subject, verb and object. In the knowledge graph-based recommender system, user embeddings, even without training on demographic data, exhibited signals of demographics like age and gender. This study highlights that embeddings are fusions of multiple signals, from Word2Vec components to demographic hints in graph embeddings.

CLMar 18, 2024Code
Metaphor Understanding Challenge Dataset for LLMs

Xiaoyu Tong, Rochelle Choenni, Martha Lewis et al.

Metaphors in natural language are a reflection of fundamental cognitive processes such as analogical reasoning and categorisation, and are deeply rooted in everyday communication. Metaphor understanding is therefore an essential task for large language models (LLMs). We release the Metaphor Understanding Challenge Dataset (MUNCH), designed to evaluate the metaphor understanding capabilities of LLMs. The dataset provides over 10k paraphrases for sentences containing metaphor use, as well as 1.5k instances containing inapt paraphrases. The inapt paraphrases were carefully selected to serve as control to determine whether the model indeed performs full metaphor interpretation or rather resorts to lexical similarity. All apt and inapt paraphrases were manually annotated. The metaphorical sentences cover natural metaphor uses across 4 genres (academic, news, fiction, and conversation), and they exhibit different levels of novelty. Experiments with LLaMA and GPT-3.5 demonstrate that MUNCH presents a challenging task for LLMs. The dataset is freely accessible at https://github.com/xiaoyuisrain/metaphor-understanding-challenge.

LGApr 7
Transformer See, Transformer Do: Copying as an Intermediate Step in Learning Analogical Reasoning

Philipp Hellwig, Willem Zuidema, Claire E. Stevenson et al.

Analogical reasoning is a hallmark of human intelligence, enabling us to solve new problems by transferring knowledge from one situation to another. Yet, developing artificial intelligence systems capable of robust human-like analogical reasoning has proven difficult. In this work, we train transformers using Meta-Learning for Compositionality (MLC) on an analogical reasoning task (letter-string analogies) and assess their generalization capabilities. We find that letter-string analogies become learnable when guiding the models to attend to the most informative problem elements induced by including copying tasks in the training data. Furthermore, generalization to new alphabets becomes better when models are trained with more heterogeneous datasets, where our 3-layer encoder-decoder model outperforms most frontier models. The MLC approach also enables some generalization to compositions of trained transformations, but not to completely novel transformations. To understand how the model operates, we identify an algorithm that approximates the model's computations. We verify this using interpretability analyses and show that the model can be steered precisely according to expectations derived from the algorithm. Finally, we discuss implications of our findings for generalization capabilities of larger models and parallels to human analogical reasoning.

CLSep 14, 2025Code
Quantifying Compositionality of Classic and State-of-the-Art Embeddings

Zhijin Guo, Chenhao Xue, Zhaozhen Xu et al.

For language models to generalize correctly to novel expressions, it is critical that they exploit access compositional meanings when this is justified. Even if we don't know what a "pelp" is, we can use our knowledge of numbers to understand that "ten pelps" makes more pelps than "two pelps". Static word embeddings such as Word2vec made strong, indeed excessive, claims about compositionality. The SOTA generative, transformer models and graph models, however, go too far in the other direction by providing no real limits on shifts in meaning due to context. To quantify the additive compositionality, we formalize a two-step, generalized evaluation that (i) measures the linearity between known entity attributes and their embeddings via canonical correlation analysis, and (ii) evaluates additive generalization by reconstructing embeddings for unseen attribute combinations and checking reconstruction metrics such as L2 loss, cosine similarity, and retrieval accuracy. These metrics also capture failure cases where linear composition breaks down. Sentences, knowledge graphs, and word embeddings are evaluated and tracked the compositionality across all layers and training stages. Stronger compositional signals are observed in later training stages across data modalities, and in deeper layers of the transformer-based model before a decline at the top layer. Code is available at https://github.com/Zhijin-Guo1/quantifying-compositionality.

CVAug 28, 2025Code
Evaluating Compositional Generalisation in VLMs and Diffusion Models

Beth Pearson, Bilal Boulbarss, Michael Wray et al.

A fundamental aspect of the semantics of natural language is that novel meanings can be formed from the composition of previously known parts. Vision-language models (VLMs) have made significant progress in recent years, however, there is evidence that they are unable to perform this kind of composition. For example, given an image of a red cube and a blue cylinder, a VLM such as CLIP is likely to incorrectly label the image as a red cylinder or a blue cube, indicating it represents the image as a `bag-of-words' and fails to capture compositional semantics. Diffusion models have recently gained significant attention for their impressive generative abilities, and zero-shot classifiers based on diffusion models have been shown to perform competitively with CLIP in certain compositional tasks. In this work we explore whether the generative Diffusion Classifier has improved compositional generalisation abilities compared to discriminative models. We assess three models -- Diffusion Classifier, CLIP, and ViLT -- on their ability to bind objects with attributes and relations in both zero-shot learning (ZSL) and generalised zero-shot learning (GZSL) settings. Our results show that the Diffusion Classifier and ViLT perform well at concept binding tasks, but that all models struggle significantly with the relational GZSL task, underscoring the broader challenges VLMs face with relational reasoning. Analysis of CLIP embeddings suggests that the difficulty may stem from overly similar representations of relational concepts such as left and right. Code and dataset are available at: https://github.com/otmive/diffusion_classifier_clip

CLApr 3, 2025Code
Hummus: A Dataset of Humorous Multimodal Metaphor Use

Xiaoyu Tong, Zhi Zhang, Martha Lewis et al.

Metaphor and humor share a lot of common ground, and metaphor is one of the most common humorous mechanisms. This study focuses on the humorous capacity of multimodal metaphors, which has not received due attention in the community. We take inspiration from the Incongruity Theory of humor, the Conceptual Metaphor Theory, and the annotation scheme behind the VU Amsterdam Metaphor Corpus, and developed a novel annotation scheme for humorous multimodal metaphor use in image-caption pairs. We create the Hummus Dataset of Humorous Multimodal Metaphor Use, providing expert annotation on 1k image-caption pairs sampled from the New Yorker Caption Contest corpus. Using the dataset, we test state-of-the-art multimodal large language models (MLLMs) on their ability to detect and understand humorous multimodal metaphor use. Our experiments show that current MLLMs still struggle with processing humorous multimodal metaphors, particularly with regard to integrating visual and textual information. We release our dataset and code at github.com/xiaoyuisrain/humorous-multimodal-metaphor-use.

CLAug 12, 2024
Density Matrices for Metaphor Understanding

Jay Owers, Ekaterina Shutova, Martha Lewis

In physics, density matrices are used to represent mixed states, i.e. probabilistic mixtures of pure states. This concept has previously been used to model lexical ambiguity. In this paper, we consider metaphor as a type of lexical ambiguity, and examine whether metaphorical meaning can be effectively modelled using mixtures of word senses. We find that modelling metaphor is significantly more difficult than other kinds of lexical ambiguity, but that our best-performing density matrix method outperforms simple baselines as well as some neural language models.

AIFeb 14, 2024
Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in Large Language Models

Martha Lewis, Melanie Mitchell

Large language models (LLMs) have performed well on several reasoning benchmarks, including ones that test analogical reasoning abilities. However, it has been debated whether they are actually performing humanlike abstract reasoning or instead employing less general processes that rely on similarity to what has been seen in their training data. Here we investigate the generality of analogy-making abilities previously claimed for LLMs (Webb, Holyoak, & Lu, 2023). We take one set of analogy problems used to evaluate LLMs and create a set of "counterfactual" variants-versions that test the same abstract reasoning abilities but that are likely dissimilar from any pre-training data. We test humans and three GPT models on both the original and counterfactual problems, and show that, while the performance of humans remains high for all the problems, the GPT models' performance declines sharply on the counterfactual set. This work provides evidence that, despite previously reported successes of LLMs on analogical reasoning, these models lack the robustness and generality of human analogy-making.

CLNov 21, 2024
Evaluating the Robustness of Analogical Reasoning in Large Language Models

Martha Lewis, Melanie Mitchell

LLMs have performed well on several reasoning benchmarks, including ones that test analogical reasoning abilities. However, there is debate on the extent to which they are performing general abstract reasoning versus employing non-robust processes, e.g., that overly rely on similarity to pre-training data. Here we investigate the robustness of analogy-making abilities previously claimed for LLMs on three of four domains studied by Webb, Holyoak, and Lu (2023): letter-string analogies, digit matrices, and story analogies. For each domain we test humans and GPT models on robustness to variants of the original analogy problems that test the same abstract reasoning abilities but are likely dissimilar from tasks in the pre-training data. The performance of a system that uses robust abstract reasoning should not decline substantially on these variants. On simple letter-string analogies, we find that while the performance of humans remains high for two types of variants we tested, the GPT models' performance declines sharply. This pattern is less pronounced as the complexity of these problems is increased, as both humans and GPT models perform poorly on both the original and variant problems requiring more complex analogies. On digit-matrix problems, we find a similar pattern but only on one out of the two types of variants we tested. On story-based analogy problems, we find that, unlike humans, the performance of GPT models are susceptible to answer-order effects, and that GPT models also may be more sensitive than humans to paraphrasing. This work provides evidence that LLMs often lack the robustness of zero-shot human analogy-making, exhibiting brittleness on most of the variations we tested. More generally, this work points to the importance of carefully evaluating AI systems not only for accuracy but also robustness when testing their cognitive capabilities.

LGJun 4, 2025
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey

Ivan Vegner, Sydelle de Souza, Valentin Forch et al.

A core aspect of compositionality, systematicity is a desirable property in ML models as it enables strong generalization to novel contexts. This has led to numerous studies proposing benchmarks to assess systematic generalization, as well as models and training regimes designed to enhance it. Many of these efforts are framed as addressing the challenge posed by Fodor and Pylyshyn. However, while they argue for systematicity of representations, existing benchmarks and models primarily focus on the systematicity of behaviour. We emphasize the crucial nature of this distinction. Furthermore, building on Hadley's (1994) taxonomy of systematic generalization, we analyze the extent to which behavioural systematicity is tested by key benchmarks in the literature across language and vision. Finally, we highlight ways of assessing systematicity of representations in ML models as practiced in the field of mechanistic interpretability.

CLJan 10, 2024
Grounded learning for compositional vector semantics

Martha Lewis

Categorical compositional distributional semantics is an approach to modelling language that combines the success of vector-based models of meaning with the compositional power of formal semantics. However, this approach was developed without an eye to cognitive plausibility. Vector representations of concepts and concept binding are also of interest in cognitive science, and have been proposed as a way of representing concepts within a biologically plausible spiking neural network. This work proposes a way for compositional distributional semantics to be implemented within a spiking neural network architecture, with the potential to address problems in concept binding, and give a small implementation. We also describe a means of training word representations using labelled images.

AISep 11, 2025
Compositional Concept Generalization with Variational Quantum Circuits

Hala Hawashin, Mina Abbaszadeh, Nicholas Joseph et al.

Compositional generalization is a key facet of human cognition, but lacking in current AI tools such as vision-language models. Previous work examined whether a compositional tensor-based sentence semantics can overcome the challenge, but led to negative results. We conjecture that the increased training efficiency of quantum models will improve performance in these tasks. We interpret the representations of compositional tensor-based models in Hilbert spaces and train Variational Quantum Circuits to learn these representations on an image captioning task requiring compositional generalization. We used two image encoding techniques: a multi-hot encoding (MHE) on binary image vectors and an angle/amplitude encoding on image vectors taken from the vision-language model CLIP. We achieve good proof-of-concept results using noisy MHE encodings. Performance on CLIP image vectors was more mixed, but still outperformed classical compositional models.

CLOct 12, 2020
Modelling Lexical Ambiguity with Density Matrices

Francois Meyer, Martha Lewis

Words can have multiple senses. Compositional distributional models of meaning have been argued to deal well with finer shades of meaning variation known as polysemy, but are not so well equipped to handle word senses that are etymologically unrelated, or homonymy. Moving from vectors to density matrices allows us to encode a probability distribution over different senses of a word, and can also be accommodated within a compositional distributional model of meaning. In this paper we present three new neural models for learning density matrices from a corpus, and test their ability to discriminate between word senses on a range of compositional datasets. When paired with a particular composition method, our best model outperforms existing vector-based compositional models as well as strong sentence encoders.

CLMay 28, 2020
Cats climb entails mammals move: preserving hyponymy in compositional distributional semantics

Gemma De las Cuevas, Andreas Klingler, Martha Lewis et al.

To give vector-based representations of meaning more structure, one approach is to use positive semidefinite (psd) matrices. These allow us to model similarity of words as well as the hyponymy or is-a relationship. Psd matrices can be learnt relatively easily in a given vector space $M\otimes M^*$, but to compose words to form phrases and sentences, we need representations in larger spaces. In this paper, we introduce a generic way of composing the psd matrices corresponding to words. We propose that psd matrices for verbs, adjectives, and other functional words be lifted to completely positive (CP) maps that match their grammatical type. This lifting is carried out by our composition rule called Compression, Compr. In contrast to previous composition rules like Fuzz and Phaser (a.k.a. KMult and BMult), Compr preserves hyponymy. Mathematically, Compr is itself a CP map, and is therefore linear and generally non-commutative. We give a number of proposals for the structure of Compr, based on spiders, cups and caps, and generate a range of composition rules. We test these rules on a small sentence entailment dataset, and see some improvements over the performance of Fuzz and Phaser.

CLMay 11, 2020
Towards logical negation for compositional distributional semantics

Martha Lewis

The categorical compositional distributional model of meaning gives the composition of words into phrases and sentences pride of place. However, it has so far lacked a model of logical negation. This paper gives some steps towards providing this operator, modelling it as a version of projection onto the subspace orthogonal to a word. We give a small demonstration of the operators performance in a sentence entailment task.

CLJan 30, 2019
Compositionality for Recursive Neural Networks

Martha Lewis

Modelling compositionality has been a longstanding area of research in the field of vector space semantics. The categorical approach to compositionality maps grammar onto vector spaces in a principled way, but comes under fire for requiring the formation of very high-dimensional matrices and tensors, and therefore being computationally infeasible. In this paper I show how a linear simplification of recursive neural tensor network models can be mapped directly onto the categorical approach, giving a way of computing the required matrices and tensors. This mapping suggests a number of lines of research for both categorical compositional vector space models of meaning and for recursive neural network models of compositionality.

CLNov 8, 2018
Internal Wiring of Cartesian Verbs and Prepositions

Bob Coecke, Martha Lewis, Dan Marsden

Categorical compositional distributional semantics (CCDS) allows one to compute the meaning of phrases and sentences from the meaning of their constituent words. A type-structure carried over from the traditional categorial model of grammar a la Lambek becomes a 'wire-structure' that mediates the interaction of word meanings. However, CCDS has a much richer logical structure than plain categorical semantics in that certain words can also be given an 'internal wiring' that either provides their entire meaning or reduces the size their meaning space. Previous examples of internal wiring include relative pronouns and intersective adjectives. Here we establish the same for a large class of well-behaved transitive verbs to which we refer as Cartesian verbs, and reduce the meaning space from a ternary tensor to a unary one. Some experimental evidence is also provided.

CLNov 8, 2018
Translating and Evolving: Towards a Model of Language Change in DisCoCat

Tai-Danae Bradley, Martha Lewis, Jade Master et al.

The categorical compositional distributional (DisCoCat) model of meaning developed by Coecke et al. (2010) has been successful in modeling various aspects of meaning. However, it fails to model the fact that language can change. We give an approach to DisCoCat that allows us to represent language models and translations between them, enabling us to describe translations from one language to another, or changes within the same language. We unify the product space representation given in (Coecke et al., 2010) and the functorial description in (Kartsaklis et al., 2013), in a way that allows us to view a language as a catalogue of meanings. We formalize the notion of a lexicon in DisCoCat, and define a dictionary of meanings between two lexicons. All this is done within the framework of monoidal categories. We give examples of how to apply our methods, and give a concrete suggestion for compositional translation in corpora.

CLNov 6, 2018
Proceedings of the 2018 Workshop on Compositional Approaches in Physics, NLP, and Social Sciences

Martha Lewis, Bob Coecke, Jules Hedges et al.

The ability to compose parts to form a more complex whole, and to analyze a whole as a combination of elements, is desirable across disciplines. This workshop bring together researchers applying compositional approaches to physics, NLP, cognitive science, and game theory. Within NLP, a long-standing aim is to represent how words can combine to form phrases and sentences. Within the framework of distributional semantics, words are represented as vectors in vector spaces. The categorical model of Coecke et al. [2010], inspired by quantum protocols, has provided a convincing account of compositionality in vector space models of NLP. There is furthermore a history of vector space models in cognitive science. Theories of categorization such as those developed by Nosofsky [1986] and Smith et al. [1988] utilise notions of distance between feature vectors. More recently Gärdenfors [2004, 2014] has developed a model of concepts in which conceptual spaces provide geometric structures, and information is represented by points, vectors and regions in vector spaces. The same compositional approach has been applied to this formalism, giving conceptual spaces theory a richer model of compositionality than previously [Bolt et al., 2018]. Compositional approaches have also been applied in the study of strategic games and Nash equilibria. In contrast to classical game theory, where games are studied monolithically as one global object, compositional game theory works bottom-up by building large and complex games from smaller components. Such an approach is inherently difficult since the interaction between games has to be considered. Research into categorical compositional methods for this field have recently begun [Ghani et al., 2018]. Moreover, the interaction between the three disciplines of cognitive science, linguistics and game theory is a fertile ground for research. Game theory in cognitive science is a well-established area [Camerer, 2011]. Similarly game theoretic approaches have been applied in linguistics [Jäger, 2008]. Lastly, the study of linguistics and cognitive science is intimately intertwined [Smolensky and Legendre, 2006, Jackendoff, 2007]. Physics supplies compositional approaches via vector spaces and categorical quantum theory, allowing the interplay between the three disciplines to be examined.

LOJul 20, 2018
Towards Functorial Language-Games

Jules Hedges, Martha Lewis

In categorical compositional semantics of natural language one studies functors from a category of grammatical derivations (such as a Lambek pregroup) to a semantic category (such as real vector spaces). We compositionally build game-theoretic semantics of sentences by taking the semantic category to be the category whose morphisms are open games. This requires some modifications to the grammar category to compensate for the failure of open games to form a compact closed category. We illustrate the theory using simple examples of Wittgenstein's language-games.

LOMar 24, 2017
Interacting Conceptual Spaces I : Grammatical Composition of Concepts

Joe Bolt, Bob Coecke, Fabrizio Genovese et al.

The categorical compositional approach to meaning has been successfully applied in natural language processing, outperforming other models in mainstream empirical language processing tasks. We show how this approach can be generalized to conceptual space models of cognition. In order to do this, first we introduce the category of convex relations as a new setting for categorical compositional semantics, emphasizing the convex structure important to conceptual space applications. We then show how to construct conceptual spaces for various types such as nouns, adjectives and verbs. Finally we show by means of examples how concepts can be systematically combined to establish the meanings of composite phrases from the meanings of their constituent parts. This provides the mathematical underpinnings of a new compositional approach to cognition.

AIAug 12, 2016
Compositional Distributional Cognition

Yaared Al-Mehairi, Bob Coecke, Martha Lewis

We accommodate the Integrated Connectionist/Symbolic Architecture (ICS) of [32] within the categorical compositional semantics (CatCo) of [13], forming a model of categorical compositional cognition (CatCog). This resolves intrinsic problems with ICS such as the fact that representations inhabit an unbounded space and that sentences with differing tree structures cannot be directly compared. We do so in a way that makes the most of the grammatical structure available, in contrast to strategies like circular convolution. Using the CatCo model also allows us to make use of tools developed for CatCo such as the representation of ambiguity and logical reasoning via density matrices, structural meanings for words such as relative pronouns, and addressing over- and under-extension, all of which are present in cognitive processes. Moreover the CatCog framework is sufficiently flexible to allow for entirely different representations of meaning, such as conceptual spaces. Interestingly, since the CatCo model was largely inspired by categorical quantum mechanics, so is CatCog.

AIAug 4, 2016
Interacting Conceptual Spaces

Josef Bolt, Bob Coecke, Fabrizio Genovese et al.

We propose applying the categorical compositional scheme of [6] to conceptual space models of cognition. In order to do this we introduce the category of convex relations as a new setting for categorical compositional semantics, emphasizing the convex structure important to conceptual space applications. We show how conceptual spaces for composite types such as adjectives and verbs can be constructed. We illustrate this new model on detailed examples.

CLAug 2, 2016
Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science

Dimitrios Kartsaklis, Martha Lewis, Laura Rimell

This volume contains the Proceedings of the 2016 Workshop on Semantic Spaces at the Intersection of NLP, Physics and Cognitive Science (SLPCS 2016), which was held on the 11th of June at the University of Strathclyde, Glasgow, and was co-located with Quantum Physics and Logic (QPL 2016). Exploiting the common ground provided by the concept of a vector space, the workshop brought together researchers working at the intersection of Natural Language Processing (NLP), cognitive science, and physics, offering them an appropriate forum for presenting their uniquely motivated work and ideas. The interplay between these three disciplines inspired theoretically motivated approaches to the understanding of how word meanings interact with each other in sentences and discourse, how diagrammatic reasoning depicts and simplifies this interaction, how language models are determined by input from the world, and how word and sentence meanings interact logically. This first edition of the workshop consisted of three invited talks from distinguished speakers (Hans Briegel, Peter Gärdenfors, Dominic Widdows) and eight presentations of selected contributed papers. Each submission was refereed by at least three members of the Programme Committee, who delivered detailed and insightful comments and suggestions.

AIFeb 5, 2016
Harmonic Grammar in a DisCo Model of Meaning

Martha Lewis, Bob Coecke

The model of cognition developed in (Smolensky and Legendre, 2006) seeks to unify two levels of description of the cognitive process: the connectionist and the symbolic. The theory developed brings together these two levels into the Integrated Connectionist/Symbolic Cognitive architecture (ICS). Clark and Pulman (2007) draw a parallel with semantics where meaning may be modelled on both distributional and symbolic levels, developed by Coecke et al, 2010 into the Distributional Compositional (DisCo) model of meaning. In the current work, we revisit Smolensky and Legendre (S&L)'s model. We describe the DisCo framework, summarise the key ideas in S&L's architecture, and describe how their description of harmony as a graded measure of grammaticality may be applied in the DisCo model.

AIJan 25, 2016
Emerging Dimension Weights in a Conceptual Spaces Model of Concept Combination

Martha Lewis, Jonathan Lawry

We investigate the generation of new concepts from combinations of properties as an artificial language develops. To do so, we have developed a new framework for conjunctive concept combination. This framework gives a semantic grounding to the weighted sum approach to concept combination seen in the literature. We implement the framework in a multi-agent simulation of language evolution and show that shared combination weights emerge. The expected value and the variance of these weights across agents may be predicted from the distribution of elements in the conceptual space, as determined by the underlying environment, together with the rate at which agents adopt others' concepts. When this rate is smaller, the agents are able to converge to weights with lower variance. However, the time taken to converge to a steady state distribution of weights is longer.

AIJan 25, 2016
The Utility of Hedged Assertions in the Emergence of Shared Categorical Labels

Martha Lewis, Jonathan Lawry

We investigate the emergence of shared concepts in a community of language users using a multi-agent simulation. We extend results showing that negated assertions are of use in developing shared categories, to include assertions modified by linguistic hedges. Results show that using hedged assertions positively affects the emergence of shared categories in two distinct ways. Firstly, using contraction hedges like `very' gives better convergence over time. Secondly, using expansion hedges such as `quite' reduces concept overlap. However, both these improvements come at a cost of slower speed of development.

AIJan 25, 2016
A Label Semantics Approach to Linguistic Hedges

Martha Lewis, Jonathan Lawry

We introduce a model for the linguistic hedges `very' and `quite' within the label semantics framework, and combined with the prototype and conceptual spaces theories of concepts. The proposed model emerges naturally from the representational framework we use and as such, has a clear semantic grounding. We give generalisations of these hedge models and show that they can be composed with themselves and with other functions, going on to examine their behaviour in the limit of composition.

AIJan 25, 2016
Concept Generation in Language Evolution

Martha Lewis, Jonathan Lawry

This thesis investigates the generation of new concepts from combinations of existing concepts as a language evolves. We give a method for combining concepts, and will be investigating the utility of composite concepts in language evolution and thence the utility of concept generation.

CLJan 19, 2016
Graded Entailment for Compositional Distributional Semantics

Desislava Bankova, Bob Coecke, Martha Lewis et al.

The categorical compositional distributional model of natural language provides a conceptually motivated procedure to compute the meaning of sentences, given grammatical structure and the meanings of its words. This approach has outperformed other models in mainstream empirical language processing tasks. However, until recently it has lacked the crucial feature of lexical entailment -- as do other distributional models of meaning. In this paper we solve the problem of entailment for categorical compositional distributional semantics. Taking advantage of the abstract categorical framework allows us to vary our choice of model. This enables the introduction of a notion of entailment, exploiting ideas from the categorical semantics of partial knowledge in quantum computation. The new model of language uses density matrices, on which we introduce a novel robust graded order capturing the entailment strength between concepts. This graded measure emerges from a general framework for approximate entailment, induced by any commutative monoid. Quantum logic embeds in our graded order. Our main theorem shows that entailment strength lifts compositionally to the sentence level, giving a lower bound on sentence entailment. We describe the essential properties of graded entailment such as continuity, and provide a procedure for calculating entailment strength.

AISep 22, 2015
A Compositional Explanation of the Pet Fish Phenomenon

Bob Coecke, Martha Lewis

The `pet fish' phenomenon is often cited as a paradigm example of the `non-compositionality' of human concept use. We show here how this phenomenon is naturally accommodated within a compositional distributional model of meaning. This model describes the meaning of a composite concept by accounting for interaction between its constituents via their grammatical roles. We give two illustrative examples to show how the qualitative phenomena are exhibited. We go on to apply the model to experimental data, and finally discuss extensions of the formalism.