AI · CL · CT · Aug 28, 2025

Transparent Semantic Spaces: A Categorical Approach to Explainable Word Embeddings

arXiv:2508.20701v1 · h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the problem of black-box AI for researchers and practitioners by providing a mathematically precise and explainable approach to word embeddings, though it is incremental in applying category theory to this domain.

The paper tackles the lack of explainability in AI systems by developing a category-theoretic framework for word embeddings. The result is a transparent method that proves the GloVe and Word2Vec algorithms equivalent to the metric MDS algorithm.

The paper introduces a novel framework based on category theory to enhance the explainability of artificial intelligence systems, focusing on word embeddings. It constructs the categories $\mathcal{L}_T$ and $\mathcal{P}_T$, which give schematic representations of the semantics of a text $T$, and reframes the selection of the element with maximum probability as a categorical notion. The monoidal category $\mathcal{P}_T$ is used to visualize various methods of extracting semantic information from $T$, yielding a dimension-agnostic definition of semantic spaces that relies solely on information within the text. The paper also defines the category of configurations Conf and the category of word embeddings $\mathcal{Emb}$, together with the concept of divergence as a decoration on $\mathcal{Emb}$. It establishes a mathematically precise method for comparing word embeddings and demonstrates that the GloVe and Word2Vec algorithms are equivalent to the metric MDS algorithm, moving from black-box neural network algorithms to a transparent framework. Finally, the paper presents a mathematical approach to computing biases before embedding and offers insights on mitigating biases at the semantic space level, advancing the field of explainable artificial intelligence.
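To make the reference to metric MDS concrete, the following is a minimal sketch of classical (Torgerson) MDS, the simplest closed-form variant of metric MDS. It recovers point coordinates from a matrix of pairwise distances alone. It is not the paper's categorical construction: the function name `classical_mds`, the word list, and the toy distances are all assumptions made for illustration.

```python
import numpy as np

def classical_mds(dist, dim):
    """Embed n points in `dim` dimensions from an n x n matrix of pairwise distances."""
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the `dim` largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by rounding before taking the square root.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical words with made-up dissimilarities (corners of a unit square).
words = ["king", "queen", "man", "woman"]
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(dist, dim=2)
# Pairwise distances of the recovered embedding, for comparison with `dist`.
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
```

The recovered coordinates are determined only up to rotation and reflection, so it is the pairwise distances in `recovered`, not the raw coordinates, that should be compared with the input `dist`.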

