IT AI NE

Nov 3, 2025

Efficient Vector Symbolic Architectures from Histogram Recovery

arXiv:2511.01838v1 1.2 h-index: 15

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in neurosymbolic AI systems, namely reliable information retrieval from noisy compositional representations. The advance is incremental, since it builds on prior coding methods.

The paper tackles the problem of noise resilience in vector symbolic architectures (VSAs) by proposing a coding-theoretic approach using concatenated Reed-Solomon and Hadamard codes, achieving efficient recovery with formal guarantees and improved parameters over existing methods like the Hadamard code.

Vector symbolic architectures (VSAs) are a family of information representation techniques which enable composition, i.e., creating complex information structures from atomic vectors via binding and superposition, and have recently found wide-ranging applications in various neurosymbolic artificial intelligence (AI) systems. Raviv proposed the use of random linear codes in VSAs, suggesting that their subcode structure enables efficient binding while preserving the quasi-orthogonality that is necessary for neural processing. Yet random linear codes are difficult to decode under noise, which severely limits the resulting VSA's ability to support recovery, i.e., the retrieval of information objects and their attributes from a noisy compositional representation. In this work we bridge this gap by utilizing coding-theoretic tools. First, we argue that the concatenation of Reed-Solomon and Hadamard codes is suitable for VSAs, due to the mutual quasi-orthogonality of the resulting codewords (a folklore result). Second, we show that recovery of the resulting compositional representations can be done by solving a problem we call histogram recovery. In histogram recovery, a collection of $N$ histograms over a finite field is given as input, and one must find a collection of Reed-Solomon codewords of length $N$ whose entry-wise symbol frequencies obey those histograms. We present an optimal solution to the histogram recovery problem by using algorithms related to list-decoding, and analyze the resulting noise resilience. Our results give rise to a noise-resilient VSA with formal guarantees regarding efficient encoding, quasi-orthogonality, and recovery, without relying on any heuristics or training, and while operating at improved parameters relative to similar solutions such as the Hadamard code.
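To make the histogram recovery problem statement concrete, here is a minimal Python sketch on a toy instance. It is not the paper's algorithm: the abstract describes a solution based on list-decoding-related techniques, whereas this sketch uses exhaustive search purely to illustrate the input and the acceptance condition. The field size `P`, length `N`, dimension `K`, the choice of evaluation points `0..N-1`, and all function names are assumptions made for illustration, as is the restriction to collections of distinct codewords of a known size.

```python
from collections import Counter
from itertools import combinations, product

# Hypothetical toy parameters (not taken from the paper).
P = 5  # prime field GF(5); a prime keeps the arithmetic to plain "mod P"
N = 5  # codeword length; evaluation points are 0..N-1, so N <= P is required
K = 2  # Reed-Solomon dimension: messages are polynomials of degree < K


def rs_encode(msg):
    """Evaluate the polynomial with coefficients `msg` (lowest degree first)
    at the points 0..N-1 over GF(P)."""
    return tuple(
        sum(c * pow(x, j, P) for j, c in enumerate(msg)) % P for x in range(N)
    )


def histograms(codewords):
    """Return N Counters; the i-th one counts how often each field symbol
    appears in position i across the given codewords."""
    return [Counter(cw[i] for cw in codewords) for i in range(N)]


def brute_force_recover(hists, count):
    """Return every set of `count` distinct codewords whose entry-wise
    histograms equal `hists`. Exponential time; toy sizes only."""
    all_codewords = [rs_encode(m) for m in product(range(P), repeat=K)]
    return [
        set(combo)
        for combo in combinations(all_codewords, count)
        if histograms(combo) == hists
    ]


# Three stored codewords; only their entry-wise histograms are kept.
stored = [rs_encode((1, 2)), rs_encode((3, 0)), rs_encode((0, 4))]
hists = histograms(stored)

# Search for collections of codewords consistent with the histograms.
solutions = brute_force_recover(hists, count=len(stored))
print(set(stored) in solutions)
```

The histograms discard which symbol belongs to which codeword, so the task is to regroup the per-position symbols into valid Reed-Solomon codewords. The sketch makes no claim that the solution is unique; it only checks that the stored collection is among the consistent ones.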
