Jul 19, 2025

Discrete approach to machine learning

arXiv:2508.00869v1
Originality: Synthesis-oriented
AI Analysis

This work proposes a discrete approach to structural information processing that could affect domains such as natural language processing and bioinformatics. Its contribution appears incremental, however, because it builds on existing encoding and geometric methods.

The paper tackles the problem of encoding and processing structural information in machine learning by introducing a discrete method for dimensionality reduction and geometric embeddings of code spaces, applied to language morphology and immunohistochemical markers, and draws parallels to neocortex organization.

The article explores an approach to encoding and processing structural information using sparse bit vectors and fixed-length linear vectors. Two methods are presented: a discrete method of speculative stochastic dimensionality reduction of multidimensional code and linear spaces with linear asymptotic complexity, and a geometric method for obtaining discrete embeddings of an organised code space that reflect the internal structure of a given modality. The structure and properties of a code space are investigated using three modalities as examples: the morphology of Russian, the morphology of English, and immunohistochemical markers. Parallels are drawn between the resulting map of the code space layout and the so-called pinwheels observed in the mammalian neocortex. A cautious assumption is made about similarities between neocortex organisation and the processes taking place in the models.
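
To make the idea of reducing sparse bit vectors in linear time more concrete, here is a minimal Python sketch. It is not the paper's algorithm, which the abstract does not specify. It swaps in a simple, well-known stand-in: random bit folding, where each source bit position is assigned a random target position and colliding bits are OR-ed together. All names (`encode_sparse`, `make_index_map`, `reduce_code`, `overlap`) and the dimensions 1024 and 64 are hypothetical choices for illustration.

```python
import random


def encode_sparse(active_bits, dim):
    """Store a sparse bit vector of length `dim` as the set of its set-bit indices."""
    bits = frozenset(active_bits)
    if any(not 0 <= i < dim for i in bits):
        raise ValueError("bit index out of range")
    return bits


def make_index_map(src_dim, dst_dim, seed=0):
    """Assign every source bit position a random target position (assumed scheme)."""
    rng = random.Random(seed)
    return [rng.randrange(dst_dim) for _ in range(src_dim)]


def reduce_code(bits, index_map):
    """Fold a sparse code into the smaller space; colliding bits merge (logical OR)."""
    return frozenset(index_map[i] for i in bits)


def overlap(a, b):
    """Count shared set bits, a simple similarity measure between two sparse codes."""
    return len(a & b)


# Hypothetical example: fold 1024-bit codes down to 64 bits.
index_map = make_index_map(1024, 64, seed=42)
a = encode_sparse([3, 17, 500, 901], 1024)
b = encode_sparse([3, 17, 640], 1024)
reduced_a = reduce_code(a, index_map)
reduced_b = reduce_code(b, index_map)
print(overlap(a, b), overlap(reduced_a, reduced_b))
```

In this sketch, `reduce_code` does work proportional to the number of set bits in the input code. Bits shared by two codes always land on the same target positions, so shared structure survives the reduction, although collisions can add spurious overlap.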

