Misha Tsodyks

CL
h-index53
6papers
40citations
Novelty62%
AI Score40

6 Papers

CLNov 8, 2023
Large-scale study of human memory for meaningful narratives

Antonios Georgiou, Tankut Can, Mikhail Katkov et al.

The statistical study of human memory requires large-scale experiments, involving many stimuli conditions and test subjects. While this approach has proven to be quite fruitful for meaningless material such as random lists of words, naturalistic stimuli, like narratives, have until now resisted such a large-scale study, due to the quantity of manual labor required to design and analyze such experiments. In this work, we develop a pipeline that uses large language models (LLMs) both to design naturalistic narrative stimuli for large-scale recall and recognition memory experiments, as well as to analyze the results. We performed online memory experiments with a large number of participants and collected recognition and recall data for narratives of different sizes. We found that both recall and recognition performance scale linearly with narrative length; however, for longer narratives people tend to summarize the content rather than recalling precise details. To investigate the role of narrative comprehension in memory, we repeated these experiments using scrambled versions of the narratives. Although recall performance declined significantly, recognition remained largely unaffected. Recalls in this condition seem to follow the original narrative order rather than the actual scrambled presentation, pointing to a contextual reconstruction of the story in memory. Finally, using LLM text embeddings, we construct a simple measure for each clause based on semantic similarity to the whole narrative, that shows a strong correlation with recall probability. Overall, our work demonstrates the power of LLMs in accessing new regimes in the study of human memory, as well as suggesting novel psychologically informed benchmarks for LLM performance.

NCAug 14, 2024
Synaptic Theory of Chunking in Working Memory

Weishun Zhong, Mikhail Katkov, Misha Tsodyks

Working memory often appears to exceed its basic span by organizing items into compact representations called chunks. Chunking can be learned over time for familiar inputs; however, it can also arise spontaneously for novel stimuli. Such on-the-fly structuring is crucial for cognition, yet the underlying neural mechanism remains unclear. Here we introduce a synaptic theory of chunking, in which short-term synaptic plasticity enables the formation of chunk representations in working memory. We show that a specialized population of ``chunking neurons'' selectively controls groups of stimulus-responsive neurons, akin to gating. As a result, the network maintains and retrieves the stimuli in chunks, thereby exceeding the basic capacity. Moreover, we show that our model can dynamically construct hierarchical representations within working memory through hierarchical chunking. A consequence of this proposed mechanism is a new limit on the number of items that can be stored and subsequently retrieved from working memory, depending only on the basic working memory capacity when chunking is not invoked. Predictions from our model were confirmed by analyzing single-unit responses in epileptic patients and memory experiments with verbal material. Our work provides a novel conceptual and analytical framework for understanding how the brain organizes information in real time.

CLFeb 13
Semantic Chunking and the Entropy of Natural Language

Weishun Zhong, Doron Sivan, Tankut Can et al.

The entropy rate of printed English is famously estimated to be about one bit per character, a benchmark that modern large language models (LLMs) have only recently approached. This entropy rate implies that English contains nearly 80 percent redundancy relative to the five bits per character expected for random text. We introduce a statistical model that attempts to capture the intricate multi-scale structure of natural language, providing a first-principles account of this redundancy level. Our model describes a procedure of self-similarly segmenting text into semantically coherent chunks down to the single-word level. The semantic structure of the text can then be hierarchically decomposed, allowing for analytical treatment. Numerical experiments with modern LLMs and open datasets suggest that our model quantitatively captures the structure of real texts at different levels of the semantic hierarchy. The entropy rate predicted by our model agrees with the estimated entropy rate of printed English. Moreover, our theory further reveals that the entropy rate of natural language is not fixed but should increase systematically with the semantic complexity of corpora, which are captured by the only free parameter in our model.

STAT-MECHDec 2, 2024
Random Tree Model of Meaningful Memory

Weishun Zhong, Tankut Can, Antonis Georgiou et al.

Traditional studies of memory for meaningful narratives focus on specific stories and their semantic structures but do not address common quantitative features of recall across different narratives. We introduce a statistical ensemble of random trees to represent narratives as hierarchies of key points, where each node is a compressed representation of its descendant leaves, which are the original narrative segments. Recall is modeled as constrained by working memory capacity from this hierarchical structure. Our analytical solution aligns with observations from large-scale narrative recall experiments. Specifically, our model explains that (1) average recall length increases sublinearly with narrative length, and (2) individuals summarize increasingly longer narrative segments in each recall sentence. Additionally, the theory predicts that for sufficiently long narratives, a universal, scale-invariant limit emerges, where the fraction of a narrative summarized by a single recall sentence follows a distribution independent of narrative length.

CLNov 19, 2024
Information Theory of Meaningful Communication

Doron Sivan, Misha Tsodyks

In Shannon's seminal paper, entropy of printed English, treated as a stationary stochastic process, was estimated to be roughly 1 bit per character. However, considered as a means of communication, language differs considerably from its printed form: (i) the units of information are not characters or even words but clauses, i.e. shortest meaningful parts of speech; and (ii) what is transmitted is principally the meaning of what is being said or written, while the precise phrasing that was used to communicate the meaning is typically ignored. In this study, we show that one can leverage recently developed large language models to quantify information communicated in meaningful narratives in terms of bits of meaning per clause.

NCNov 1, 2021
Recurrent neural network models for working memory of continuous variables: activity manifolds, connectivity patterns, and dynamic codes

Christopher J. Cueva, Adel Ardalan, Misha Tsodyks et al.

Many daily activities and psychophysical experiments involve keeping multiple items in working memory. When items take continuous values (e.g., orientation, contrast, length, loudness) they must be stored in a continuous structure of appropriate dimensions. We investigate how this structure is represented in neural circuits by training recurrent networks to report two previously shown stimulus orientations. We find the activity manifold for the two orientations resembles a Clifford torus. Although a Clifford and standard torus (the surface of a donut) are topologically equivalent, they have important functional differences. A Clifford torus treats the two orientations equally and keeps them in orthogonal subspaces, as demanded by the task, whereas a standard torus does not. We find and characterize the connectivity patterns that support the Clifford torus. Moreover, in addition to attractors that store information via persistent activity, our networks also use a dynamic code where units change their tuning to prevent new sensory input from overwriting the previously stored one. We argue that such dynamic codes are generally required whenever multiple inputs enter a memory system via shared connections. Finally, we apply our framework to a human psychophysics experiment in which subjects reported two remembered orientations. By varying the training conditions of the RNNs, we test and support the hypothesis that human behavior is a product of both neural noise and reliance on the more stable and behaviorally relevant memory of the ordinal relationship between the two orientations. This suggests that suitable inductive biases in RNNs are important for uncovering how the human brain implements working memory. Together, these results offer an understanding of the neural computations underlying a class of visual decoding tasks, bridging the scales from human behavior to synaptic connectivity.