Schrodinger's Memory: Large Language Models
This work addresses the fundamental issue of memory in LLMs for researchers and practitioners, but it is incremental as it builds on existing theories without introducing a new paradigm.
The paper tackles the problem of understanding memory mechanisms in Large Language Models (LLMs) by using the Universal Approximation Theorem to explain them and proposing a new assessment method, with experiments verifying memory capabilities across various LLMs.
Memory is the foundation of all human activities; without memory, it would be nearly impossible for people to perform any task in daily life. With the development of Large Language Models (LLMs), their language capabilities are becoming increasingly comparable to those of humans. But do LLMs have memory? Based on current performance, LLMs do appear to exhibit memory. So, what is the underlying mechanism of this memory? Previous research has lacked a deep exploration of LLMs' memory capabilities and the underlying theory. In this paper, we use Universal Approximation Theorem (UAT) to explain the memory mechanism in LLMs. We also conduct experiments to verify the memory capabilities of various LLMs, proposing a new method to assess their abilities based on these memory ability. We argue that LLM memory operates like Schrödinger's memory, meaning that it only becomes observable when a specific memory is queried. We can only determine if the model retains a memory based on its output in response to the query; otherwise, it remains indeterminate. Finally, we expand on this concept by comparing the memory capabilities of the human brain and LLMs, highlighting the similarities and differences in their operational mechanisms.