Why LLMs Cannot Think and How to Fix It
The paper addresses a foundational problem in AI by critiquing the limitations of LLMs and suggesting fixes, but it is theoretical and offers no experimental validation.
The paper argues that current LLMs are fundamentally incapable of genuine thought due to architectural constraints, and proposes solutions to enable thought processes in the feature space.
This paper argues that current state-of-the-art Large Language Models (LLMs) are fundamentally incapable of making decisions or developing "thoughts" within the feature space because of their architectural constraints. We establish a definition of "thought" that encompasses traditional understandings of the term and adapt it for application to LLMs. We demonstrate that the architectural design and language-modeling training methodology of contemporary LLMs inherently preclude them from engaging in genuine thought processes. Our focus is on this theoretical result rather than on practical insights derived from experimental data. Finally, we propose solutions to enable thought processes within the feature space and discuss the broader implications of these architectural modifications.