CL Feb 28, 2025
Beyond Words: A Latent Memory Approach to Internal Reasoning in LLMs
José I. Orlicki
Recent advances in large language models (LLMs) have popularized the chain-of-thought (CoT) paradigm, in which models produce explicit reasoning steps in natural language. Although this approach improves interpretability and facilitates external auditing, it may not be the most computationally efficient method for internal reasoning. In contrast, human cognition relies on implicit mental representations that recall past sensory and episodic information without requiring complete verbalization. In this paper, we propose a framework that integrates implicit mental representations into the internal reasoning processes of LLMs. Preliminary experiments indicate that incorporating an Implicit Memory Module (IMM) into a simple GPT model reduces final training loss by 35% to 57% compared to a regular GPT baseline. An explicit interpretability channel (e.g., a chain-of-thought decoder) is straightforward to add within this approach. We outline theoretical foundations, propose technical mechanisms to scale the memory module, and discuss how these ideas may lead to more efficient and robust reasoning, with optional future extensions for explicit auditability.
LG Apr 10, 2025
PoGO: A Scalable Proof of Useful Work via Quantized Gradient Descent and Merkle Proofs
José I. Orlicki
We present a design called Proof of Gradient Optimization (PoGO) for blockchain consensus, where miners produce verifiable evidence of training large-scale machine-learning models. Building on previous work, we incorporate quantized gradients (4-bit precision) to reduce storage and computation requirements, while still preserving the ability of verifiers to check that real progress has been made on lowering the model's loss. Additionally, we employ Merkle proofs over the full 32-bit model to handle large parameter sets and to enable random leaf checks with minimal on-chain data. We illustrate these ideas using GPT-3 (175B parameters) as a reference example and also refer to smaller but high-performance models (e.g., Gemma 3 with 27B parameters). We provide an empirical cost analysis showing that verification is significantly cheaper than training, thanks in part to quantization and sampling. We also discuss the necessity of longer block times (potentially hours) when incorporating meaningful training steps, the trade-offs when using specialized GPU hardware, and how binary diffs may incrementally optimize updates. Finally, we note that fine-tuning can be handled in a similar manner, merely changing the dataset and the manner of sampling but preserving the overall verification flow. Our protocol allows verifiers to issue either positive or negative attestations; these are aggregated at finalization to either confirm the update or slash the miner.
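As a rough illustration of the random leaf checks mentioned in the PoGO abstract, the following Python sketch builds a Merkle tree over fixed-size chunks of 32-bit parameters and verifies a single chunk against the root. It is not the paper's implementation: the chunk size, the use of SHA-256, the prefix bytes, the duplicate-last-node padding rule, and all function names (`leaf_hash`, `build_levels`, `prove`, `verify`) are assumptions made for this example.

```python
import hashlib
import struct

def leaf_hash(chunk):
    # Hash one chunk of parameters, serialized as little-endian 32-bit floats.
    # The 0x00 prefix separates leaf hashes from inner-node hashes (assumed convention).
    data = struct.pack(f"<{len(chunk)}f", *chunk)
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left, right):
    # Hash two child digests into a parent digest.
    return hashlib.sha256(b"\x01" + left + right).digest()

def padded(level):
    # Duplicate the last digest when a level has an odd number of nodes.
    return level + [level[-1]] if len(level) % 2 == 1 else level

def build_levels(leaves):
    # Return every level of the tree, from the leaf digests up to the root.
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = padded(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    # Collect the sibling digest at each level, with a flag telling the
    # verifier whether that sibling sits to the left of the running hash.
    path = []
    for level in levels[:-1]:
        level = padded(level)
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify(root, chunk, path):
    # Recompute the root from one chunk and its sibling path.
    h = leaf_hash(chunk)
    for sibling, sibling_is_left in path:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root

# Toy "model": 20 parameters split into 5 chunks of 4 (hypothetical sizes).
params = [i * 0.5 for i in range(20)]
chunks = [params[i:i + 4] for i in range(0, len(params), 4)]
levels = build_levels([leaf_hash(c) for c in chunks])
root = levels[-1][0]

# A verifier samples chunk 3 and checks it against the published root.
proof = prove(levels, 3)
ok = verify(root, chunks[3], proof)
```

In this sketch only the root would need to be published; a verifier who samples a chunk index receives that chunk plus a path whose length grows logarithmically with the number of chunks, which is the property that keeps random checks cheap for large parameter sets.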
CR Aug 24, 2020
Fair Proof-of-Stake using VDF+VRF Consensus
José I. Orlicki
We propose a new Proof-of-Stake consensus protocol constructed with a verifiable random function (VRF) and a verifiable delay function (VDF) that has the following properties: a) all addresses with positive stake can participate; b) it is fair, because the distribution of rewards is proportional to the coin stake; c) it is resistant to several classic blockchain attacks, such as Sybil attacks, "Nothing-at-stake" attacks and "Winner-takes-all" attacks. We call it Vixify Consensus.