LG · Feb 6
Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering
Craig Atkinson
Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output entropy from hallucination by tethering hidden-state trajectories to a pre-computed truthfulness manifold. HELIX computes a Unified Truth Score (UTS) combining token-level semantic entropy with Mahalanobis distance from the manifold. When UTS indicates trajectory divergence, graduated steering vectors redirect activations toward structurally coherent regions while affecting only 0.2-2.5% of tokens. On 4-bit quantized Granite 4.0 H Small (32B/9B active, hybrid Mamba-Transformer): GSM8K maintains 88.84% accuracy at T = 3.0 (2.81pp degradation from T = 0.5); MMLU maintains 72.49% across 14,042 questions (1.24pp degradation). This demonstrates that high-temperature hallucination is primarily trajectory divergence rather than semantic collapse. Notably, steering the sparse Transformer attention layers (~10% of layers) is sufficient to correct drift in the Mamba-2 state-space formulation. Geometric tethering reveals a previously masked High-Entropy Creative Reservoir. At T > 2.0, steered outputs exhibit 5-20% idea duplication versus 70-80% at conservative settings. Cross-architecture validation (Qwen3-30B-A3B MoE) confirms this phenomenon is architecture-independent, with 46.7% higher unique concept generation. HELIX acts as a syntax tether, enabling exploration of semantic diversity without violating the logical backbone required for valid output. This enables Multi-Temperature Synthesis, generating 200% more unique concepts than single-temperature inference.
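The abstract does not say how the two UTS components are combined or how steering is graduated, so the following is only a minimal sketch under stated assumptions: plain Shannon entropy of the temperature-scaled token distribution stands in for semantic entropy, the score is a weighted sum, and steering is a linear pull toward the manifold mean. All function names, `alpha`, `threshold`, and `max_strength` are hypothetical and are not taken from HELIX.

```python
import numpy as np

def token_entropy(logits, temperature):
    """Shannon entropy (nats) of the temperature-scaled softmax.
    ASSUMPTION: a stand-in for the paper's token-level semantic entropy."""
    z = logits / temperature
    z = z - z.max()                      # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def mahalanobis(h, mean, cov_inv):
    """Mahalanobis distance of hidden state h from a Gaussian summary
    (mean, inverse covariance) of the reference manifold."""
    d = h - mean
    return float(np.sqrt(d @ cov_inv @ d))

def unified_score(logits, h, mean, cov_inv, temperature, alpha=0.5):
    """ASSUMPTION: a weighted sum; the abstract only says the two are combined."""
    return (alpha * token_entropy(logits, temperature)
            + (1.0 - alpha) * mahalanobis(h, mean, cov_inv))

def steer(h, mean, score, threshold, max_strength=0.5):
    """Graduated pull toward the manifold mean, applied only when the score
    exceeds a positive threshold (hypothetical rule)."""
    if score <= threshold:
        return h                         # most tokens are left untouched
    strength = min(max_strength, (score - threshold) / threshold * max_strength)
    return h + strength * (mean - h)
```

In this sketch a token whose score stays below `threshold` is never modified, which mirrors the abstract's point that only a small fraction of tokens are steered.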
LG · Mar 13, 2021
Conceptual capacity and effective complexity of neural networks
Lech Szymanski, Brendan McCane, Craig Atkinson
We propose a complexity measure of a neural network mapping function based on the diversity of the set of tangent spaces from different inputs. Treating each tangent space as a linear PAC concept, we use an entropy-based measure of the bundle of concepts in order to estimate the conceptual capacity of the network. The theoretical maximal capacity of a ReLU network is equivalent to the number of its neurons. In practice, however, due to correlations between neuron activities within the network, the actual capacity can be remarkably small, even for very big networks. Empirical evaluations show that this new measure is correlated with the complexity of the mapping function and thus the generalisation capabilities of the corresponding network. It captures the effective, as opposed to the theoretical, complexity of the network function. We also showcase some uses of the proposed measure for analysis and comparison of trained neural network models.
LG · Jan 16, 2020
MIME: Mutual Information Minimisation Exploration
Haitao Xu, Brendan McCane, Lech Szymanski et al.
We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn. We propose a counter-intuitive solution that we call Mutual Information Minimising Exploration (MIME), where an agent learns a latent representation of the environment without trying to predict the future states. We show that our agent performs significantly better over sharp transition boundaries while matching the performance of surprisal-driven agents elsewhere. In particular, we show state-of-the-art performance on difficult learning games such as Gravitar, Montezuma's Revenge and Doom.
LG · Nov 27, 2019
GRIm-RePR: Prioritising Generating Important Features for Pseudo-Rehearsal
Craig Atkinson, Brendan McCane, Lech Szymanski et al.
Pseudo-rehearsal allows neural networks to learn a sequence of tasks without forgetting how to perform earlier tasks. Forgetting is prevented by introducing a generative network which can produce data from previously seen tasks so that it can be rehearsed alongside learning the new task. This has been found to be effective in both supervised and reinforcement learning. Our current work aims to further prevent forgetting by encouraging the generator to accurately generate features important for task retention. More specifically, the generator is improved by introducing a second discriminator into the Generative Adversarial Network which learns to classify between real and fake items from the intermediate activation patterns that they produce when fed through a continual learning agent. Using Atari 2600 games, we experimentally find that improving the generator can considerably reduce catastrophic forgetting compared to the standard pseudo-rehearsal methods used in deep reinforcement learning. Furthermore, we propose normalising the Q-values taught to the long-term system as we observe this substantially reduces catastrophic forgetting by minimising the interference between tasks' reward functions.
LG · Sep 25, 2019
Switched linear projections for neural network interpretability
Lech Szymanski, Brendan McCane, Craig Atkinson
We introduce switched linear projections for expressing the activity of a neuron in a deep neural network in terms of a single linear projection in the input space. The method works by isolating the active subnetwork, a series of linear transformations, that determine the entire computation of the network for a given input instance. With these projections we can decompose activity in any hidden layer into patterns detected in a given input instance. We also propose that in ReLU networks it is instructive and meaningful to examine patterns that deactivate the neurons in a hidden layer, something that is implicitly ignored by the existing interpretability methods tracking solely the active aspect of the network's computation.
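The idea that a ReLU network acts as a single affine map for any one input can be illustrated with a short sketch. This is not the authors' implementation; it assumes a plain fully connected ReLU network given as lists of weight matrices and bias vectors, and the function name is hypothetical. For a given input `x`, the ReLU on/off pattern is recorded and the layers of the active subnetwork are composed into one matrix and one offset in input space.

```python
import numpy as np

def switched_linear_projection(weights, biases, x):
    """Return (W_eff, b_eff, h) such that the last layer's post-ReLU activity h
    equals W_eff @ x + b_eff for this particular input x.
    ASSUMPTION: fully connected layers, each followed by ReLU."""
    W_eff = np.eye(x.shape[0])       # identity map on the input space
    b_eff = np.zeros(x.shape[0])
    h = x
    for W, b in zip(weights, biases):
        pre = W @ h + b
        mask = (pre > 0).astype(pre.dtype)       # units switched on by x
        # Compose this layer with the map so far, zeroing inactive units.
        W_eff = mask[:, None] * (W @ W_eff)
        b_eff = mask * (W @ b_eff + b)
        h = np.maximum(pre, 0.0)
    return W_eff, b_eff, h
```

Each row of `W_eff` is the input-space pattern that the corresponding hidden unit responds to for this input. Rows of deactivated units are zero in this sketch; examining what deactivates them, as the abstract suggests, would mean dropping the final layer's mask and inspecting the pre-activation map instead.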
LG · Dec 6, 2018
Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting
Craig Atkinson, Brendan McCane, Lech Szymanski et al.
Neural networks can achieve excellent results in a wide variety of applications. However, when they attempt to learn tasks sequentially, they tend to learn the new task while catastrophically forgetting previous ones. We propose a model that overcomes catastrophic forgetting in sequential reinforcement learning by combining ideas from continual learning in both the image classification domain and the reinforcement learning domain. This model features a dual memory system which separates continual learning from reinforcement learning and a pseudo-rehearsal system that "recalls" items representative of previous tasks via a deep generative network. Our model sequentially learns Atari 2600 games without demonstrating catastrophic forgetting and continues to perform above human level on all three games. This result is achieved without demanding additional storage as the number of tasks increases, storing raw data, or revisiting past tasks. In comparison, previous state-of-the-art solutions are substantially more vulnerable to forgetting on these complex deep reinforcement learning tasks.
LG · Feb 12, 2018
Pseudo-Recursal: Solving the Catastrophic Forgetting Problem in Deep Neural Networks
Craig Atkinson, Brendan McCane, Lech Szymanski et al.
In general, neural networks are not currently capable of learning tasks in a sequential fashion. When a novel, unrelated task is learnt by a neural network, it substantially forgets how to solve previously learnt tasks. One of the original solutions to this problem is pseudo-rehearsal, which involves learning the new task while rehearsing generated items representative of the previous task(s). This is very effective for simple tasks. However, pseudo-rehearsal has not yet been successfully applied to very complex tasks because in these tasks it is difficult to generate representative items. We accomplish pseudo-rehearsal by using a Generative Adversarial Network to generate items so that our deep network can learn to sequentially classify the CIFAR-10, SVHN and MNIST datasets. After training on all tasks, our network loses only 1.67% absolute accuracy on CIFAR-10 and gains 0.24% absolute accuracy on SVHN. Our model's performance is a substantial improvement compared to the current state-of-the-art solution.
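The core pseudo-rehearsal step described above, mixing new-task data with generated items labelled by the previous network, could be sketched roughly as follows. This is an illustrative assumption rather than the paper's training code: `generate` stands for any generator (a GAN in the paper) that returns `n` items, `old_predict` is a frozen copy of the network before the new task, and both tasks are assumed to share one output layer so the targets can be stacked.

```python
import numpy as np

def build_rehearsal_batch(new_x, new_y_onehot, generate, old_predict, n_pseudo):
    """Combine real new-task examples with pseudo-items for one training batch.
    All argument names are hypothetical."""
    pseudo_x = generate(n_pseudo)        # items representative of earlier tasks
    pseudo_y = old_predict(pseudo_x)     # old network's outputs become the targets
    x = np.concatenate([new_x, pseudo_x], axis=0)
    y = np.concatenate([new_y_onehot, pseudo_y], axis=0)
    return x, y
```

Training on the combined batch asks the network to fit the new labels while reproducing its earlier outputs on the generated items, so no raw data from earlier tasks needs to be stored.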