AISep 21, 2024
Loop Neural Networks for Parameter SharingKei-Sing Ng, Qingchen Wang
The success of large-scale language models like GPT can be attributed to their ability to efficiently predict the next token in a sequence. However, these models rely on constant computational effort regardless of the complexity of the token they are predicting, lacking the capacity for iterative refinement. In this paper, we introduce a novel Loop Neural Network, which achieves better performance by utilizing longer computational time without increasing the model size. Our approach revisits the input multiple times, refining the prediction by iteratively looping over a subset of the model with residual connections. We demonstrate the effectiveness of this method through experiments comparing versions of GPT-2 with our loop models, showing improved performance in language modeling tasks while maintaining similar parameter counts. Importantly, these improvements are achieved without the need for extra training data.
LGDec 27, 2022
Self Meta Pseudo Labels: Meta Pseudo Labels Without The TeacherKei-Sing Ng, Qingchen Wang
We present Self Meta Pseudo Labels, a novel semi-supervised learning method similar to Meta Pseudo Labels but without the teacher model. We introduce a novel way to use a single model for both generating pseudo labels and classification, allowing us to store only one model in memory instead of two. Our method attains similar performance to the Meta Pseudo Labels method while drastically reducing memory usage.
AISep 21, 2025
Similarity Field Theory: A Mathematical Framework for IntelligenceKei-Sing Ng
We posit that persisting and transforming similarity relations form the structural basis of any comprehensible dynamic system. This paper introduces Similarity Field Theory, a mathematical framework that formalizes the principles governing similarity values among entities and their evolution. We define: (1) a similarity field $S: U \times U \to [0,1]$ over a universe of entities $U$, satisfying reflexivity $S(E,E)=1$ and treated as a directed relational field (asymmetry and non-transitivity are allowed); (2) the evolution of a system through a sequence $Z_p=(X_p,S^{(p)})$ indexed by $p=0,1,2,\ldots$; (3) concepts $K$ as entities that induce fibers $F_α(K)={E\in U \mid S(E,K)\ge α}$, i.e., superlevel sets of the unary map $S_K(E):=S(E,K)$; and (4) a generative operator $G$ that produces new entities. Within this framework, we formalize a generative definition of intelligence: an operator $G$ is intelligent with respect to a concept $K$ if, given a system containing entities belonging to the fiber of $K$, it generates new entities that also belong to that fiber. Similarity Field Theory thus offers a foundational language for characterizing, comparing, and constructing intelligent systems. At a high level, this framework reframes intelligence and interpretability as geometric problems on similarity fields -- preserving and composing level-set fibers -- rather than purely statistical ones. We prove two theorems: (i) asymmetry blocks mutual inclusion; and (ii) stability requires either an anchor coordinate or eventual confinement within a level set. These results ensure that the evolution of similarity fields is both constrained and interpretable, culminating in a framework that not only interprets large language models but also introduces a novel way of using them as experimental probes of societal cognition, supported by preliminary evidence across diverse consumer categories.
AIJul 30, 2025
On the Definition of IntelligenceKei-Sing Ng
To engineer AGI, we should first capture the essence of intelligence in a species-agnostic form that can be evaluated, while being sufficiently general to encompass diverse paradigms of intelligent behavior, including reinforcement learning, generative models, classification, analogical reasoning, and goal-directed decision-making. We propose a general criterion based on \textit{entity fidelity}: Intelligence is the ability, given entities exemplifying a concept, to generate entities exemplifying the same concept. We formalise this intuition as \(\varepsilon\)-concept intelligence: it is \(\varepsilon\)-intelligent with respect to a concept if no chosen admissible distinguisher can separate generated entities from original entities beyond tolerance \(\varepsilon\). We present the formal framework, outline empirical protocols, and discuss implications for evaluation, safety, and generalization.