Quan Cheng

h-index 11

6 papers

5 citations

Novelty 57%

AI Score 49

Ranked #47,825 of 201,326 authors (top 24%); #2,612 in AI (top 18%)

6 Papers

42.6 AR Mar 27

VeRA+: Vector-Based Lightweight Digital Compensation for Drift-Resilient RRAM In-Memory Computing

Weirong Dong, Kai Zhou, Zhen Kong et al.

RRAM-based in-memory computing (IMC) offers high energy efficiency but suffers from conductance drift that severely degrades long-term accuracy. Existing approaches including retraining, noise-aware training, and Batch Normalization (BN)-based calibration either require RRAM rewriting, demand large storage overhead, or rely on online correction. We propose VeRA+, a lightweight drift compensation framework that reuses shared projection matrices and introduces only two compact drift-specific vectors per drift level. A drift-aware scheduling algorithm offline-trains a small set of VeRA+ parameters and selects the appropriate set over time without any on-chip retraining or data replay. VeRA+ preserves up to 99.77% of the drift-free accuracy after ten years of simulated drift and reduces storage overhead by more than three orders of magnitude compared with BN-based calibration. To validate VeRA+ under realistic device behavior, we extract one-week drift statistics from measurements on our fabricated 1T1R RRAM devices and use them to simulate realistic drifted weights. Under these measured drift conditions, VeRA+ achieves accuracy close to the drift-free baseline, providing an efficient and practical solution for long-term drift resilience in RRAM-IMC.
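
The mechanism described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard VeRA form h = W x + diag(b) B diag(d) A x, where A and B are shared frozen matrices and only the two vectors d and b differ per drift level. The class name DriftCompensator, its methods, and the rule "use the most recent level whose timestamp is not later than the elapsed time" are all assumptions made for the example.

```python
import bisect


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class DriftCompensator:
    """Hypothetical VeRA-style compensator: shared A and B, two vectors per drift level."""

    def __init__(self, A, B):
        self.A = A            # shared, frozen down-projection (r x in)
        self.B = B            # shared, frozen up-projection (out x r)
        self.times = []       # sorted drift-level timestamps
        self.vectors = []     # (d, b) pairs aligned with self.times

    def add_level(self, t, d, b):
        """Register the offline-trained vectors d (length r) and b (length out) for time t."""
        i = bisect.bisect(self.times, t)
        self.times.insert(i, t)
        self.vectors.insert(i, (d, b))

    def select(self, elapsed):
        """Assumed schedule: latest level with timestamp <= elapsed, else the earliest one."""
        if not self.vectors:
            raise ValueError("no drift levels registered")
        i = bisect.bisect_right(self.times, elapsed) - 1
        return self.vectors[max(i, 0)]

    def forward(self, W_drifted, x, elapsed):
        """Drifted matrix-vector product plus the digital correction term."""
        d, b = self.select(elapsed)
        base = matvec(W_drifted, x)                        # computed by the drifted array
        low = [di * li for di, li in zip(d, matvec(self.A, x))]
        up = matvec(self.B, low)
        return [y + bi * ui for y, bi, ui in zip(base, b, up)]
```

Under these assumptions, switching drift levels only swaps the pair (d, b); the drifted weights and the shared matrices are never rewritten, which is the source of the small per-level storage cost the abstract refers to.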

46.7 CV May 17

Stable Routing for Mixture-of-Experts in Class-Incremental Learning

Zirui Guo, Quan Cheng, Da-Wei Zhou et al.

Class-incremental learning (CIL) requires models to learn new classes sequentially while preserving prior knowledge. Recently, approaches that combine pre-trained models with mixture-of-experts (MoE) have received increasing attention in CIL: they typically expand experts during learning and employ a router to assign weights across experts. However, existing MoE methods often overlook routing drift induced by expert expansion. Once new experts are introduced, the router may reassign samples from earlier classes to newly added experts, thereby perturbing previously established expert compositions and causing interference even when old experts remain frozen. We argue that expandable MoE in CIL requires two complementary properties: stable old-class routing for knowledge preservation and sufficient capacity utilization for new-class adaptation. To this end, we propose Stable Routing for MoE (StaR-MoE), a routing-level framework for expandable MoE in CIL. By incorporating sensitivity-aware routing alignment, StaR-MoE aligns current old-class routing behavior with historical routing distributions through sensitivity-guided constraints. Complementarily, StaR-MoE introduces asymmetric capacity regularization to encourage effective utilization of the expanded expert pool without compromising class-specific routing specialization. Extensive experiments across four standard CIL benchmarks demonstrate that StaR-MoE consistently improves both average and last accuracy over state-of-the-art methods, highlighting the importance of stable routing.
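
The routing drift described in this abstract can be made concrete with a small sketch. It assumes a softmax router and measures how far an old-class sample's routing distribution moves after experts are added, using a plain KL divergence against the historical distribution. The function names are hypothetical, and the sensitivity-guided weighting and asymmetric capacity regularization mentioned in the abstract are not modeled here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_drift_kl(old_probs, new_probs, eps=1e-12):
    """KL(old || new), with the old distribution zero-padded for newly added experts.

    A value near zero means the expanded router still sends the sample
    to the same mixture of old experts; a large value means mass has
    moved toward the new experts.
    """
    padded = old_probs + [0.0] * (len(new_probs) - len(old_probs))
    return sum(
        p * math.log(p / max(q, eps))
        for p, q in zip(padded, new_probs)
        if p > 0.0
    )
```

For example, with historical logits [2.0, 0.0], adding a third expert with logit 3.0 yields a clearly positive drift, while a new-expert logit of -50.0 leaves the old routing essentially unchanged. A penalty of this kind is one simple way an alignment term could be built; the paper's actual constraint may differ.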

27.8 AI Mar 16

Why the Valuable Capabilities of LLMs Are Precisely the Unexplainable Ones

Quan Cheng

This paper proposes and argues for a counterintuitive thesis: the truly valuable capabilities of large language models (LLMs) reside precisely in the part that cannot be fully captured by human-readable discrete rules. The core argument is a proof by contradiction via expert system equivalence: if the full capabilities of an LLM could be described by a complete set of human-readable rules, then that rule set would be functionally equivalent to an expert system; but expert systems have been historically and empirically demonstrated to be strictly weaker than LLMs; therefore, a contradiction arises -- the capabilities of LLMs that exceed those of expert systems are exactly the capabilities that cannot be rule-encoded. This thesis is further supported by the Chinese philosophical concept of Wu (sudden insight through practice), the historical failure of expert systems, and a structural mismatch between human cognitive tools and complex systems. The paper discusses implications for interpretability research, AI safety, and scientific epistemology.

23.2 AI Mar 17

Via Negativa for AI Alignment: Why Negative Constraints Are Structurally Superior to Positive Preferences

Quan Cheng

Recent empirical results have demonstrated that training large language models (LLMs) with negative-only feedback can match or exceed standard reinforcement learning from human feedback (RLHF). Negative Sample Reinforcement achieves parity with PPO on mathematical reasoning; Distributional Dispreference Optimization trains effectively using only dispreferred samples; and Constitutional AI outperforms pure RLHF on harmlessness benchmarks. Yet no unified theoretical account explains why negative signals are so effective. This paper proposes such an account: positive preferences and negative constraints are structurally asymmetric. Positive preferences ("which is better") encode continuously coupled, context-dependent human values that cannot be exhaustively specified -- leading models to learn surface correlates such as agreement with the user (sycophancy). Negative constraints ("what is wrong") encode discrete, finite, independently verifiable prohibitions that can converge to a stable boundary. This asymmetry -- rooted in Popper's falsification logic and the epistemology of negative knowledge -- explains both the sycophancy failure of preference-based RLHF and the surprising effectiveness of negative-signal methods. We argue that alignment research should shift its center of gravity from "learning what humans prefer" to "learning what humans reject," and offer testable predictions for this framework.

CV May 17, 2025

Continuous Subspace Optimization for Continual Learning

Quan Cheng, Yuanyu Wan, Lingyu Wu et al.

Continual learning aims to learn multiple tasks sequentially while preserving prior knowledge, but faces the challenge of catastrophic forgetting when adapting to new tasks. Recently, approaches leveraging pre-trained models have gained increasing popularity in mitigating this issue, due to the strong generalization ability of foundation models. To adjust pre-trained models for new tasks, existing methods usually employ low-rank adaptation, which restricts parameter updates to a fixed low-rank subspace. However, constraining the optimization space inherently compromises the model's learning capacity, resulting in inferior performance. To address this limitation, we propose Continuous Subspace Optimization for Continual Learning (CoSO) to fine-tune the model in a series of subspaces rather than a single one. These sequential subspaces are dynamically determined through the singular value decomposition of the gradients. CoSO updates the model by projecting gradients onto these subspaces, ensuring memory-efficient optimization. To mitigate forgetting, the optimization subspace of each task is constrained to be orthogonal to the historical task subspace. During task learning, CoSO maintains a task-specific component that captures the critical update directions for the current task. Upon completing a task, this component is used to update the historical task subspace, laying the groundwork for subsequent learning. Extensive experiments on multiple datasets demonstrate that CoSO significantly outperforms state-of-the-art methods, especially in challenging scenarios with long task sequences.
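
The projection step described in this abstract can be illustrated with a simplified, rank-1 sketch. It assumes the historical task subspace is given as orthonormal vectors in the row dimension of the gradient matrix, removes that component from the gradient, and then projects onto the leading left singular direction of what remains. Power iteration stands in for a full singular value decomposition so the example needs only the standard library. All function names, the choice of rank 1, and the side on which the projection is applied are assumptions, not details confirmed by the abstract.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [dot(row, v) for row in M]


def remove_history(G, history):
    """Return (I - sum_h h h^T) G for orthonormal history vectors h."""
    cols = []
    for col in transpose(G):
        for h in history:
            c = dot(h, col)
            col = [x - c * hx for x, hx in zip(col, h)]
        cols.append(col)
    return transpose(cols)


def top_left_singular_vector(G, iters=200):
    """Leading left singular vector via power iteration on G G^T.

    Returns None for a zero matrix. The fixed start vector could, in
    rare cases, be orthogonal to the leading direction.
    """
    Gt = transpose(G)
    u = [1.0 / (i + 1) for i in range(len(G))]
    for _ in range(iters):
        u = matvec(G, matvec(Gt, u))
        norm = math.sqrt(dot(u, u))
        if norm == 0.0:
            return None
        u = [x / norm for x in u]
    return u


def project_gradient(G, history):
    """Rank-1 projected gradient, orthogonal to the historical subspace."""
    G_perp = remove_history(G, history)
    u = top_left_singular_vector(G_perp)
    if u is None:
        return None, [[0.0] * len(G[0]) for _ in G]
    coeffs = matvec(transpose(G_perp), u)   # u^T G_perp
    return u, [[ui * c for c in coeffs] for ui in u]
```

In this sketch, the returned direction u is what would later be merged into the historical subspace once a task finishes, so that subsequent tasks are steered away from it.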

AR Apr 2, 2025

Efficient Calibration for RRAM-based In-Memory Computing using DoRA

Weirong Dong, Kai Zhou, Zhen Kong et al.

Resistive In-Memory Computing (RIMC) offers ultra-efficient computation for edge AI but faces accuracy degradation due to RRAM conductance drift over time. Traditional retraining methods are limited by RRAM's high energy consumption, write latency, and endurance constraints. We propose a DoRA-based calibration framework that restores accuracy by compensating influential weights with minimal calibration parameters stored in SRAM, leaving RRAM weights untouched. This eliminates in-field RRAM writes, ensuring energy-efficient, fast, and reliable calibration. Experiments on RIMC-based ResNet50 (ImageNet-1K) demonstrate 69.53% accuracy restoration using just 10 calibration samples while updating only 2.34% of parameters.
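
For readers unfamiliar with DoRA, the reparameterization it relies on can be sketched as below. DoRA splits a weight matrix into a per-column magnitude vector and a direction matrix: W' = m * (W + delta) / ||W + delta||, with the norm taken column-wise. The sketch computes the effective weight explicitly for illustration only; how the paper applies the correction in hardware without touching the RRAM array is not specified in the abstract. The function names are hypothetical, and delta stands for the low-rank update that DoRA would normally express as a product of two small matrices.

```python
import math


def column_norms(W):
    """Euclidean norm of each column of a matrix given as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(len(W[0]))]


def dora_effective_weight(W_fixed, delta, magnitude):
    """Effective weight m * (W + delta) / ||W + delta||_column.

    W_fixed   : weights left untouched (here, the drifted array contents)
    delta     : directional update (a low-rank product in DoRA proper)
    magnitude : one trainable scalar per column
    Assumes no column of W_fixed + delta is all zeros.
    """
    V = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W_fixed, delta)]
    norms = column_norms(V)
    return [
        [magnitude[j] * V[i][j] / norms[j] for j in range(len(norms))]
        for i in range(len(V))
    ]
```

With a zero delta and magnitudes equal to the original column norms, the effective weight equals the fixed weight, so calibration starts from the identity and only the small magnitude and update parameters need to be learned and stored.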