Geng Zhang

11.4 | LG | Jul 1, 2025 | Code

MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE

Geng Zhang, Yuxuan Han, Yuxuan Lou et al.

Mixture-of-Experts (MoE) enables efficient scaling of large language models by activating only a subset of experts per input token. However, deploying MoE-based models incurs significant memory overhead because all experts must be kept in memory. While structured pruning is a promising way to reduce memory costs, existing methods often show suboptimal performance and unstable degradation in three dimensions: model architectures, calibration data sources, and calibration sample sizes. This paper proposes Mixture-of-Novices-and-Experts (MoNE), a novel expert pruning method that replaces redundant experts with lightweight novices to achieve effective and robust model compression. MoNE evaluates expert redundancy based on two metrics: access frequency and output variance. Experts exhibiting low usage and stable outputs are pruned and replaced with lightweight novices, which are unbiased estimates of their original outputs, to minimize performance degradation. Extensive experiments demonstrate that MoNE consistently outperforms baseline methods with minimal accuracy degradation across the three dimensions, confirming its effectiveness and robustness. Notably, it improves the average zero-shot accuracy across nine downstream tasks by up to 2.71 at a 25% pruning ratio and 3.61 at a 50% pruning ratio. The code is available at https://github.com/zxgx/mode-pd.
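The pruning idea in this abstract can be illustrated with a rough sketch. This is not the authors' implementation (see the linked repository for that). The abstract does not say how the two metrics are combined or what form a novice takes, so the sketch makes two assumptions: the redundancy score is the product of access frequency and output variance, and a novice is a constant vector equal to the expert's mean output on calibration tokens. All function names are hypothetical.

```python
import numpy as np


def calibration_stats(expert_outputs, routed):
    """Collect per-expert statistics on calibration tokens.

    expert_outputs: float array (num_experts, num_tokens, hidden)
    routed: bool array (num_experts, num_tokens), True where the router
            selected that expert for that token.
    """
    num_experts, _, hidden = expert_outputs.shape
    freq = routed.mean(axis=1)              # access frequency per expert
    var = np.zeros(num_experts)             # mean per-dimension output variance
    mean_out = np.zeros((num_experts, hidden))
    for e in range(num_experts):
        outs = expert_outputs[e][routed[e]]  # outputs on tokens routed to e
        if len(outs) == 0:
            continue                         # never used: stats stay at zero
        mean_out[e] = outs.mean(axis=0)
        var[e] = outs.var(axis=0).mean()
    return freq, var, mean_out


def select_redundant(freq, var, prune_ratio):
    """Pick the experts with the lowest score (low usage, stable outputs).

    ASSUMPTION: score = frequency * variance; the abstract only names the
    two metrics, not how they are combined.
    """
    score = freq * var
    k = int(round(prune_ratio * len(score)))
    return np.argsort(score, kind="stable")[:k]


def build_novices(mean_out, pruned):
    """ASSUMPTION: a novice is the pruned expert's mean calibration output."""
    return {int(e): mean_out[e].copy() for e in pruned}
```

With `prune_ratio=0.5`, half of the experts in a layer would be replaced by constant vectors, so the memory saved is roughly the weights of those experts minus one hidden-size vector each.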

1.2 | CV | Oct 1, 2020

Using Unlabeled Data for Increasing Low-Shot Classification Accuracy of Relevant and Open-Set Irrelevant Images

Spiridon Kasapis, Geng Zhang, Jonathon Smereka et al.

In search, exploration, and reconnaissance tasks performed with autonomous ground vehicles, an image classification capability is needed for specifically identifying targeted objects (relevant classes) and at the same time recognizing when a candidate image does not belong to any one of the relevant classes (irrelevant images). In this paper, we present an open-set low-shot classifier that uses, during its training, a modest number (fewer than 40) of labeled images for each relevant class, and unlabeled irrelevant images that are randomly selected at each epoch of the training process. The new classifier is capable of identifying images from the relevant classes and determining when a candidate image is irrelevant, and it can further recognize categories of irrelevant images that were not included in the training (unseen). The proposed low-shot classifier can be attached as a top layer to any pre-trained feature extractor when constructing a Convolutional Neural Network.


2 Papers