CLApr 8, 2025Code
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuningHanqing Zeng, Yinglong Xia, Zhuokai Zhao et al.
Fine-tuning pre-trained large language models (LLMs) presents a dual challenge of balancing parameter efficiency and model capacity. Existing methods like low-rank adaptations (LoRA) are efficient but lack flexibility, while Mixture-of-Experts (MoE) enhance model capacity at the cost of more & under-utilized parameters. To address these limitations, we propose Structural Mixture of Residual Experts (S'MoRE), a novel framework that seamlessly integrates the efficiency of LoRA with the flexibility of MoE. Conceptually, S'MoRE employs hierarchical low-rank decomposition of expert weights, yielding residuals of varying orders interconnected in a multi-layer structure. By routing input tokens through sub-trees of residuals, S'MoRE emulates the capacity of numerous experts by instantiating and assembling just a few low-rank matrices. We craft the inter-layer propagation of S'MoRE's residuals as a special type of Graph Neural Network (GNN), and prove that under similar parameter budget, S'MoRE improves structural flexibility of traditional MoE (or Mixture-of-LoRA) by exponential order. Comprehensive theoretical analysis and empirical results demonstrate that S'MoRE achieves superior fine-tuning performance, offering a transformative approach for efficient LLM adaptation. Our implementation is available at: https://github.com/ZimpleX/SMoRE-LLM.
PLMay 25, 2021
Comparative Synthesis: Learning Near-Optimal Network Designs by QueryYanjun Wang, Zixuan Li, Chuan Jiang et al.
When managing wide-area networks, network architects must decide how to balance multiple conflicting metrics, and ensure fair allocations to competing traffic while prioritizing critical traffic. The state of practice poses challenges since architects must precisely encode their intent into formal optimization models using abstract notions such as utility functions, and ad-hoc manually tuned knobs. In this paper, we present the first effort to synthesize optimal network designs with indeterminate objectives using an interactive program-synthesis-based approach. We make three contributions. First, we present comparative synthesis, an interactive synthesis framework which produces near-optimal programs (network designs) through two kinds of queries (Propose and Compare), without an objective explicitly given. Second, we develop the first learning algorithm for comparative synthesis in which a voting-guided learner picks the most informative query in each iteration. We present theoretical analysis of the convergence rate of the algorithm. Third, we implemented Net10Q, a system based on our approach, and demonstrate its effectiveness on four real-world network case studies using black-box oracles and simulation experiments, as well as a pilot user study comprising network researchers and practitioners. Both theoretical and experimental results show the promise of our approach.
IVOct 19, 2019
Gastroscopic Panoramic View: Application to Automatic Polyps Detection under GastroscopyChenfei Shi, Yan Xue, Chuan Jiang et al.
Endoscopic diagnosis is an important means for gastric polyp detection. In this paper, a panoramic image of gastroscopy is developed, which can display the inner surface of the stomach intuitively and comprehensively. Moreover, the proposed automatic detection solution can help doctors locate the polyps automatically, and reduce missed diagnosis. The main contributions of this paper are: firstly, a gastroscopic panorama reconstruction method is developed. The reconstruction does not require additional hardware devices, and can solve the problem of texture dislocation and illumination imbalance properly; secondly, an end-to-end multi-object detection for gastroscopic panorama is trained based on deep learning framework. Compared with traditional solutions, the automatic polyp detection system can locate all polyps in the inner wall of stomach in real time and assist doctors to find the lesions. Thirdly, the system was evaluated in the Affiliated Hospital of Zhejiang University. The results show that the average error of the panorama is less than 2 mm, the accuracy of the polyp detection is 95%, and the recall rate is 99%. In addition, the research roadmap of this paper has guiding significance for endoscopy-assisted detection of other human soft cavities.