Jing Ye

h-index17

21papers

137citations

Novelty52%

AI Score59

Ranked #11,118 of 201,326 authors (top 6%)#2,675 in CL (top 8%)

21 Papers

LGSep 15, 2023Code

CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending

Shiyi Zhu, Jing Ye, Wei Jiang et al.

Self-attention and position embedding are two key modules in transformer-based Large Language Models (LLMs). However, the potential relationship between them is far from well studied, especially for long context window extending. In fact, anomalous behaviors harming long context extrapolation exist between Rotary Position Embedding (RoPE) and vanilla self-attention unveiled by our work. To address this issue, we propose a novel attention mechanism, CoCA (Collinear Constrained Attention). Specifically, we enforce a collinear constraint between $Q$ and $K$ to seamlessly integrate RoPE and self-attention. While only adding minimal computational and spatial complexity, this integration significantly enhances long context window extrapolation ability. We provide an optimized implementation, making it a drop-in replacement for any existing transformer-based models. Extensive experiments show that CoCA performs extraordinarily well in extending context windows. A CoCA-based GPT model, trained with a context length of 512, can seamlessly extend the context window up to 32K (60$\times$), without any fine-tuning. Additionally, by dropping CoCA in LLaMA-7B, we achieve extrapolation up to 32K within only 2K training length. Our code is publicly available at: https://github.com/codefuse-ai/Collinear-Constrained-Attention

IVNov 1, 2023Code

DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

Xiaohua Jiang, Yihao Guo, Jian Huang et al.

The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, efficacy of surgical interventions, and reliability of postoperative assessments. Despite their significance, the effective segmentation and instantaneous three-dimensional reconstruction are significantly impeded by the paucity of representative samples in available datasets and noise artifacts. To surmount these challenges, we introduced Stochastic Defect Injection (SDi) to augment the representational diversity of challenging indistinct-boundary objects within training corpora. Consequently, we propose the Dual-Encoder Fourier Group Harmonics Network (DEFN) to tailor noise filtration, amplify detailed feature recognition, and bolster representation across diverse medical imaging scenarios. By incorporating Dynamic Weight Composing (DWC) loss dynamically adjusts model's focus based on training progression, DEFN achieves SOTA performance on the OIMHS public dataset, showcasing effectiveness in indistinct boundary contexts. Source code for DEFN is available at: https://github.com/IMOP-lab/DEFN-pytorch.

ARMay 27

FT-Pilot: Automated Fault-Tolerant RTL Rewriting via Vulnerability-Guided LLMs

Weixing Liu, Zizhen Liu, Jing Ye et al.

As integrated circuit technologies continue to scale toward advanced process nodes, the continual reduction in node capacitance and supply voltage has made digital systems increasingly vulnerable to soft errors. Although traditional full-chip hardening methods can improve reliability, they often incur unacceptable area and power overhead, making selective hardening a more practical engineering solution. However, existing approaches typically rely on time-consuming fault-injection simulation to determine hardening locations through vulnerability analysis, and still depend heavily on manual strategy selection and RTL modification during the hardening stage, making them ill-suited for efficient automated reliability optimization at early design stages. To address these challenges, this paper proposes FT-Pilot, a GNN-guided LLM framework for automatic RTL soft-error hardening. The framework first employs a GNN to identify critical vulnerable assets directly at the RTL level, and then introduces an LLM-driven rewriting engine composed of an analyzer and a rewriter, which performs RTL-level fault-tolerant code rewriting with the support of dual-knowledge-base retrieval-augmented generation and an automatic repair mechanism. Experimental results show that the proposed framework can automatically generate hardened RTL designs that are syntactically correct, functionally correct, and synthesizable across multiple benchmark circuits, while significantly reducing output error rates under soft-error scenarios. This work provides a practical automated path toward shift-left reliability optimization at the RTL level.

ARApr 8Code

CoverAssert: Iterative LLM Assertion Generation Driven by Functional Coverage via Syntax-Semantic Representations

Yonghao Wang, Yang Yin, Hongqin Lyu et al.

LLMs can generate SystemVerilog assertions (SVAs) from natural language specs, but single-pass outputs often lack functional coverage due to limited IC design understanding. We propose CoverAssert, an iterative framework that clusters semantic and AST-based structural features of assertions, maps them to specifications, and uses functional coverage feedback to guide LLMs in prioritizing uncovered points. Experiments on four open-source designs show that integrating CoverAssert with AssertLLM and Spec2Assertion improves average improvements of 9.57 % in branch coverage, 9.64 % in statement coverage, and 15.69 % in toggle coverage.

AIMar 16

Listening to the Echo: User-Reaction Aware Policy Optimization via Scalar-Verbal Hybrid Reinforcement Learning

Jing Ye, Xinpei Zhao, Lu Xiang et al.

While current emotional support dialogue systems typically rely on expert-defined scalar rewards for alignment, these signals suffer from severe information sparsity. They cannot explain why a response failed or how to adapt to dynamic user states, often diverging from the actual goal of facilitating positive emotional shifts. In practice, the most direct and reliable learning signal emerges from the user's continuous reactions during ongoing interaction. We therefore propose Reaction Aware Policy Optimization (RAPO), a framework that optimizes over interaction consequences rather than rubric scores. RAPO treats dialogue as a reaction-driven process and utilizes simulated user responses to generate dense natural-language feedback through three core components: Hindsight Dialogue Selection, which isolates pivotal turns that meaningfully alter user emotional trajectories; Generative Hindsight Feedback, which transforms user reactions into contrastive ranking signals and natural-language critiques; and Scalar-Verbal Hybrid Policy Optimization, which couples scalar reward optimization for global alignment with verbal feedback distillation for fine-grained semantic refinement. Extensive experiments on ESC and Sotopia demonstrate that RAPO significantly outperforms strong reinforcement learning baselines in driving positive interaction outcomes.

ARApr 17

Pecker: Bug Localization Framework for Sequential Designs via Causal Chain Reconstruction

Jiaping Tang, Jianan Mu, Tianyun Ma et al.

Debugging represents a time-consuming and labor-intensive task in hardware design, with bug localization constituting a substantial portion of this process. While spectrum-based bug localization techniques have achieved remarkable success in software domains and shown promise for hardware description languages, their effectiveness severely degrades in sequential designs. Unlike software programs, hardware designs exhibit intrinsic temporal characteristics that create fundamental challenges: timing misalignment between bug activation and observation, and progressive error propagation through state elements that obscures the root cause. To address these limitations, we propose Pecker, a novel bug localization framework that reconstructs the broken causal chain in sequential designs. Our approach introduces two key innovations: temporal backtracking using Estimated Minimal Propagation Cycles to identify potential activation cycles, strategic trace pruning to eliminate state pollution effects. We evaluate Pecker on comprehensive benchmarks comprising both combinational and sequential circuits. Experimental results demonstrate that Pecker effectively localizes 51%/80%/85% bugs within Top-1/3/5 ranks respectively, significantly outperforming state-of-the-art techniques. Notably, Pecker maintains robust performance across circuit complexities while existing methods exhibit severe degradation on sequential designs.

CLMay 29, 2025Code

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

Feiteng Fang, Ting-En Lin, Yuchuan Wu et al.

Role-Playing Language Agents (RPLAs) aim to simulate characters for realistic and engaging human-computer interactions. However, traditional reward models often struggle with scalability and adapting to subjective conversational preferences. We propose ChARM, a Character-based Act-adaptive Reward Model, addressing these challenges through two innovations: (1) an act-adaptive margin that significantly enhances learning efficiency and generalizability, and (2) a self-evolution mechanism leveraging large-scale unlabeled data to improve training coverage. Additionally, we introduce RoleplayPref, the first large-scale preference dataset specifically for RPLAs, featuring 1,108 characters, 13 subcategories, and 16,888 bilingual dialogues, alongside RoleplayEval, a dedicated evaluation benchmark. Experimental results show a 13% improvement over the conventional Bradley-Terry model in preference rankings. Furthermore, applying ChARM-generated rewards to preference learning techniques (e.g., direct preference optimization) achieves state-of-the-art results on CharacterEval and RoleplayEval. Code and dataset are available at https://github.com/calubkk/ChARM.

CRMar 24

On the Vulnerability of FHE Computation to Silent Data Corruption

Jianan Mu, Ge Yu, Zhaoxuan Kan et al.

Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services, enabling computation directly on encrypted data. As FHE implementations mature and begin moving toward practical deployment in domains such as secure finance, biomedical analytics, and privacy-preserving AI, a critical question remains insufficiently explored: how reliable is FHE computation on real hardware? This question is especially important because, compared with plaintext computation, FHE incurs much higher computational overhead, making it more susceptible to transient hardware faults. Moreover, data corruptions are likely to remain silent: the FHE service has no access to the underlying plaintext, causing unawareness even though the corresponding decrypted result has already been corrupted. To this end, we conduct a comprehensive evaluation of SDCs in FHE ciphertext computation. Through large-scale fault-injection experiments, we characterize the vulnerability of FHE to transient faults, and through a theoretical analysis of error-propagation behaviors, we gain deeper algorithmic insight into the mechanisms underlying this vulnerability. We further assess the effectiveness of different fault-tolerance mechanisms for mitigating these faults.

SEOct 9, 2025Code

Faver: Boosting LLM-based RTL Generation with Function Abstracted Verifiable Middleware

Jianan Mu, Mingyu Shi, Yining Wang et al.

LLM-based RTL generation is an interesting research direction, as it holds the potential to liberate the least automated stage in the current chip design. However, due to the substantial semantic gap between high-level specifications and RTL, coupled with limited training data, existing models struggle with generation accuracy. Drawing on human experience, design with verification helps improving accuracy. However, as the RTL testbench data are even more scarce, it is not friendly for LLMs. Although LLMs excel at higher-level languages like Python/C, they have a huge semantic gap from RTL. When implementing the same functionality, Python/C code and hardware code differ significantly in the spatiotemporal granularity, requiring the LLM not only to consider high-level functional semantics but also to ensure the low-level details align with the circuit code. It is not an easy task. In this paper, we propose a function abstracted verifiable middleware (Faver) that streamlines RTL verification in LLM-based workflows. By mixing LLM-friendly code structures with a rule-based template, Faver decouples the details of circuit verification, allowing the LLM to focus on the functionality itself. In our experiments on the SFT model and open-source models, Faver improved the model's generation accuracy by up to 14%.

IVJul 29, 2025Code

Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images

Yutao Hu, Ying Zheng, Shumei Miao et al.

Foundation models have demonstrated remarkable potential in medical domain. However, their application to complex cardiovascular diagnostics remains underexplored. In this paper, we present Cardiac-CLIP, a multi-modal foundation model designed for 3D cardiac CT images. Cardiac-CLIP is developed through a two-stage pre-training strategy. The first stage employs a 3D masked autoencoder (MAE) to perform self-supervised representation learning from large-scale unlabeled volumetric data, enabling the visual encoder to capture rich anatomical and contextual features. In the second stage, contrastive learning is introduced to align visual and textual representations, facilitating cross-modal understanding. To support the pre-training, we collect 16641 real clinical CT scans, supplemented by 114k publicly available data. Meanwhile, we standardize free-text radiology reports into unified templates and construct the pathology vectors according to diagnostic attributes, based on which the soft-label matrix is generated to supervise the contrastive learning process. On the other hand, to comprehensively evaluate the effectiveness of Cardiac-CLIP, we collect 6,722 real-clinical data from 12 independent institutions, along with the open-source data to construct the evaluation dataset. Specifically, Cardiac-CLIP is comprehensively evaluated across multiple tasks, including cardiovascular abnormality classification, information retrieval and clinical analysis. Experimental results demonstrate that Cardiac-CLIP achieves state-of-the-art performance across various downstream tasks in both internal and external data. Particularly, Cardiac-CLIP exhibits great effectiveness in supporting complex clinical tasks such as the prospective prediction of acute coronary syndrome, which is notoriously difficult in real-world scenarios.

CLMay 9, 2024Code

OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Dan Qiao, Yi Su, Pinzheng Wang et al.

Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities.However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived from multi-stage compression and continual pre-training from the original 15B OpenBA model. OpenBA-V2 utilizes more data, more flexible training objectives, and techniques such as layer pruning, neural pruning, and vocabulary pruning to achieve a compression rate of 77.3\% with minimal performance loss. OpenBA-V2 demonstrates competitive performance compared to other open-source models of similar size, achieving results close to or on par with the 15B OpenBA model in downstream tasks such as common sense reasoning and Named Entity Recognition (NER). OpenBA-V2 illustrates that LLMs can be compressed into smaller ones with minimal performance loss by employing advanced training objectives and data strategies, which may help deploy LLMs in resource-limited scenarios.

CLDec 11, 2024

SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent

Jing Ye, Lu Xiang, Yaping Zhang et al.

Large Language Models (LLMs) have demonstrated promising potential in providing empathetic support during interactions. However, their responses often become verbose or overly formulaic, failing to adequately address the diverse emotional support needs of real-world scenarios. To tackle this challenge, we propose an innovative strategy-enhanced role-playing framework, designed to simulate authentic emotional support conversations. Specifically, our approach unfolds in two steps: (1) Strategy-Enhanced Role-Playing Interactions, which involve three pivotal roles -- Seeker, Strategy Counselor, and Supporter -- engaging in diverse scenarios to emulate real-world interactions and promote a broader range of dialogues; and (2) Emotional Support Agent Training, achieved through fine-tuning LLMs using our specially constructed dataset. Within this framework, we develop the \textbf{ServeForEmo} dataset, comprising an extensive collection of 3.7K+ multi-turn dialogues and 62.8K+ utterances. We further present \textbf{SweetieChat}, an emotional support agent capable of handling diverse open-domain scenarios. Extensive experiments and human evaluations confirm the framework's effectiveness in enhancing emotional support, highlighting its unique ability to provide more nuanced and tailored assistance.

CLMar 26, 2024

MapGuide: A Simple yet Effective Method to Reconstruct Continuous Language from Brain Activities

Xinpei Zhao, Jingyuan Sun, Shaonan Wang et al.

Decoding continuous language from brain activity is a formidable yet promising field of research. It is particularly significant for aiding people with speech disabilities to communicate through brain signals. This field addresses the complex task of mapping brain signals to text. The previous best attempt reverse-engineered this process in an indirect way: it began by learning to encode brain activity from text and then guided text generation by aligning with predicted brain responses. In contrast, we propose a simple yet effective method that guides text reconstruction by directly comparing them with the predicted text embeddings mapped from brain activities. Comprehensive experiments reveal that our method significantly outperforms the current state-of-the-art model, showing average improvements of 77% and 54% on BLEU and METEOR scores. We further validate the proposed modules through detailed ablation studies and case analyses and highlight a critical correlation: the more precisely we map brain activities to text embeddings, the better the text reconstruction results. Such insight can simplify the task of reconstructing language from brain activities for future work, emphasizing the importance of improving brain-to-text-embedding mapping techniques.

AIFeb 5

ALIVE: Awakening LLM Reasoning via Adversarial Learning and Instructive Verbal Evaluation

Yiwen Duan, Jing Ye, Xinpei Zhao

The quest for expert-level reasoning in Large Language Models (LLMs) has been hampered by a persistent \textit{reward bottleneck}: traditional reinforcement learning (RL) relies on scalar rewards that are \textbf{costly} to scale, \textbf{brittle} across domains, and \textbf{blind} to the underlying logic of a solution. This reliance on external, impoverished signals prevents models from developing a deep, self-contained understanding of reasoning principles. We introduce \textbf{ALIVE} (\emph{Adversarial Learning with Instructive Verbal Evaluation}), a hands-free alignment framework that moves beyond scalar reward optimization toward intrinsic reasoning acquisition. Grounded in the principle of \emph{Cognitive Synergy}, ALIVE unifies problem posing, solving, and judging within a single policy model to internalize the logic of correctness. By coupling adversarial learning with instructive verbal feedback, ALIVE enables models to internalize evaluative criteria directly from raw corpora, effectively transforming external critiques into an endogenous reasoning faculty. Empirical evaluations across mathematical reasoning, code generation, and general logical inference benchmarks demonstrate that ALIVE consistently mitigates reward signal limitations. With identical data and compute, it achieves accuracy gains, markedly improved cross-domain generalization, and higher self-correction rates. These results indicate that the reasoning trinity fosters a self-sustaining trajectory of capability growth, positioning ALIVE as a scalable foundation for general-purpose reasoning alignment without human-in-the-loop supervision.

AIJan 26

Beyond Text-to-SQL: Can LLMs Really Debug Enterprise ETL SQL?

Jing Ye, Yiwen Duan, Yonghong Yu et al.

SQL is central to enterprise data engineering, yet generating fully correct SQL code in a single attempt remains difficult, even for experienced developers and advanced text-to-SQL LLMs, often requiring multiple debugging iterations. We introduce OurBench, the first benchmark for enterprise-level SQL reasoning and debugging. Our benchmark is built on two key innovations: (1) an automated construction workflow that uses reverse engineering to systematically inject realistic bugs into large-scale SQL code, enabling scalable and diverse benchmark generation; and (2) an execution-free evaluation framework tailored to enterprise settings, providing fast, accurate, and resource-efficient assessment. OurBench comprises 469 OurBenchSyn queries featuring syntax errors with explicit error messages, and 516 OurBenchSem queries targeting semantic errors in which the code fails to meet user intent. The queries are highly complex, averaging over 140 lines and featuring deep and wide abstract syntax trees. Evaluation of nearly 30 LLMs reveals a substantial performance gap: the best-performing model, Claude-4-Sonnet, achieves only 36.46 percent accuracy on OurBenchSyn and 32.17 percent on OurBenchSem, while most models score below 20 percent. We further explore four solution strategies, identify key challenges, and outline promising directions for enterprise SQL debugging with LLMs.

CLMay 22, 2025

From Generic Empathy to Personalized Emotional Support: A Self-Evolution Framework for User Preference Alignment

Jing Ye, Lu Xiang, Yaping Zhang et al.

Effective emotional support hinges on understanding users' emotions and needs to provide meaningful comfort during multi-turn interactions. Large Language Models (LLMs) show great potential for expressing empathy; however, they often deliver generic and one-size-fits-all responses that fail to address users' specific needs. To tackle this issue, we propose a self-evolution framework designed to help LLMs improve their responses to better align with users' implicit preferences concerning user profiles (personalities), emotional states, and specific situations. Our framework consists of two distinct phases: \textit{(1)} \textit{Emotional Support Experience Acquisition}, where LLMs are fine-tuned on limited emotional support conversation data to provide basic support, and \textit{(2)} \textit{Self-Improvement for Personalized Emotional Support}, where LLMs leverage self-reflection and self-refinement to generate personalized responses. Through iterative direct preference optimization between the pre- and post-refined responses, our model generates responses that reflect a better understanding of the user's implicit preferences. Extensive experiments and evaluations demonstrate that our method significantly enhances the model's performance in emotional support, reducing unhelpful responses and minimizing discrepancies between user preferences and model outputs.

CLJan 4

EmoHarbor: Evaluating Personalized Emotional Support by Simulating the User's Internal World

Jing Ye, Lu Xiang, Yaping Zhang et al.

Current evaluation paradigms for emotional support conversations tend to reward generic empathetic responses, yet they fail to assess whether the support is genuinely personalized to users' unique psychological profiles and contextual needs. We introduce EmoHarbor, an automated evaluation framework that adopts a User-as-a-Judge paradigm by simulating the user's inner world. EmoHarbor employs a Chain-of-Agent architecture that decomposes users' internal processes into three specialized roles, enabling agents to interact with supporters and complete assessments in a manner similar to human users. We instantiate this benchmark using 100 real-world user profiles that cover a diverse range of personality traits and situations, and define 10 evaluation dimensions of personalized support quality. Comprehensive evaluation of 20 advanced LLMs on EmoHarbor reveals a critical insight: while these models excel at generating empathetic responses, they consistently fail to tailor support to individual user contexts. This finding reframes the central challenge, shifting research focus from merely enhancing generic empathy to developing truly user-aware emotional support. EmoHarbor provides a reproducible and scalable framework to guide the development and evaluation of more nuanced and user-aware emotional support systems.

LGNov 28, 2025

ParaGate: Parasitic-Driven Domain Adaptation Transfer Learning for Netlist Performance Prediction

Bin Sun, Jingyi Zhou, Jianan Mu et al.

In traditional EDA flows, layout-level performance metrics are only obtainable after placement and routing, hindering global optimization at earlier stages. Although some neural-network-based solutions predict layout-level performance directly from netlists, they often face generalization challenges due to the black-box heuristics of commercial placement-and-routing tools, which create disparate data across designs. To this end, we propose ParaGate, a three-step cross-stage prediction framework that infers layout-level timing and power from netlists. First, we propose a two-phase transfer-learning approach to predict parasitic parameters, pre-training on mid-scale circuits and fine-tuning on larger ones to capture extreme conditions. Next, we rely on EDA tools for timing analysis, offloading the long-path numerical reasoning. Finally, ParaGate performs global calibration using subgraph features. Experiments show that ParaGate achieves strong generalization with minimal fine-tuning data: on openE906, its arrival-time R2 from 0.119 to 0.897. These results demonstrate that ParaGate could provide guidance for global optimization in the synthesis and placement stages.

ARNov 25, 2025

InF-ATPG: Intelligent FFR-Driven ATPG with Advanced Circuit Representation Guided Reinforcement Learning

Bin Sun, Rengang Zhang, Zhiteng Chao et al.

Automatic test pattern generation (ATPG) is a crucial process in integrated circuit (IC) design and testing, responsible for efficiently generating test patterns. As semiconductor technology progresses, traditional ATPG struggles with long execution times to achieve the expected fault coverage, which impacts the time-to-market of chips. Recent machine learning techniques, like reinforcement learning (RL) and graph neural networks (GNNs), show promise but face issues such as reward delay in RL models and inadequate circuit representation in GNN-based methods. In this paper, we propose InF-ATPG, an intelligent FFR-driven ATPG framework that overcomes these challenges by using advanced circuit representation to guide RL. By partitioning circuits into fanout-free regions (FFRs) and incorporating ATPG-specific features into a novel QGNN architecture, InF-ATPG enhances test pattern generation efficiency. Experimental results show InF-ATPG reduces backtracks by 55.06\% on average compared to traditional methods and 38.31\% compared to the machine learning approach, while also improving fault coverage.

CVJun 24, 2025

AMF-MedIT: An Efficient Align-Modulation-Fusion Framework for Medical Image-Tabular Data

Congjing Yu, Jing Ye, Yang Liu et al.

Multimodal medical analysis combining image and tabular data has gained increasing attention. However, effective fusion remains challenging due to cross-modal discrepancies in feature dimensions and modality contributions, as well as the noise from high-dimensional tabular inputs. To address these problems, we present AMF-MedIT, an efficient Align-Modulation-Fusion framework for medical image and tabular data integration, particularly under data-scarce conditions. Built upon a self-supervised learning strategy, we introduce the Adaptive Modulation and Fusion (AMF) module, a novel, streamlined fusion paradigm that harmonizes dimension discrepancies and dynamically balances modality contributions. It integrates prior knowledge to guide the allocation of modality contributions in the fusion and employs feature masks together with magnitude and leakage losses to adjust the dimensionality and magnitude of unimodal features. Additionally, we develop FT-Mamba, a powerful tabular encoder leveraging a selective mechanism to handle noisy medical tabular data efficiently. Extensive experiments, including simulations of clinical noise, demonstrate that AMF-MedIT achieves superior accuracy, robustness, and data efficiency across multimodal classification tasks. Interpretability analyses further reveal how FT-Mamba shapes multimodal pretraining and enhances the image encoder's attention, highlighting the practical value of our framework for reliable and efficient clinical artificial intelligence applications.

CRNov 24, 2021

SASH: Efficient Secure Aggregation Based on SHPRG For Federated Learning

Zizhen Liu, Si Chen, Jing Ye et al.

To prevent private training data leakage in Fed?erated Learning systems, we propose a novel se?cure aggregation scheme based on seed homomor?phic pseudo-random generator (SHPRG), named SASH. SASH leverages the homomorphic property of SHPRG to simplify the masking and demask?ing scheme, which for each of the clients and for the server, entails an overhead linear w.r.t model size and constant w.r.t number of clients. We prove that even against worst-case colluding adversaries, SASH preserves training data privacy, while being resilient to dropouts without extra overhead. We experimentally demonstrate SASH significantly improves the efficiency to 20 times over baseline, especially in the more realistic case where the numbers of clients and model size become large, and a cer?tain percentage of clients drop out from the system.