PL May 26
PoTo: A Hybrid Andersen's Points-to Analysis for Python
Ingkarat Rak-amnouykit, Ana Milanova, Guillaume Baudart et al.
As Python is increasingly adopted for large and complex programs, the importance of static analysis for Python (such as type inference) grows. Unfortunately, static analysis for Python remains challenging due to the language's dynamic features and its abundant external libraries. To help fill this gap, this paper presents PoTo, an Andersen-style context-insensitive and flow-insensitive points-to analysis for Python. PoTo addresses Python-specific challenges and works for large programs via a novel hybrid evaluation, integrating traditional static points-to analysis with concrete evaluation in the Python interpreter for external library calls. Next, this paper presents PoTo+, a static type inference for Python built on the points-to analysis. We evaluate PoTo+ and compare it to two state-of-the-art Python type inference techniques: (1) the static rule-based Pytype and (2) the deep-learning-based DLInfer. Our results show that PoTo+ outperforms both Pytype and DLInfer on existing Python packages.
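To make "Andersen-style, context-insensitive and flow-insensitive" concrete, here is a minimal sketch of an inclusion-based points-to solver over a toy statement language. It is not PoTo's implementation: the statement encoding, the `andersen` function, and the allocation-site names are assumptions made for illustration, and the hybrid concrete evaluation of library calls is not modeled.

```python
from collections import defaultdict

def andersen(statements):
    """Toy inclusion-based points-to solver (hypothetical encoding).

    Each statement is one of:
      ("new",   x, site)   x = <object allocated at site>
      ("copy",  x, y)      x = y
      ("store", x, f, y)   x.f = y
      ("load",  x, y, f)   x = y.f
    Statement order is ignored, which is what flow-insensitive means.
    """
    # Maps a variable, or an (allocation site, field) pair, to allocation sites.
    pts = defaultdict(set)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for st in statements:
            kind = st[0]
            if kind == "new":
                _, x, site = st
                updates = [(x, {site})]
            elif kind == "copy":
                _, x, y = st
                updates = [(x, set(pts[y]))]
            elif kind == "store":
                _, x, f, y = st
                updates = [((o, f), set(pts[y])) for o in list(pts[x])]
            elif kind == "load":
                _, x, y, f = st
                updates = [(x, set(pts[(o, f)])) for o in list(pts[y])]
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for key, new in updates:
                if not new <= pts[key]:
                    pts[key] |= new
                    changed = True
    return dict(pts)

# a = A(); b = B(); a.f = b; c = a; d = c.f   (listed out of order on purpose)
program = [
    ("load", "d", "c", "f"),
    ("copy", "c", "a"),
    ("store", "a", "f", "b"),
    ("new", "a", "A1"),
    ("new", "b", "B1"),
]
result = andersen(program)
```

Because the solver only accumulates subset constraints until nothing changes, `result["d"]` ends up containing the `B1` site even though the load is listed before the store.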
CR May 8
Improving Parameter-Efficient Federated Learning with Differentially Private Refactorization
Linh Tran, Ana Milanova, Stacy Patterson
Federated Learning (FL) with parameter-efficient fine-tuning, such as Low-Rank Adaptation (LoRA), enables scalable model training on distributed data. However, when combined with Differential Privacy (DP), LoRA often introduces errors during global aggregation and amplifies the negative effect of DP noise. Existing cross-silo FL approaches mitigate the aggregation error by freezing one LoRA module and applying output perturbation. However, in a restricted low-rank subspace, this additive noise frequently overwhelms the signals of the weight matrices, leading to suboptimal accuracy. To address this vulnerability, we propose FedPower, a differentially private cross-silo FL framework that reshapes server-side aggregation. Instead of perturbing mismatched low-rank factors, FedPower explicitly reconstructs and clips full-rank client updates to bound the sensitivity. The server then projects the exact aggregated update back into a secure low-rank space using PowerDP, a novel differentially private low-rank factorization mechanism. Based on simultaneous subspace iteration, PowerDP injects calibrated DP noise prior to the final orthonormalization step, which mitigates the negative effect of DP noise by preserving matrix orthogonality. We provide rigorous theoretical analyses establishing sensitivity bounds for subspace projections, proving that FedPower achieves both sample-level and client-level DP. Extensive experiments on various language understanding tasks in cross-silo FL settings show that FedPower is robust against tight privacy budgets while adding negligible computational overhead. An additional empirical study on different DP noise injection schemes validates the effectiveness of PowerDP in improving the tradeoff between accuracy and privacy. Evaluation on three different membership inference attacks validates the robustness and privacy-preserving capability of the proposed framework.
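The aggregation error mentioned in the abstract can be seen on a two-client toy case: averaging the LoRA factors separately and multiplying them is not the same as averaging the reconstructed updates. The sketch below, in plain Python, reconstructs each client's full-rank update, clips its Frobenius norm, and averages. All function names and the clipping threshold are assumptions for illustration. It omits DP noise and the PowerDP projection step, so it is not the FedPower algorithm itself.

```python
import math

def matmul(B, A):
    """Multiply matrices given as lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def clip_frobenius(M, clip_norm):
    """Scale M down so its Frobenius norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for row in M for v in row))
    scale = 1.0 if norm == 0 else min(1.0, clip_norm / norm)
    return [[v * scale for v in row] for row in M]

def mean_matrices(mats):
    """Element-wise mean of equally shaped matrices."""
    n = len(mats)
    return [[sum(m[i][j] for m in mats) / n
             for j in range(len(mats[0][0]))] for i in range(len(mats[0]))]

def aggregate_full_rank(client_factors, clip_norm):
    """Reconstruct B_i A_i per client, clip, then average (no noise added)."""
    updates = [clip_frobenius(matmul(B, A), clip_norm) for B, A in client_factors]
    return mean_matrices(updates)

def aggregate_factors_naively(client_factors):
    """Average B and A separately, then multiply: the mismatched variant."""
    B_bar = mean_matrices([B for B, _ in client_factors])
    A_bar = mean_matrices([A for _, A in client_factors])
    return matmul(B_bar, A_bar)

# Two hypothetical clients with rank-1 factors (B is 2x1, A is 1x2).
clients = [
    ([[1.0], [0.0]], [[1.0, 0.0]]),
    ([[0.0], [1.0]], [[0.0, 1.0]]),
]
exact = aggregate_full_rank(clients, clip_norm=1.0)
naive = aggregate_factors_naively(clients)
```

Here `exact` is half the identity matrix, while `naive` is a matrix with every entry equal to 0.25, so the two aggregation orders disagree even before any privacy noise is added. Clipping bounds how much any single client can change the average, which is the quantity a DP mechanism needs to calibrate its noise.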
LG Jan 23, 2025
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
Linh Tran, Wei Sun, Stacy Patterson et al.
Multimodal Large Language Models (LLMs) are pivotal in revolutionizing customer support and operations by integrating multiple modalities such as text, images, and audio. Federated Prompt Learning (FPL) is a recently proposed approach that combines pre-trained multimodal LLMs such as vision-language models with federated learning to create personalized, privacy-preserving AI systems. However, balancing the competing goals of personalization, generalization, and privacy remains a significant challenge. Over-personalization can lead to overfitting, reducing generalizability, while stringent privacy measures, such as differential privacy, can hinder both personalization and generalization. In this paper, we propose a Differentially Private Federated Prompt Learning (DP-FPL) approach to tackle this challenge by leveraging a low-rank factorization scheme to capture generalization while maintaining a residual term that preserves expressiveness for personalization. To ensure privacy, we introduce a novel method where we apply local differential privacy to the two low-rank components of the local prompt, and global differential privacy to the global prompt. Our approach mitigates the impact of privacy noise on the model performance while balancing the tradeoff between personalization and generalization. Extensive experiments demonstrate the effectiveness of our approach over other benchmarks.
LG Jan 23, 2025
PBM-VFL: Vertical Federated Learning with Feature and Sample Privacy
Linh Tran, Timothy Castiglia, Stacy Patterson et al.
We present Poisson Binomial Mechanism Vertical Federated Learning (PBM-VFL), a communication-efficient Vertical Federated Learning algorithm with Differential Privacy guarantees. PBM-VFL combines Secure Multi-Party Computation with the recently introduced Poisson Binomial Mechanism to protect parties' private datasets during model training. We define the novel concept of feature privacy and analyze end-to-end feature and sample privacy of our algorithm. We compare sample privacy loss in VFL with privacy loss in HFL. We also provide the first theoretical characterization of the relationship between privacy budget, convergence error, and communication cost in differentially-private VFL. Finally, we empirically show that our model performs well with high levels of privacy.
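For readers unfamiliar with the Poisson Binomial Mechanism, the following is a rough scalar sketch of how it is commonly described: each party maps a bounded value to a Binomial success probability, sends only the integer sample, and the receiver decodes the sum into an unbiased mean estimate. The function names, the parameters `c`, `theta`, and `m`, and the example values are assumptions. The sum here is computed in the clear, whereas PBM-VFL combines the mechanism with Secure Multi-Party Computation, so this is not the paper's protocol.

```python
import random

def pbm_encode(x, c, theta, m, rng):
    """Sample Binomial(m, 1/2 + theta * x / c) for a value x in [-c, c].

    theta in (0, 1/4] trades accuracy for privacy in this sketch.
    """
    if not -c <= x <= c:
        raise ValueError("x must lie in [-c, c]")
    p = 0.5 + theta * x / c
    return sum(1 for _ in range(m) if rng.random() < p)

def pbm_decode_mean(total, n, c, theta, m):
    """Turn the sum of n integer samples into an unbiased mean estimate."""
    return c / (m * n * theta) * (total - m * n / 2)

rng = random.Random(0)                      # fixed seed for repeatability
values = [rng.uniform(-1.0, 1.0) for _ in range(200)]
total = sum(pbm_encode(x, 1.0, 0.25, 64, rng) for x in values)
estimate = pbm_decode_mean(total, len(values), 1.0, 0.25, 64)
true_mean = sum(values) / len(values)
```

Since each sample has expectation `m * (1/2 + theta * x / c)`, subtracting `m * n / 2` from the sum and rescaling recovers the mean in expectation. Only small integers are transmitted, which is where the communication efficiency comes from.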
PL Dec 8, 2019
Formalizing Event-Driven Behavior of Serverless Applications
Matthew Obetz, Stacy Patterson, Ana Milanova
We present new operational semantics for serverless computing that model the event-driven relationships between serverless functions, as well as their interaction with platform services such as databases and object stores. These semantics precisely encapsulate how control transfers between functions, both directly and through reads and writes to platform services. We use these semantics to define the notion of the service call graph for serverless applications that captures program flows through functions and services. Finally, we construct service call graphs for twelve serverless JavaScript applications, using a prototype of our call graph construction algorithm, and we evaluate their accuracy.
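One way to picture a service call graph is as a directed graph whose nodes are both functions and platform services. The sketch below assumes a hypothetical application description in which each function lists the functions it invokes directly, the services it writes to, and the services whose events trigger it. The description format, function names, and example application are all made up for illustration and are not taken from the paper's prototype.

```python
from collections import defaultdict, deque

def build_service_call_graph(functions):
    """Build edges fn -> fn (direct call), fn -> svc (write), svc -> fn (trigger)."""
    graph = defaultdict(set)
    for name, spec in functions.items():
        fn = ("fn", name)
        _ = graph[fn]  # make sure every function appears as a node
        for callee in spec.get("invokes", []):
            graph[fn].add(("fn", callee))
        for svc in spec.get("writes", []):
            graph[fn].add(("svc", svc))
        for svc in spec.get("triggered_by", []):
            graph[("svc", svc)].add(fn)
    return dict(graph)

def reachable(graph, start):
    """Return every node reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical image-processing application.
app = {
    "upload": {"writes": ["images"]},
    "resize": {"triggered_by": ["images"], "writes": ["thumbs"]},
    "notify": {"triggered_by": ["thumbs"], "invokes": ["audit"]},
    "audit": {},
}
graph = build_service_call_graph(app)
flow = reachable(graph, ("fn", "upload"))
```

In this example `upload` never calls `resize` directly, yet `resize`, `notify`, and `audit` are all reachable from it because control passes through the `images` and `thumbs` services. A call graph restricted to direct function calls would miss those flows.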