CR · Apr 17
Blueprint, Bootstrap, and Bridge: A Security Look at NVIDIA GPU Confidential Computing
Zhongshu Gu, Enriquillo Valdez, Salman Ahmed et al. · ibm-research
NVIDIA GPU Confidential Computing (GPU-CC) aims to provide secure execution for AI workloads. For end users, enabling GPU-CC is seamless and requires no modifications to existing applications. However, this ease of adoption relies on a proprietary and highly complex system that is difficult to inspect, creating challenges for researchers seeking to understand its architecture and security landscape. In this work, we provide a security look at GPU-CC by reconstructing a coherent view of the system. We first examine the system's blueprint, focusing on the specialized architectural engines that support its security mechanisms. We then analyze the bootstrap process, which coordinates hardware and software components to establish these protections. Finally, we conduct targeted experiments to assess whether, under the GPU-CC threat model, data transfers along different paths remain protected across the bridge between trusted CPU and GPU domains. We responsibly disclosed all security findings presented in this paper to the NVIDIA Product Security Incident Response Team (PSIRT).
ET · Mar 11
Reference Architecture of a Quantum-Centric Supercomputer
Seetharami Seelam, Jerry M. Chow, Antonio Córcoles et al.
Quantum computers have demonstrated utility in simulating quantum systems beyond brute-force classical approaches. As the community builds on these demonstrations to explore using quantum computing for applied research, algorithms and workflows have emerged that require leveraging both quantum computers and classical high-performance computing (HPC) systems to scale applications, especially in chemistry and materials, beyond what either system can simulate alone. Today, these disparate systems operate in isolation, forcing users to manually orchestrate workloads, coordinate job scheduling, and transfer data between systems -- a cumbersome process that hinders productivity and severely limits rapid algorithmic exploration. These challenges motivate the need for flexible and high-performance Quantum-Centric Supercomputing (QCSC) systems that integrate Quantum Processing Units (QPUs), Graphics Processing Units (GPUs), and Central Processing Units (CPUs) to accelerate discovery of such algorithms across applications. These systems will be co-designed across quantum and classical HPC infrastructure, middleware, and application layers to accelerate the adoption of quantum computing for solving critical computational problems. We envision QCSC evolution through three distinct phases: (1) quantum systems as specialized compute offload engines within existing HPC complexes; (2) heterogeneous quantum and classical HPC systems coupled through advanced middleware, enabling seamless execution of hybrid quantum-classical algorithms; and (3) fully co-designed heterogeneous quantum-HPC systems for hybrid computational workflows. This article presents a reference architecture and roadmap for these QCSC systems.
CR · May 19, 2021
Separation of Powers in Federated Learning
Pau-Chen Cheng, Kevin Eykholt, Zhongshu Gu et al.
Federated Learning (FL) enables collaborative training among mutually distrusting parties. Model updates, rather than training data, are concentrated and fused in a central aggregation server. A key security challenge in FL is that an untrustworthy or compromised aggregation process might lead to unforeseeable information leakage. This challenge is especially acute due to recently demonstrated attacks that have reconstructed large fractions of training data from ostensibly "sanitized" model updates. In this paper, we introduce TRUDA, a new cross-silo FL system, employing a trustworthy and decentralized aggregation architecture to break down information concentration with regard to a single aggregator. Based on the unique computational properties of model-fusion algorithms, all exchanged model updates in TRUDA are disassembled at the parameter-granularity and re-stitched to random partitions designated for multiple TEE-protected aggregators. Thus, each aggregator only has a fragmentary and shuffled view of model updates and is oblivious to the model architecture. Our new security mechanisms can fundamentally mitigate training reconstruction attacks, while still preserving the final accuracy of trained models and keeping performance overheads low.
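The parameter-granularity partitioning described in this abstract can be illustrated with a small sketch. It is not the TRUDA implementation: it assumes the fusion step is a plain coordinate-wise average, it leaves out TEE protection and attestation entirely, and the function names (`partition_indices`, `fuse_partition`, `decentralized_average`) and the shared `seed` are hypothetical. The point it shows is that a coordinate-wise fusion can be computed on disjoint, shuffled slices of the parameters, so no single aggregator needs the full update.

```python
import random


def partition_indices(num_params, num_aggregators, seed):
    """Shuffle parameter indices and deal them into one partition per aggregator."""
    rng = random.Random(seed)  # hypothetical shared seed agreed on by the parties
    indices = list(range(num_params))
    rng.shuffle(indices)
    return [indices[k::num_aggregators] for k in range(num_aggregators)]


def fuse_partition(updates, partition):
    """Work done by one aggregator: average only its own parameters across parties."""
    return [sum(update[i] for update in updates) / len(updates) for i in partition]


def decentralized_average(updates, num_aggregators, seed=0):
    """Fuse flattened updates through several aggregators, then re-stitch the result."""
    num_params = len(updates[0])
    partitions = partition_indices(num_params, num_aggregators, seed)
    fused = [0.0] * num_params
    for partition in partitions:
        # Each aggregator sees only a shuffled fragment of every update.
        values = fuse_partition(updates, partition)
        for index, value in zip(partition, values):
            fused[index] = value
    return fused


# Two parties, each contributing a flattened four-parameter update.
updates = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
result = decentralized_average(updates, num_aggregators=2)
```

Because averaging acts on each parameter independently, the re-stitched result matches what a single central aggregator would compute, while each aggregator in the sketch handles only its own fragment.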
CR · Dec 7, 2018
Reaching Data Confidentiality and Model Accountability on the CalTrain
Zhongshu Gu, Hani Jamjoom, Dong Su et al.
Distributed collaborative learning (DCL) paradigms enable building joint machine learning models from distrusting multi-party participants. Data confidentiality is guaranteed by retaining private training data on each participant's local infrastructure. However, this approach to achieving data confidentiality makes today's DCL designs fundamentally vulnerable to data poisoning and backdoor attacks. It also limits DCL's model accountability, which is key to backtracking the responsible "bad" training data instances/contributors. In this paper, we introduce CALTRAIN, a Trusted Execution Environment (TEE) based centralized multi-party collaborative learning system that simultaneously achieves data confidentiality and model accountability. CALTRAIN enforces isolated computation on centrally aggregated training data to guarantee data confidentiality. To support building accountable learning models, we securely maintain the links between training instances and their corresponding contributors. Our evaluation shows that the models generated from CALTRAIN can achieve the same prediction accuracy when compared to the models trained in non-protected environments. We also demonstrate that when malicious training participants tend to implant backdoors during model training, CALTRAIN can accurately and precisely discover the poisoned and mislabeled training data that lead to the runtime mispredictions.
CR · Jul 3, 2018
Confidential Inference via Ternary Model Partitioning
Zhongshu Gu, Heqing Huang, Jialong Zhang et al.
Today's cloud vendors are competing to provide various offerings to simplify and accelerate AI service deployment. However, cloud users always have concerns about the confidentiality of their runtime data, which are supposed to be processed on third-party's compute infrastructures. Information disclosure of user-supplied data may jeopardize users' privacy and breach increasingly stringent data protection regulations. In this paper, we systematically investigate the life cycles of inference inputs in deep learning image classification pipelines and understand how the information could be leaked. Based on the discovered insights, we develop a Ternary Model Partitioning mechanism and bring trusted execution environments to mitigate the identified information leakages. Our research prototype consists of two co-operative components: (1) Model Assessment Framework, a local model evaluation and partitioning tool that assists cloud users in deployment preparation; (2) Infenclave, an enclave-based model serving system for online confidential inference in the cloud. We have conducted comprehensive security and performance evaluation on three representative ImageNet-level deep learning models with different network depths and architectural complexity. Our results demonstrate the feasibility of launching confidential inference services in the cloud with maximized confidentiality guarantees and low performance costs.