Shunfan Zhou

CR
h-index7
6papers
15citations
Novelty43%
AI Score52

6 Papers

CRJun 3Code
Implement Kubernetes Pod-Level Remote Attestation for Confidential Workloads on dstack

Yang Yang, Kevin Wang, Yuanhai Luo et al.

The rise of LLM-as-a-Service and other confidential cloud workloads demands cryptographic proof that user data is processed in a trusted, untampered environment. Existing solutions, notably Confidential Containers (CoCo), enforce a strict "one Pod per VM" model that attests only the Guest OS stack, leaving container-level identity unverified and incurring prohibitive per-VM resource overhead. We present dstack-capsule, a Kubernetes platform that enables Pod-level remote attestation on Intel TDX by allowing multiple Pods to share a single Confidential VM while each retains independent, hardware-backed proof of identity. Our key insight is a two-layer attestation architecture: static platform measurements are frozen in RTMR[3] via an irreversible privilege fuse, while dynamic Pod identities (pod_uid, pod_spec_hash, workload_id) are embedded in the TDX Quote's report_data field and signed by hardware on every request. dstack-capsule introduces (1) a Pod-level attestation protocol binding Pod spec digests to hardware-signed Quotes; (2) a privilege fuse mechanism that atomically transitions a node from setup mode to secure mode; (3) a multi-layer sandbox spanning storage, runtime, admission, API, and network isolation layers; and (4) a complete open-source implementation based on Kubernetes 1.32, Intel TDX, and Sysbox. We evaluate the security properties, attestation correctness, and performance characteristics of dstack-capsule, demonstrating that it achieves Pod-granularity verification without the resource overhead of per-VM isolation.

DCSep 6, 2024
Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study

Jianwei Zhu, Hang Yin, Peng Deng et al.

This report evaluates the performance impact of enabling Trusted Execution Environments (TEE) on NVIDIA Hopper GPUs for large language model (LLM) inference tasks. We benchmark the overhead introduced by TEE mode across various LLMs and token lengths, with a particular focus on the bottleneck caused by CPU-GPU data transfers via PCIe. Our results indicate that while there is minimal computational overhead within the GPU, the overall performance penalty is primarily attributable to data transfer. For the majority of typical LLM queries, the overhead remains below 7%, with larger models and longer sequences experiencing nearly zero overhead.

CRApr 15
Persistent BitTorrent Trackers

François-Xavier Wicht, Zhengwei Tong, Shunfan Zhou et al.

Private BitTorrent trackers enforce upload-to-download ratios to prevent free-riding, but suffer from three critical weaknesses: reputation cannot move between trackers, centralized servers create single points of failure, and upload statistics are self-reported and unverifiable. When a tracker shuts down, users lose their contribution history and cannot prove their standing to new communities. We address these problems by storing reputation in smart contracts and replacing self-reports with cryptographic attestations. Peers sign receipts for received pieces; the tracker aggregates them via BLS signatures and updates reputation. If a tracker is unavailable, peers fall back to an authenticated distributed hash table (DHT): stored reputation acts as a public key infrastructure (PKI), preserving access control without the tracker. Reputation is portable across tracker failures through single-hop migration in factory-deployed contracts. We also address the privacy implications of publishing public keys and reputations tied to private trackers on a public ledger: we propose ephemeral session keys to prevent linking peer identities, zero-knowledge membership proofs for anonymous DHT participation, and confidential reputation using homomorphic commitments. We formalize the security requirements, prove four security properties under standard cryptographic assumptions, and evaluate a prototype. Measurements show that transfer receipts add less than 5\% end-to-end overhead with typical piece sizes. To minimize signing overhead, we adopt a hybrid signature scheme: ECDSA signs individual piece receipts at transfer time for low per-operation latency, while BLS serves as the overarching scheme, enabling compact aggregation of many receipts into a single proof at report time. This design reduces client-side signing cost by an order of magnitude compared to using BLS throughout.

CLMay 11
LegalCiteBench: Evaluating Citation Reliability in Legal Language Models

Sijia Chen, Hang Yin, Shunfan Zhou

Large language models (LLMs) are increasingly integrated into legal drafting and research workflows, where incorrect citations or fabricated precedents can cause serious professional harm. Existing legal benchmarks largely emphasize statutory reasoning, contract understanding, or general legal question answering, but they do not directly study a central common-law failure mode: when asked to provide case authorities without external grounding, models may return plausible-looking but incorrect citations or cases. We introduce LegalCiteBench, a benchmark for studying closed-book citation recovery, citation verification, and case matching in legal language models. LegalCiteBench contains approximately 24K evaluation instances constructed from 1,000 real U.S. judicial opinions from the Case Law Access Project. The benchmark covers five citation-centric tasks: citation retrieval, citation completion, citation error detection, case matching, and case verification and correction. Across 21 LLMs, exact citation recovery remains highly challenging in this closed-book setting: even the strongest models score below 7/100 on citation retrieval and completion. Within the evaluated models, scale and legal-domain pretraining provide limited gains and do not resolve this difficulty. Models also frequently provide concrete but incorrect or low-overlap authorities under our evaluation protocol, with Misleading Answer Rates (MAR) exceeding 94% for 20 of 21 evaluated models on retrieval-heavy tasks. A prompt-only abstention experiment shows that explicit uncertainty instructions reduce some confident fabrication but do not improve citation correctness. LegalCiteBench is intended as a diagnostic framework for studying authority generation failures, verification behavior, and abstention when external grounding is absent, incomplete, or bypassed.

CRApr 24
PASS: A Provenanced Access Subaccount System for Blockchain Wallets

Jay Yu, Shunfan Zhou, Hang Yin et al.

Blockchain wallets conventionally follow an ownership model where possession of a private key grants unilateral control. However, this assumption is brittle for emerging settings such as AI agent wallets, organizational custody, and enterprise payroll, where multiple actors must coordinate without exposing secrets or leaking internal activity. We present PASS, a Provenanced Access Subaccount System that replaces role-based or identity-based control with provenance-based control: assets can only be used by subaccounts that can trace custody back to a valid deposit. A simple Inbox-Outbox mechanism ensures all external actions have verifiable lineage, while internal transfers remain private and indistinguishable from ordinary EOAs. We formalize PASS in Lean 4 and prove core invariants, including privacy of internal transfers, asset accessibility, and provenance integrity. We implement a prototype with enclave backends on AWS Nitro Enclaves and dstack Intel TDX, integrate with WalletConnect, and benchmark throughput across wallet operations. These results show that provenance-based wallets are both implementable and efficient. PASS bridges today's gap between strict self-custody and flexible shared access, advancing the design space for practical, privacy-preserving custody.

CRSep 15, 2025
Dstack: A Zero Trust Framework for Confidential Containers

Shunfan Zhou, Kevin Wang, Hang Yin

Web3 applications require execution platforms that maintain confidentiality and integrity without relying on centralized trust authorities. While Trusted Execution Environments (TEEs) offer promising capabilities for confidential computing, current implementations face significant limitations when applied to Web3 contexts, particularly in security reliability, censorship resistance, and vendor independence. This paper presents dstack, a comprehensive framework that transforms raw TEE technology into a true Zero Trust platform. We introduce three key innovations: (1) Portable Confidential Containers that enable seamless workload migration across heterogeneous TEE environments while maintaining security guarantees, (2) Decentralized Code Management that leverages smart contracts for transparent governance of TEE applications, and (3) Verifiable Domain Management that ensures secure and verifiable application identity without centralized authorities. These innovations are implemented through three core components: dstack-OS, dstack-KMS, and dstack-Gateway. Together, they demonstrate how to achieve both the performance advantages of VM-level TEE solutions and the trustless guarantees required by Web3 applications. Our evaluation shows that dstack provides comprehensive security guarantees while maintaining practical usability for real-world applications.