DCDec 1, 2025
Tangram: Accelerating Serverless LLM Loading through GPU Memory Reuse and AffinityWenbin Zhu, Zhaoyan Shen, Zili Shao et al.
Serverless Large Language Models (LLMs) have emerged as a cost-effective solution for deploying AI services by enabling a 'pay-as-you-go' pricing model through GPU resource sharing. However, cold-start latency, especially the model loading phase, has become a critical performance bottleneck, as it scales linearly with model size and severely limits the practical deployment of large-scale LLM services. This paper presents Tangram, a novel system that accelerates Serverless LLM loading through efficient GPU memory reuse. By leveraging the unused GPU memory to retain model parameters, Tangram significantly reduces model transfer time and cold-start latency. Its design includes three key components: unified GPU memory pool for tensor-level parameter sharing across models, on-demand KV cache allocation for dynamic memory management, and GPU-affinity-aware scheduling for maximizing resource utilization. These techniques collectively address the critical challenges of inefficient memory usage and the cold-start problem in Serverless LLM platforms. We have implemented a fully functional prototype, and experiments show that Tangram achieves up to 6.2 times faster loading and reduces Time-To-First-Token (TTFT) during cold-start by 23--55% over state-of-the-art methods.
CRNov 25, 2021
A Survey of Blockchain Data Management SystemsQian Wei, Bingzhe Li, Wanli Chang et al.
Blockchain has been widely deployed in various sectors, such as finance, education, and public services. Since blockchain runs as an immutable distributed ledger, it has decentralized mechanisms with persistency, anonymity, and auditability, where transactions are jointly performed through cryptocurrency-based consensus algorithms by worldwide distributed nodes. There have been many survey papers reviewing the blockchain technologies from different perspectives, e.g., digital currencies, consensus algorithms, and smart contracts. However, none of them have focused on the blockchain data management systems. To fill in this gap, we have conducted a comprehensive survey on the data management systems, based on three typical types of blockchain, i.e., standard blockchain, hybrid blockchain, and DAG (Directed Acyclic Graph)-based blockchain. We categorize their data management mechanisms into three layers: blockchain architecture, blockchain data structure, and blockchain storage engine, where block architecture indicates how to record transactions on a distributed ledger, blockchain data structure refers to the internal structure of each block, and blockchain storage engine specifies the storage form of data on the blockchain system. For each layer, the works advancing the state-of-the-art are discussed together with technical challenges. Furthermore, we lay out the future research directions for the blockchain data management systems.