Raj Joshi

2.3 · DC · Sep 21, 2025

ShadowServe: Interference-Free KV Cache Fetching for Distributed Prefix Caching

Xingyu Xiang, Raj Joshi, Yuhan Liu et al.

Distributed prefix caching accelerates long-context LLM serving by reusing KV cache entries for common context prefixes. However, KV cache fetches can become a bottleneck when network bandwidth is limited. Compression mitigates the bandwidth issue, but can degrade overall performance when decompression interferes with model computation. We present ShadowServe, the first SmartNIC-accelerated, interference-free prefix caching system for LLM serving. ShadowServe separates a control plane on the host and a data plane fully offloaded to the SmartNIC, which eliminates interference to both host GPU and CPU. To overcome the SmartNIC's limited compute and memory resources, we design a chunked pipeline that parallelizes data plane operations across the SmartNIC's compute resources, and a minimal-copy memory management scheme that reduces memory pressure on the SmartNIC. Compared to state-of-the-art solutions, ShadowServe achieves up to 2.2x lower loaded time-per-output-token (TPOT), and reduces time-to-first-token (TTFT) by up to 1.38x in low-bandwidth scenarios (<= 20 Gbps), translating to up to 1.35x higher throughput.
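The chunked-pipeline idea in the abstract can be pictured with a small host-side toy. This is only a sketch under stated assumptions, not ShadowServe's implementation: the real data plane runs on SmartNIC hardware, whereas this example uses Python threads, `zlib` as a stand-in compressor, and hypothetical names (`run_pipeline`, `CHUNK`, `depth`). It shows how splitting a transfer into chunks lets the fetch, decompress, and copy stages overlap, and how bounded queues cap the amount of data in flight.

```python
import queue
import threading
import zlib

CHUNK = 64 * 1024  # hypothetical chunk size, chosen only for illustration


def run_pipeline(compressed_chunks, chunk_size, depth=2):
    """Overlap fetch, decompress, and copy stages across chunks.

    `compressed_chunks` is a list of zlib-compressed blobs; every chunk
    except possibly the last decompresses to exactly `chunk_size` bytes.
    `depth` bounds each queue, limiting how many chunks are in flight.
    """
    fetched = queue.Queue(maxsize=depth)
    ready = queue.Queue(maxsize=depth)

    def fetch():
        # Stand-in for a network fetch: hand over one chunk at a time.
        for index, blob in enumerate(compressed_chunks):
            fetched.put((index, blob))
        fetched.put(None)  # end-of-stream marker

    def decompress():
        while (item := fetched.get()) is not None:
            index, blob = item
            ready.put((index, zlib.decompress(blob)))
        ready.put(None)

    workers = [threading.Thread(target=fetch),
               threading.Thread(target=decompress)]
    for worker in workers:
        worker.start()

    # Copy stage: place each decompressed chunk at its final offset.
    out = bytearray(chunk_size * len(compressed_chunks))
    total = 0
    while (item := ready.get()) is not None:
        index, data = item
        start = index * chunk_size
        out[start:start + len(data)] = data
        total += len(data)

    for worker in workers:
        worker.join()
    return bytes(out[:total])


# Usage: compress a payload chunk by chunk, then reassemble it.
payload = bytes(range(256)) * 1000
pieces = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)]
restored = run_pipeline([zlib.compress(p) for p in pieces], CHUNK)
```

While one chunk is being decompressed, the next can already be fetched and the previous one copied, so no stage waits for the whole transfer to finish.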

3.0 · CR · Mar 12, 2012

Estimating strength of DDoS attack using various regression models

B. B. Gupta, R. C. Joshi, Manoj Misra

Anomaly-based DDoS detection systems construct a profile of the traffic normally seen in the network and flag an anomaly whenever traffic deviates from that profile beyond a threshold. The extent of this deviation is normally not utilised. This paper reports the evaluation results of a proposed approach that utilises the extent of deviation from the detection threshold to estimate the strength of a DDoS attack using various regression models. A relationship is established between the number of zombies and the observed deviation in sample entropy. Various statistical performance measures, such as coefficient of determination (R²), coefficient of correlation (CC), sum of square error (SSE), mean square error (MSE), root mean square error (RMSE), normalised mean square error (NMSE), Nash-Sutcliffe efficiency index (η) and mean absolute error (MAE), are used to measure the performance of the regression models. Internet-type topologies used for simulation are generated using the transit-stub model of the GT-ITM topology generator. The NS-2 network simulator on a Linux platform is used as the simulation test bed for launching DDoS attacks of varied strength. A comparative study is performed using different regression models for estimating the strength of a DDoS attack. The simulation results are promising: the strength of a DDoS attack can be estimated efficiently, with a very low error rate, using various regression models.
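The performance measures named in the abstract can be illustrated with a short sketch. The abstract does not give the paper's exact formulas, so the definitions below are assumptions based on common conventions: R² is taken as the squared Pearson correlation, NMSE as MSE divided by the variance of the observations, and η as one minus SSE over the total sum of squares. The data points are synthetic placeholders, not results from the paper, and `fit_line` and `regression_metrics` are hypothetical helper names.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_metrics(y, y_hat):
    """Error measures between observed y and predicted y_hat.

    Assumed conventions: R2 = CC squared, NMSE = MSE / var(y),
    NSE (eta) = 1 - SSE / SST. Requires non-constant y and y_hat.
    """
    n = len(y)
    mean_y, mean_hat = sum(y) / n, sum(y_hat) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    ss_hat = sum((b - mean_hat) ** 2 for b in y_hat)
    cov = sum((a - mean_y) * (b - mean_hat) for a, b in zip(y, y_hat))
    cc = cov / math.sqrt(sst * ss_hat)
    mse = sse / n
    return {
        "R2": cc ** 2,
        "CC": cc,
        "SSE": sse,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "NMSE": mse / (sst / n),
        "NSE": 1 - sse / sst,
        "MAE": sum(abs(a - b) for a, b in zip(y, y_hat)) / n,
    }


# Synthetic example (made-up numbers): entropy deviation vs. zombie count.
deviation = [0.1, 0.2, 0.3, 0.4, 0.5]
zombies = [10, 22, 29, 41, 50]
slope, intercept = fit_line(deviation, zombies)
predicted = [slope * d + intercept for d in deviation]
metrics = regression_metrics(zombies, predicted)
```

For a least-squares straight-line fit evaluated on its own training data, R² and η coincide under these definitions; they can differ for other models or for held-out data.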

2 Papers