Jiangchuan Liu

MM
h-index: 70
28 papers
1,572 citations
Novelty: 49%
AI Score: 57

28 Papers

CV · Apr 2 · Code
TrackerSplat: Exploiting Point Tracking for Fast and Robust Dynamic 3D Gaussians Reconstruction

Daheng Yin, Isaac Ding, Yili Jin et al.

Recent advancements in 3D Gaussian Splatting (3DGS) have demonstrated its potential for efficient and photorealistic 3D reconstructions, which is crucial for diverse applications such as robotics and immersive media. However, current Gaussian-based methods for dynamic scene reconstruction struggle with large inter-frame displacements, leading to artifacts and temporal inconsistencies under fast object motions. To address this, we introduce TrackerSplat, a novel method that integrates advanced point tracking methods to enhance the robustness and scalability of 3DGS for dynamic scene reconstruction. TrackerSplat utilizes off-the-shelf point tracking models to extract pixel trajectories and triangulate per-view pixel trajectories onto 3D Gaussians to guide the relocation, rotation, and scaling of Gaussians before training. This strategy effectively handles large displacements between frames, dramatically reducing the fading and recoloring artifacts prevalent in prior methods. By accurately positioning Gaussians prior to gradient-based optimization, TrackerSplat overcomes the quality degradation associated with large frame gaps when processing multiple adjacent frames in parallel across multiple devices, thereby boosting reconstruction throughput while preserving rendering quality. Experiments on real-world datasets confirm the robustness of TrackerSplat in challenging scenarios with significant displacements, achieving superior throughput under parallel settings and maintaining visual quality compared to baselines. The code is available at https://github.com/yindaheng98/TrackerSplat.

MM · Aug 12, 2024
Palantir: Towards Efficient Super Resolution for Ultra-high-definition Live Streaming

Xinqi Jin, Zhui Zhu, Xikai Sun et al.

Neural enhancement through super-resolution (SR) deep neural networks (DNNs) opens up new possibilities for ultra-high-definition (UHD) live streaming over existing encoding and networking infrastructure. Yet, the heavy SR DNN inference overhead leads to severe deployment challenges. To reduce the overhead, existing systems propose to apply DNN-based SR only on carefully selected anchor frames while upscaling non-anchor frames via the lightweight reusing-based SR approach. However, frame-level scheduling is coarse-grained and fails to deliver optimal efficiency. In this work, we propose Palantir, the first neural-enhanced UHD live streaming system with fine-grained patch-level scheduling. Two novel techniques are incorporated into Palantir to select the most beneficial anchor patches and support latency-sensitive UHD live streaming applications. Firstly, under the guidance of our pioneering and theoretical analysis, Palantir constructs a directed acyclic graph (DAG) for lightweight yet accurate SR quality estimation under any possible anchor patch set. Secondly, to further optimize the scheduling latency, Palantir improves parallelizability by refactoring the computation subprocedure of the estimation process into a sparse matrix-matrix multiplication operation. The evaluation results suggest that Palantir incurs a negligible scheduling latency accounting for less than 5.7% of the end-to-end latency requirement. When compared to the naive method of applying DNN-based SR on all the frames, Palantir can reduce the SR DNN inference overhead by 20 times (or 60 times) while preserving 54.0-82.6% (or 32.8-64.0%) of the quality gain. When compared to the state-of-the-art real-time frame-level scheduling strategy, Palantir can reduce the SR DNN inference overhead by 80.1% at most (and 38.4% on average) without sacrificing the video quality.

NI · Jan 29
ViTMAlis: Towards Latency-Critical Mobile Video Analytics with Vision Transformers

Miao Zhang, Guanzhen Wu, Hao Fang et al.

Edge-assisted mobile video analytics (MVA) applications are increasingly shifting from using vision models based on convolutional neural networks (CNNs) to those built on vision transformers (ViTs) to leverage their superior global context modeling and generalization capabilities. However, deploying these advanced models in latency-critical MVA scenarios presents significant challenges. Unlike traditional CNN-based offloading paradigms where network transmission is the primary bottleneck, ViT-based systems are constrained by substantial inference delays, particularly for dense prediction tasks where the need for high-resolution inputs exacerbates the inherent quadratic computational complexity of ViTs. To address these challenges, we propose a dynamic mixed-resolution inference strategy tailored for ViT-backboned dense prediction models, enabling flexible runtime trade-offs between speed and accuracy. Building on this, we introduce ViTMAlis, a ViT-native device-to-edge offloading framework that dynamically adapts to network conditions and video content to jointly reduce transmission and inference delays. We implement a fully functional prototype of ViTMAlis on commodity mobile and edge devices. Extensive experiments demonstrate that, compared to state-of-the-art accuracy-centric, content-aware, and latency-adaptive baselines, ViTMAlis significantly reduces end-to-end offloading latency while improving user-perceived rendering accuracy, providing a practical foundation for next-generation mobile intelligence.

LG · Aug 16, 2024
The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy

Jiating Ma, Yipeng Zhou, Qi Li et al.

To preserve data privacy, the federated learning (FL) paradigm has emerged, in which clients expose only model gradients rather than original data for model training. To enhance the protection of model gradients in FL, differentially private federated learning (DPFL) has been proposed, which adds differentially private (DP) noise to obfuscate gradients before they are exposed. Yet an essential but largely overlooked problem in DPFL is the heterogeneity of clients' privacy requirements, which can vary significantly between clients and greatly complicates the client selection problem in DPFL. In other words, both the data quality and the influence of DP noise should be taken into account when selecting clients. To address this problem, we conduct a convergence analysis of DPFL under heterogeneous privacy, a generic client selection strategy, popular DP mechanisms, and convex loss. Based on the convergence analysis, we formulate the client selection problem to minimize the value of the loss function in DPFL with heterogeneous privacy, which is a convex optimization problem and can be solved efficiently. Accordingly, we propose the DPFL-BCS (biased client selection) algorithm. Extensive experimental results with real datasets under both convex and non-convex loss functions indicate that DPFL-BCS can remarkably improve model utility compared with the SOTA baselines.
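
To make the idea of obfuscating gradients with DP noise more concrete, here is a minimal sketch of one common recipe: clip the gradient to a fixed L2 norm, then add Gaussian noise scaled to that norm. The abstract only says "popular DP mechanisms", so the Gaussian mechanism, the function names `clip` and `privatize`, and the noise multipliers are assumptions made for illustration, not the DPFL-BCS algorithm itself.

```python
import math
import random


def clip(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize(grad, max_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * max_norm.

    Assumption: a Gaussian-mechanism style perturbation; the listed paper
    does not state which mechanism or parameters it uses.
    """
    clipped = clip(grad, max_norm)
    sigma = noise_multiplier * max_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


rng = random.Random(0)
grad = [3.0, 4.0]                   # L2 norm is 5
clipped = clip(grad, max_norm=1.0)  # rescaled to L2 norm 1

# Two clients with different (hypothetical) privacy requirements:
# a larger noise multiplier means stronger privacy and a noisier gradient.
strict_client = privatize(grad, 1.0, noise_multiplier=2.0, rng=rng)
lenient_client = privatize(grad, 1.0, noise_multiplier=0.1, rng=rng)
```

Because the noise scale differs per client, a server choosing whom to sample faces the trade-off the abstract describes: a client with good data may still contribute a very noisy gradient.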

MM · Mar 19
Rethink Web Service Resilience in Space: A Radiation-Aware and Sustainable Transmission Solution

Long Chen, Hao Fang, Yi Ching Chou et al.

Low Earth Orbit (LEO) satellite networks such as Starlink and Project Kuiper are increasingly integrated with cloud infrastructures, forming an important internet backbone for global web services. By extending connectivity to remote regions, oceans, and disaster zones, these networks enable reliable access to applications ranging from real-time WebRTC communication to emergency response portals. Yet the resilience of these web services is threatened by space radiation: it degrades hardware, drains batteries, and disrupts continuity, even if the space-cloud integrated providers use machine learning to analyze space weather and radiation data. Specifically, conventional fixes like altitude adjustments and thermal annealing consume energy; neglecting this energy use results in deep discharge and faster battery aging, whereas sleep modes risk abrupt web session interruptions. Efficient network-layer mitigation remains a critical gap. We propose RALT (Radiation-Aware LEO Transmission), a control-plane solution that dynamically reroutes traffic during radiation events, accounting for energy constraints to minimize battery degradation and sustain service performance. Our work shows that unlocking space-based web services' full potential for global reliable connectivity requires rethinking resilience through the lens of the space environment itself.

AI · May 15
Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents

Jiaxing Li, Hao Fang, Chi Xu et al.

Rapid biodiversity loss underscores the urgency of effective monitoring, yet manual surveys remain resource-intensive. While on-device AI offers a scalable alternative, its performance in the wild is often challenged by environmental variability. Current methods rely heavily on cloud resources, which requires continuous uploading of field data for model retraining. This approach is unsuitable for remote deployments because it consumes limited power and network connectivity. To address these constraints, this research proposes a shift from model adaptation to knowledge adaptation. We introduce an architecture that separates visual perception from reasoning, combining a visual encoder with a dynamic knowledge base. We use an explicit knowledge base instead of implicitly encoding expert knowledge into model parameters. This method also supports knowledge sustainability by preserving expert insights in a structured form. Through cross-disciplinary collaboration with biologists and Indigenous communities, this work advances ethical AI co-development, fostering responsible and culturally informed ecosystem management.

MM · Aug 22, 2025 · Code
Beyond Interpretability: Exploring the Comprehensibility of Adaptive Video Streaming through Large Language Models

Lianchen Jia, Chaoyang Li, Ziqi Yuan et al.

Over the past decade, adaptive video streaming technology has witnessed significant advancements, particularly driven by the rapid evolution of deep learning techniques. However, the black-box nature of deep learning algorithms presents challenges for developers in understanding decision-making processes and optimizing for specific application scenarios. Although existing research has enhanced algorithm interpretability through decision tree conversion, interpretability does not directly equate to developers' subjective comprehensibility. To address this challenge, we introduce ComTree, the first bitrate adaptation algorithm generation framework that considers comprehensibility. The framework initially generates the complete set of decision trees that meet performance requirements, then leverages large language models to evaluate these trees for developer comprehensibility, ultimately selecting solutions that best facilitate human understanding and enhancement. Experimental results demonstrate that ComTree significantly improves comprehensibility while maintaining competitive performance, showing potential for further advancement. The source code is available at https://github.com/thu-media/ComTree.

SY · May 17, 2024
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

Hao Zhou, Chengming Hu, Ye Yuan et al.

Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks based on human instructions, paving the way to artificial general intelligence (AGI)-enabled 6G. Given the great potential of LLM technologies, this work aims to provide a comprehensive overview of LLM-enabled telecom networks. In particular, we first present LLM fundamentals, including model architecture, pre-training, fine-tuning, inference and utilization, model evaluation, and telecom deployment. Then, we introduce LLM-enabled key techniques and telecom applications in terms of generation, classification, optimization, and prediction problems. Specifically, the LLM-enabled generation applications include telecom domain knowledge, code, and network configuration generation. After that, the LLM-based classification applications involve network security, text, image, and traffic classification problems. Moreover, multiple LLM-enabled optimization techniques are introduced, such as automated reward function design for reinforcement learning and verbal reinforcement learning. Furthermore, for LLM-aided prediction problems, we discuss time-series prediction models and multi-modality prediction problems for telecom. Finally, we highlight the challenges and identify the future directions of LLM-enabled telecom networks.

GR · May 10
CAGS: Color-Adaptive Volumetric Video Streaming with Dynamic 3D Gaussian Splatting

Daheng Yin, Yili Jin, Jianxin Shi et al.

Volumetric video (VV) streaming enables real-time, immersive access to remote 3D environments, powering telepresence, ecological monitoring, and robotic teleoperation. These applications turn VV streaming into a real-time interface to remote physical environments, imposing new system-level demands for photorealistic scene representation, low-latency interaction, and robust performance under heterogeneous networks. 3D Gaussian Splatting (3DGS) has been widely used for real-time photorealistic rendering, offering superior visual quality and rendering performance, but it faces challenges due to bandwidth consumption. Furthermore, as the foundation of adaptive VV streaming, existing Levels of Detail (LoD) methods based on density are not well-suited to Gaussian representations, leading to visible gaps and severe quality degradation. Recent studies have also explored attribute compression techniques to reduce bandwidth consumption. Our preliminary studies reveal that aggressive attribute compression primarily causes color distortion, which can be effectively corrected in the rendered image using a reference image. Motivated by these findings, we propose a novel Color-Adaptive scheme for adaptive VV streaming that uses vector quantization (VQ) to establish LoDs and correct color distortions with low-resolution reference images. We further present CAGS, an adaptive VV streaming system compatible with diverse Gaussian representations, which integrates the Color-Adaptive scheme by rendering reference images on the streaming server and performing color restoration on the client. Extensive experiments on our prototype system demonstrate that CAGS outperforms the existing adaptive streaming systems in PSNR by 5 to 20 dB under fluctuating bandwidth, operates significantly faster than existing scalable Gaussian compression methods, and generalizes across different Gaussian representations.

AI · Oct 21, 2025 · Code
Crucible: Quantifying the Potential of Control Algorithms through LLM Agents

Lianchen Jia, Chaoyang Li, Qian Houde et al.

Control algorithms in production environments typically require domain experts to tune their parameters and logic for specific scenarios. However, existing research predominantly focuses on algorithmic performance under ideal or default configurations, overlooking the critical aspect of Tuning Potential. To bridge this gap, we introduce Crucible, an agent that employs an LLM-driven, multi-level expert simulation to tune algorithms and defines a formalized metric to quantitatively evaluate their Tuning Potential. We demonstrate Crucible's effectiveness across a wide spectrum of case studies, from classic control tasks to complex computer systems, and validate its findings in a real-world deployment. Our experimental results reveal that Crucible systematically quantifies the tunable space across different algorithms. Furthermore, Crucible provides a new dimension for algorithm analysis and design, which ultimately leads to performance improvements. Our code is available at https://github.com/thu-media/Crucible.

NI · May 4
Renewables Power the Orbit? Achieving Sustainable Space Edge Computing via QoS-Aware Offloading

Xiaoyi Fan, Yi Ching Chou, Hao Fang et al.

Low-Earth-Orbit (LEO) satellite constellations are becoming integral to 6G infrastructure, but increasing in-orbit computation accelerates battery degradation and raises sustainability concerns. Meanwhile, renewable-heavy regions worldwide experience persistent energy curtailment due to transmission bottlenecks, leaving substantial clean energy stranded near generation sites. We identify a satellite-grid co-design opportunity: adaptively offloading task-critical data from satellite to data centers co-located with renewable power plants. However, realizing this vision requires jointly considering intermittent and capacity-limited communication windows, as well as time-varying electricity budgets. In this paper, we propose SQSO, a Sustainable and QoS-aware Satellite Offloading framework that models per-interval task offloading as a constrained optimization over dynamic topology and electricity prices. Under this framework, we design AO², an adaptive offloading orchestration algorithm to solve the formulated optimization problem. Using Starlink-scale simulations and real-world electricity price traces, AO² reduces energy consumption by up to 76.03% and battery life consumption by up to 76.85% compared to state-of-the-art schemes, while also lowering task delay. This work highlights that sustainable scaling of LEO constellations requires co-design of space networking and renewable energy infrastructure, while our solution promotes renewable-aware task offloading and cross-domain collaboration for space-energy integration in the 6G era.

CL · May 24, 2024
SCALM: Towards Semantic Caching for Automated Chat Services with Large Language Models

Jiaxing Li, Chi Xu, Feng Wang et al.

Large Language Models (LLMs) have become increasingly popular, transforming a wide range of applications across various domains. However, the real-world effectiveness of their query cache systems has not been thoroughly investigated. In this work, we conduct the first analysis of real-world human-to-LLM interaction data, identifying key challenges in existing caching solutions for LLM-based chat services. Our findings reveal that current caching methods fail to leverage semantic connections, leading to inefficient cache performance and extra token costs. To address these issues, we propose SCALM, a new cache architecture that emphasizes semantic analysis and identifies significant cache entries and patterns. We also detail the implementations of the corresponding cache storage and eviction strategies. Our evaluations show that SCALM increases cache hit ratios and reduces operational costs for LLMChat services. Compared with other state-of-the-art solutions in GPTCache, SCALM shows, on average, a relative increase of 63% in cache hit ratio and a relative improvement of 77% in token savings.
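
As a rough illustration of the semantic caching idea in this abstract, the sketch below matches a new query against cached queries by cosine similarity rather than exact string equality. It is not SCALM's implementation: the bag-of-words `embed` function, the `SemanticCache` class, and the 0.6 threshold are hypothetical stand-ins chosen to keep the example self-contained, and a real system would likely use learned sentence embeddings.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache that returns the response of the most similar query."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, query):
        q = embed(query)
        best_score, best_response = 0.0, None
        for emb, response in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_response = score, response
        # Serve from cache only when the closest entry is similar enough.
        return best_response if best_score >= self.threshold else None

    def put(self, query, response):
        self.entries.append((embed(query), response))


cache = SemanticCache()
cache.put("how do I reset my password", "Use the reset link.")
hit = cache.get("how do i reset my password please")  # near-duplicate wording
miss = cache.get("what is the weather today")         # unrelated query
```

An exact-match cache would miss the first lookup because of the extra word, which is the kind of lost hit (and extra token cost) the abstract attributes to caches that ignore semantic connections.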

NI · Aug 19, 2025
OmniSense: Towards Edge-Assisted Online Analytics for 360-Degree Videos

Miao Zhang, Yifei Zhu, Linfeng Shen et al.

With the reduced hardware costs of omnidirectional cameras and the proliferation of various extended reality applications, more and more 360° videos are being captured. To fully unleash their potential, advanced video analytics is expected to extract actionable insights and situational knowledge without blind spots from the videos. In this paper, we present OmniSense, a novel edge-assisted framework for online immersive video analytics. OmniSense achieves both low latency and high accuracy, combating the significant computation and network resource challenges of analyzing 360° videos. Motivated by our measurement insights into 360° videos, OmniSense introduces a lightweight spherical region of interest (SRoI) prediction algorithm to prune redundant information in 360° frames. Incorporating the video content and network dynamics, it then smartly scales vision models to analyze the predicted SRoIs with optimized resource utilization. We implement a prototype of OmniSense with commodity devices and evaluate it on diverse real-world collected 360° videos. Extensive evaluation results show that compared to resource-agnostic baselines, it improves the accuracy by 19.8% to 114.6% with similar end-to-end latencies. Meanwhile, it hits 2.0× to 2.4× speedups while keeping the accuracy on par with the highest accuracy of baselines.

DC · Oct 16, 2024
Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges

Handi Chen, Weipeng Deng, Shuo Yang et al.

Edge Intelligence (EI) has been instrumental in delivering real-time, localized services by leveraging the computational capabilities of edge networks. The integration of Large Language Models (LLMs) empowers EI to evolve into the next stage: Edge General Intelligence (EGI), enabling more adaptive and versatile applications that require advanced understanding and reasoning capabilities. However, systematic exploration in this area remains insufficient. This survey delineates the distinctions between EGI and traditional EI, categorizing LLM-empowered EGI into three conceptual systems: centralized, hybrid, and decentralized. For each system, we detail the framework designs and review existing implementations. Furthermore, we evaluate the performance and throughput of various Small Language Models (SLMs) that are more suitable for development on edge devices. This survey provides researchers with a comprehensive vision of EGI, offering insights into its vast potential and establishing a foundation for future advancements in this rapidly evolving field.

LG · Feb 6, 2024
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes

Xiaoxin Su, Yipeng Zhou, Laizhong Cui et al.

In the Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds, without touching private data owned by individual clients. FL is appealing in preserving data privacy; yet the communication between the PS and scattered clients can be a severe bottleneck. Model compression algorithms, such as quantization and sparsification, have been suggested, but they generally assume a fixed code length, which does not reflect the heterogeneity and variability of model updates. In this paper, through both analysis and experiments, we show strong evidence that a variable code length is beneficial for compression in FL. We accordingly present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates. We develop an optimal tuning strategy that minimizes the loss function (equivalent to maximizing the model utility) subject to the budget for communication. We further demonstrate that Fed-CVLC is indeed a general compression design that bridges quantization and sparsification, with greater flexibility. Extensive experiments have been conducted with public datasets to demonstrate that Fed-CVLC remarkably outperforms state-of-the-art baselines, improving model utility by 1.50%-5.44%, or shrinking communication traffic by 16.67%-41.61%.

NI · Apr 6
OrbitTransit: Traffic Delivery and Diffusion for Earth Observation via Satellite Mobility

Haoyuan Zhao, Long Chen, Yi Ching Chou et al.

The emerging demand for Earth observation (EO) to address environmental challenges has driven unprecedented growth in its primary carrier, Low Earth Orbit satellites, in recent years. Ground stations (GSs), the egress points of these networks, are congested due to the massive volume of EO traffic, and their deployment is constrained by geographic, political, and budgetary factors. Although inter-satellite links (ISLs) can partially relieve this congestion by forwarding traffic to alternative GSs, existing ISL-based approaches can hardly address traffic contention caused by biased GS distribution and may also raise sustainability concerns due to prolonged ISL paths. In this paper, we propose OrbitTransit, a pickup-carry-offload (PCO) approach that leverages satellite mobility for data delivery and integrates ISLs for traffic diffusion to alleviate the resource contention inherent in PCO delivery. The proposed orbit-as-node framework and contention-avoidant delivery jointly determine the optimal hybrid PCO-ISL path, minimizing energy consumption and balancing GS traffic. Extensive experiments show that OrbitTransit reduces battery consumption by 47.16%, decreases task failures by 1.09×, and improves GS load balancing compared with state-of-the-art GS selection and routing algorithms.

CV · Dec 14, 2025
StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding

Xinqi Jin, Hanxun Yu, Bohan Yu et al.

Online video understanding is essential for applications like public surveillance and AI glasses. However, applying Multimodal Large Language Models (MLLMs) to this domain is challenging due to the large number of video frames, resulting in high GPU memory usage and computational latency. To address these challenges, we propose token pruning as a means to reduce context length while retaining critical information. Specifically, we introduce a novel redundancy metric, Maximum Similarity to Spatially Adjacent Video Tokens (MSSAVT), which accounts for both token similarity and spatial position. To mitigate the bidirectional dependency between pruning and redundancy, we further design a masked pruning strategy that ensures only mutually unadjacent tokens are pruned. We also integrate an existing temporal redundancy-based pruning method to eliminate temporal redundancy of the video modality. Experimental results on multiple online and offline video understanding benchmarks demonstrate that our method significantly improves the accuracy (i.e., by 4% at most) while incurring a negligible pruning latency (i.e., less than 1 ms). Our full implementation will be made publicly available.
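
The redundancy metric and masked pruning described above can be illustrated with a small sketch. It scores each token on a 2-D grid by its maximum cosine similarity to its four spatial neighbours, then prunes only among positions of one checkerboard colour so that no two pruned tokens are adjacent. The abstract does not give the exact formula or mask, so the 4-neighbourhood, the checkerboard mask, the 0.95 threshold, and the function names are all assumptions for illustration rather than the paper's method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def max_neighbor_similarity(grid, r, c):
    """Highest cosine similarity between token (r, c) and its 4-neighbours."""
    rows, cols = len(grid), len(grid[0])
    best = -1.0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            best = max(best, cosine(grid[r][c], grid[nr][nc]))
    return best


def prune(grid, threshold=0.95):
    """Return the set of (row, col) positions to drop.

    Only positions with (row + col) even are candidates (an assumed
    checkerboard mask), so two pruned tokens are never 4-adjacent and
    every pruned token keeps the neighbour that made it redundant.
    """
    pruned = set()
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if (r + c) % 2 == 0 and max_neighbor_similarity(grid, r, c) >= threshold:
                pruned.add((r, c))
    return pruned


# 2x3 grid of 2-D token features; the left two columns are near-identical.
grid = [
    [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 0.0], [0.5, -1.0]],
]
dropped = prune(grid)
```

Without the mask, two identical neighbours could each be pruned because of the other, leaving no representative behind; restricting candidates to mutually non-adjacent positions avoids that circular dependency.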

IV · Nov 17, 2025
Self-Supervised Compression and Artifact Correction for Streaming Underwater Imaging Sonar

Rongsheng Qian, Chi Xu, Xiaoqiang Ma et al.

Real-time imaging sonar has become an important tool for underwater monitoring in environments where optical sensing is unreliable. Its broader use is constrained by two coupled challenges: highly limited uplink bandwidth and severe sonar-specific artifacts (speckle, motion blur, reverberation, acoustic shadows) that affect up to 98% of frames. We present SCOPE, a self-supervised framework that jointly performs compression and artifact correction without clean-noise pairs or synthetic assumptions. SCOPE combines (i) Adaptive Codebook Compression (ACC), which learns frequency-encoded latent representations tailored to sonar, with (ii) Frequency-Aware Multiscale Segmentation (FAMS), which decomposes frames into low-frequency structure and sparse high-frequency dynamics while suppressing rapidly fluctuating artifacts. A hedging training strategy further guides frequency-aware learning using low-pass proxy pairs generated without labels. Evaluated on months of in-situ ARIS sonar data, SCOPE achieves a structural similarity index (SSIM) of 0.77, representing a 40% improvement over prior self-supervised denoising baselines, at bitrates down to <= 0.0118 bpp. It reduces uplink bandwidth by more than 80% while improving downstream detection. The system runs in real time, with 3.1 ms encoding on an embedded GPU and 97 ms full multi-layer decoding on the server end. SCOPE has been deployed for months in three Pacific Northwest rivers to support real-time salmon enumeration and environmental monitoring in the wild. Results demonstrate that learning frequency-structured latents enables practical, low-bitrate sonar streaming with preserved signal details under real-world deployment conditions.

AI · May 10, 2025
Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers

Chi Xu, Yili Jin, Sami Ma et al.

Wild salmon are essential to the ecological, economic, and cultural sustainability of the North Pacific Rim. Yet climate variability, habitat loss, and data limitations in remote ecosystems that lack basic infrastructure support pose significant challenges to effective fisheries management. This project explores the integration of multimodal foundation AI and expert-in-the-loop frameworks to enhance wild salmon monitoring and sustainable fisheries management in Indigenous rivers across the Pacific Northwest. By leveraging video and sonar-based monitoring, we develop AI-powered tools for automated species identification, counting, and length measurement, reducing manual effort, expediting delivery of results, and improving decision-making accuracy. Expert validation and active learning frameworks ensure ecological relevance while reducing annotation burdens. To address unique technical and societal challenges, we bring together a cross-domain, interdisciplinary team of university researchers, fisheries biologists, Indigenous stewardship practitioners, government agencies, and conservation organizations. Through these collaborations, our research fosters ethical AI co-development, open data sharing, and culturally informed fisheries management.

LG · Dec 13, 2021
Optimal Rate Adaption in Federated Learning with Compressed Communications

Laizhong Cui, Xiaoxin Su, Yipeng Zhou et al.

Federated Learning (FL) incurs high communication overhead, which can be greatly alleviated by compression for model updates. Yet the tradeoff between compression and model accuracy in the networked environment remains unclear and, for simplicity, most implementations adopt a fixed compression rate only. In this paper, we for the first time systematically examine this tradeoff, identifying the influence of the compression error on the final model accuracy with respect to the learning rate. Specifically, we factor the compression error of each global iteration into the convergence rate analysis under both strongly convex and non-convex loss functions. We then present an adaptation framework to maximize the final model accuracy by strategically adjusting the compression rate in each iteration. We have discussed the key implementation issues of our framework in practical networks with representative compression algorithms. Experiments over the popular MNIST and CIFAR-10 datasets confirm that our solution effectively reduces network traffic yet maintains high model accuracy in FL.
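
To illustrate the trade-off between compression rate and compression error discussed in this abstract, the sketch below applies top-k sparsification to a model update at two rates and measures the squared error each one introduces. Top-k sparsification is used here only as one representative compressor; the abstract does not name the algorithms it evaluates, and the function names and example values are assumptions.

```python
def top_k_sparsify(update, rate):
    """Keep the largest-magnitude fraction `rate` of entries, zero the rest."""
    k = max(1, round(rate * len(update)))
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]


def compression_error(update, compressed):
    """Squared L2 distance between the original and compressed update."""
    return sum((a - b) ** 2 for a, b in zip(update, compressed))


update = [0.5, -2.0, 0.1, 1.5]
half = top_k_sparsify(update, rate=0.5)      # keeps -2.0 and 1.5
quarter = top_k_sparsify(update, rate=0.25)  # keeps -2.0 only

err_half = compression_error(update, half)
err_quarter = compression_error(update, quarter)
```

Sending fewer entries lowers traffic but raises the error injected into each global iteration, which is why adjusting the rate per iteration, rather than fixing it, can matter for final accuracy.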

LG · Dec 27, 2020
Federated Unlearning

Gaoyang Liu, Xiaoqiang Ma, Yang Yang et al.

Federated learning (FL) has recently emerged as a promising distributed machine learning (ML) paradigm. Practical needs of the "right to be forgotten" and countering data poisoning attacks call for efficient techniques that can remove, or unlearn, specific training data from the trained FL model. Existing unlearning techniques in the context of ML, however, are not applicable to FL, mainly due to the inherent distinction in how FL and ML learn from data. Therefore, how to enable efficient data removal from FL models remains largely under-explored. In this paper, we take the first step to fill this gap by presenting FedEraser, the first federated unlearning methodology that can eliminate the influence of a federated client's data on the global FL model while significantly reducing the time used for constructing the unlearned FL model. The basic idea of FedEraser is to trade the central server's storage for the unlearned model's construction time, where FedEraser reconstructs the unlearned model by leveraging the historical parameter updates of federated clients that have been retained at the central server during the training process of FL. A novel calibration method is further developed to calibrate the retained updates, which are then used to promptly construct the unlearned model, yielding a significant speed-up to the reconstruction of the unlearned model while maintaining the model efficacy. Experiments on four realistic datasets demonstrate the effectiveness of FedEraser, with an expected speed-up of 4× compared with retraining from scratch. We envision our work as an early step in FL towards compliance with legal and ethical criteria in a fair and transparent manner.

LGJul 7, 2020
Personalized Cross-Silo Federated Learning on Non-IID Data

Yutao Huang, Lingyang Chu, Zirui Zhou et al.

Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We propose FedAMP, a new method that employs federated attentive message passing to encourage similar clients to collaborate more. We establish the convergence of FedAMP for both convex and non-convex models, and propose a heuristic method to further improve the performance of FedAMP when clients adopt deep neural networks as personalized models. Our extensive experiments on benchmark data sets demonstrate the superior performance of the proposed methods.
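
The idea of letting similar clients collaborate more can be sketched as a similarity-weighted average of client models. This is not FedAMP's actual formulation: the Gaussian-kernel similarity, the `sigma` parameter, and the name `attentive_aggregate` are all assumptions chosen to keep the example small.

```python
import math

def attentive_aggregate(models, sigma=1.0):
    """Return one personalized aggregate per client.

    Each aggregate is a weighted average of all client models, where the
    weight grows with similarity to the client's own model (assumed:
    Gaussian kernel on squared Euclidean distance).
    """
    aggregates = []
    for w_i in models:
        scores = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(w_i, w_j)) / sigma)
            for w_j in models
        ]
        total = sum(scores)
        weights = [s / total for s in scores]
        aggregates.append(
            [sum(w * w_j[d] for w, w_j in zip(weights, models)) for d in range(len(w_i))]
        )
    return aggregates

if __name__ == "__main__":
    # Clients 0 and 1 hold similar models; client 2 is far from both.
    models = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
    for i, agg in enumerate(attentive_aggregate(models)):
        print(i, [round(x, 3) for x in agg])
```

In this toy setup the two similar clients pull toward each other while the distant client stays essentially unchanged, which is the qualitative behavior the abstract describes.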

MMAug 31, 2016
Towards Hybrid Cloud-assisted Crowdsourced Live Streaming: Measurement and Analysis

Cong Zhang, Jiangchuan Liu, Haiyang Wang

Crowdsourced Live Streaming (CLS), most notably Twitch.tv, has seen explosive growth in popularity in the past few years. In such systems, any user can broadcast live video content of interest to others, e.g., from a game player to many online viewers. To meet the demands of massive and heterogeneous broadcasters and viewers, expensive server clusters have been deployed to provide video ingesting and transcoding services. Despite the existence of highly popular channels, a significant portion of the channels is unpopular. Yet as our measurement shows, these broadcasters consume considerable system resources; in particular, 25% (resp. 30%) of bandwidth (resp. computation) resources are used by broadcasters who do not have any viewers at all. In this paper, we closely examine the challenge of handling unpopular live-broadcasting channels in CLS systems and present a comprehensive solution for service partitioning on a hybrid cloud. The trace-driven evaluation shows that our hybrid cloud-assisted design can smartly assign ingesting and transcoding tasks to elastic cloud virtual machines, providing flexible system deployment cost-effectively.

MMMay 29, 2016
Improving Crowdsourced Live Streaming with Aggregated Edge Networks

Chenglei Wu, Zhi Wang, Jiangchuan Liu et al.

Recent years have witnessed a dramatic increase in user-generated video services. Among them, crowdsourced live streaming (e.g., Periscope, Twitch) has significantly challenged today's edge network infrastructure: today's edge networks (e.g., 4G, Wi-Fi) offer limited uplink capacity, making high-bitrate live streaming over such links fundamentally impossible. In this paper, we propose to let broadcasters (i.e., users who generate the video) upload crowdsourced video streams using aggregated network resources from multiple edge networks. The proposal raises several challenges: First, how to design a framework that aggregates bandwidth from multiple edge networks? Second, how to make this framework transparent to today's crowdsourced live streaming services? Third, how to maximize the streaming quality for the whole system? We design BASS, a multi-objective and deployable bandwidth aggregation system, to address these challenges: (1) We propose an aggregation framework transparent to today's crowdsourced live streaming services, using an edge proxy box and aggregation cloud paradigm; (2) We dynamically allocate geo-distributed cloud aggregation servers to enable MPTCP (i.e., multi-path TCP), according to the location and network characteristics of both broadcasters and the original streaming servers; (3) We maximize the overall performance gain for the whole system by matching streams with the best aggregation paths.

MMFeb 23, 2015
Crowdsourced Live Streaming over the Cloud

Fei Chen, Cong Zhang, Feng Wang et al.

Empowered by today's rich tools for media generation and distribution, and by convenient Internet access, crowdsourced streaming generalizes the single-source streaming paradigm by including massive contributors for a video channel. It calls for a joint optimization along the path from crowdsourcers, through streaming servers, to the end-users to minimize the overall latency. The dynamics of the video sources, together with the globalized request demands and the high computation demand from each sourcer, make crowdsourced live streaming challenging even with powerful support from modern cloud computing. In this paper, we present a generic framework that facilitates a cost-effective cloud service for crowdsourced live streaming. Through adaptive leasing, the cloud servers can be provisioned at a fine granularity to accommodate geo-distributed video crowdsourcers. We present an optimal solution to deal with service migration among cloud instances of diverse lease prices. It also addresses the impact of location on the streaming quality. To understand the performance of the proposed strategies in the real world, we have built a prototype system running over PlanetLab and the Amazon/Microsoft Cloud. Our extensive experiments demonstrate the effectiveness of our solution in terms of deployment cost and streaming quality.

MMFeb 16, 2015
On Crowdsourced Interactive Live Streaming: A Twitch.TV-Based Measurement Study

Cong Zhang, Jiangchuan Liu

Empowered by today's rich tools for media generation and collaborative production, the multimedia service paradigm is shifting from the conventional single source, to multi-source, to many sources, and now toward crowdsource. Crowdsourced live streaming platforms such as Twitch.tv allow general users to broadcast their content to massive viewers, thereby greatly expanding the content and user bases. The resources available to these non-professional broadcasters, however, are limited and unstable, which can impair the streaming quality and viewers' experience. The diverse live interactions among the broadcasters and viewers can further aggravate the problem. In this paper, we present an initial investigation into modern crowdsourced live streaming systems. Taking Twitch as a representative, we outline their internal architecture using both crawled data and captured traffic of local broadcasters/viewers. Closely examining the access data collected in a two-month period, we reveal that the view patterns are determined by both events and broadcasters' sources. Our measurements explore the unique source- and event-driven views, showing that the current delay strategy on the viewer's side substantially impacts the viewers' interactive experience, and that there is significant disparity between the long broadcast latency and the short live messaging latency. On the broadcaster's side, the dynamic uploading capacity is a critical challenge, which noticeably affects the smoothness of live streaming for viewers.

MMDec 24, 2014
Mobile Instant Video Clip Sharing: Modeling and Enhancing View Experience

Lei Zhang, Feng Wang, Jiangchuan Liu

With the rapid development of wireless networking and mobile devices, anytime and anywhere data access has become readily available. Given crowdsourced content capturing and sharing, the preferred content length becomes shorter and shorter, even for multimedia data such as video. A representative is Twitter's Vine service, which, mainly targeting mobile users, enables them to create ultra-short video clips and instantly post and share them with their followers. In this paper, we present an initial study on this new generation of instant video clip sharing services enabled by mobile platforms and explore the potential for further enhancement. We closely investigate its unique mobile interface, revealing the key differences between Vine-enabled anytime anywhere data access patterns and those of traditional counterparts. We then examine the scheduling policy to maximize the user watching experience as well as efficiency in terms of monetary and energy costs. We show that the generic scheduling problem involves two subproblems, namely, pre-fetching scheduling and watch-time download scheduling, and develop effective solutions for both of them. The superiority of our solution is demonstrated by extensive trace-driven simulations. To the best of our knowledge, this is the first work on modeling and optimizing instant video clip sharing on mobile devices.

LGMar 22, 2014
Forecasting Popularity of Videos using Social Media

Jie Xu, Mihaela van der Schaar, Jiangchuan Liu et al.

This paper presents a systematic online prediction method (Social-Forecast) that is capable of accurately forecasting the popularity of videos promoted by social media. Social-Forecast explicitly considers the dynamically changing and evolving propagation patterns of videos in social media when making popularity forecasts, thereby being situation and context aware. Social-Forecast aims to maximize the forecast reward, which is defined as a tradeoff between the popularity prediction accuracy and the timeliness with which a prediction is issued. The forecasting is performed online and requires no training phase or a priori knowledge. We analytically bound the prediction performance loss of Social-Forecast as compared to that obtained by an omniscient oracle and prove that the bound is sublinear in the number of video arrivals, thereby guaranteeing its short-term performance as well as its asymptotic convergence to the optimal performance. In addition, we conduct extensive experiments using real-world data traces collected from the videos shared in RenRen, one of the largest online social networks in China. These experiments show that our proposed method outperforms existing view-based approaches for popularity prediction (which are not context-aware) by more than 30% in terms of prediction rewards.