SY | Oct 15, 2023 | Code
BONES: Near-Optimal Neural-Enhanced Video Streaming
Lingdong Wang, Simran Singh, Jacob Chakareski et al.
Accessing high-quality video content can be challenging due to insufficient and unstable network bandwidth. Recent advances in neural enhancement have shown promising results in improving the quality of degraded videos through deep learning. Neural-Enhanced Streaming (NES) incorporates this new approach into video streaming, allowing users to download low-quality video segments and then enhance them to obtain high-quality content without violating the playback of the video stream. We introduce BONES, an NES control algorithm that jointly manages the network and computational resources to maximize the quality of experience (QoE) of the user. BONES formulates NES as a Lyapunov optimization problem and solves it in an online manner with near-optimal performance, making it the first NES algorithm to provide a theoretical performance guarantee. Comprehensive experimental results indicate that BONES increases QoE by 5% to 20% over state-of-the-art algorithms with minimal overhead. Our code is available at https://github.com/UMass-LIDS/bones.
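To give a feel for what a Lyapunov-style online controller looks like, here is a minimal sketch of a BOLA-style drift-plus-penalty rule. It is not the BONES objective: the abstract does not state it, and this sketch ignores the computational queue that BONES manages jointly with the network. The `Option` type, the parameters `v` and `gamma_p`, and the idea of folding enhancement into a single `utility` number are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    """Hypothetical download choice (for example, a bitrate plus an enhancement level)."""
    name: str
    size: float     # download size of one segment, arbitrary units
    utility: float  # assumed perceived quality after any enhancement


def choose_option(options: Sequence[Option], buffer_segments: float,
                  v: float, gamma_p: float) -> Optional[Option]:
    """Pick the option with the best positive drift-plus-penalty score.

    score = (v * (utility + gamma_p) - buffer_segments) / size
    Returns None when no score is positive, meaning "wait, the buffer is full enough".
    """
    best: Optional[Option] = None
    best_score = 0.0
    for opt in options:
        score = (v * (opt.utility + gamma_p) - buffer_segments) / opt.size
        if score > best_score:
            best, best_score = opt, score
    return best


if __name__ == "__main__":
    ladder = [Option("low", 1.0, 0.0), Option("high", 4.0, 1.5)]
    for q in (0.0, 3.0, 11.0):
        pick = choose_option(ladder, q, v=4.0, gamma_p=1.0)
        print(q, pick.name if pick else "wait")
```

The rule needs no bandwidth prediction: a fuller buffer shifts the choice toward larger, higher-utility segments, and past a threshold the controller pauses downloading. The weight `v` trades quality against buffer occupancy.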
CV | Sep 12, 2022 | Code
CU-Net: Real-Time High-Fidelity Color Upsampling for Point Clouds
Lingdong Wang, Mohammad Hajiesmaili, Jacob Chakareski et al.
Point cloud upsampling is essential for high-quality augmented reality, virtual reality, and telepresence applications, due to the capture, processing, and communication limitations of existing technologies. Although geometry upsampling to densify a point cloud's coordinates has been well studied, the upsampling of the color attributes has been largely overlooked. In this paper, we propose CU-Net, the first deep-learning point cloud color upsampling model that enables low-latency, high-visual-fidelity operation. CU-Net achieves linear time and space complexity by leveraging a feature extractor based on sparse convolution and a color prediction module based on a neural implicit function. Therefore, CU-Net is theoretically guaranteed to be more efficient than most existing methods, which have quadratic complexity. Experimental results demonstrate that CU-Net can colorize a photo-realistic point cloud with nearly a million points in real time, while having notably better visual performance than baselines. In addition, CU-Net can adapt to arbitrary upsampling ratios and unseen objects without retraining. Our source code is available at https://github.com/UMass-LIDS/cunet.
DS | Apr 9, 2013
Dynamic Provisioning in Next-Generation Data Centers with On-site Power Production
Jinlong Tu, Lian Lu, Minghua Chen et al.
The critical need for clean and economical sources of energy is transforming data centers from primarily energy consumers into energy producers as well. We focus on minimizing the operating costs of next-generation data centers that can jointly optimize the energy supply from on-site generators and the power grid, and the energy demand from servers as well as power conditioning and cooling systems. We formulate the cost minimization problem and present an offline optimal algorithm. For "on-grid" data centers that use only the grid, we devise a deterministic online algorithm that achieves the best possible competitive ratio of $2-\alpha_{s}$, where $\alpha_{s}$ is a normalized look-ahead window size. For "hybrid" data centers that have on-site power generation in addition to the grid, we develop an online algorithm that achieves a competitive ratio of at most $\frac{P_{\max}(2-\alpha_{s})}{c_{o}+c_{m}/L}\left[1+2\frac{P_{\max}-c_{o}}{P_{\max}(1+\alpha_{g})}\right]$, where $\alpha_{s}$ and $\alpha_{g}$ are normalized look-ahead window sizes, $P_{\max}$ is the maximum grid power price, and $L$, $c_{o}$, and $c_{m}$ are parameters of an on-site generator. Using extensive workload traces from Akamai with the corresponding grid power prices, we simulate our offline and online algorithms in a realistic setting. Our offline (resp., online) algorithm achieves a cost reduction of 25.8% (resp., 20.7%) for a hybrid data center and 12.3% (resp., 7.3%) for an on-grid data center. These cost reductions are significant and make a strong case for jointly optimizing energy supply and energy demand in a data center. A hybrid data center provides about 13% additional cost reduction over an on-grid data center, representing the additional cost benefit that on-site power generation provides over using the grid alone.
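The two competitive-ratio expressions above are easy to evaluate numerically. The sketch below simply transcribes them into Python; the function names and the sample parameter values are illustrative assumptions, not values from the paper, and the look-ahead parameters are assumed to lie in [0, 1] because they are described as normalized.

```python
def on_grid_ratio(alpha_s: float) -> float:
    """Competitive ratio 2 - alpha_s for the on-grid online algorithm."""
    return 2.0 - alpha_s


def hybrid_ratio_bound(p_max: float, c_o: float, c_m: float, L: float,
                       alpha_s: float, alpha_g: float) -> float:
    """Upper bound on the competitive ratio for the hybrid online algorithm.

    p_max: maximum grid power price
    c_o, c_m, L: on-site generator parameters (as named in the abstract)
    alpha_s, alpha_g: normalized look-ahead window sizes (assumed in [0, 1])
    """
    lead = p_max * (2.0 - alpha_s) / (c_o + c_m / L)
    tail = 1.0 + 2.0 * (p_max - c_o) / (p_max * (1.0 + alpha_g))
    return lead * tail


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the formula.
    print(on_grid_ratio(0.5))
    print(hybrid_ratio_bound(p_max=2.0, c_o=1.0, c_m=1.0, L=1.0,
                             alpha_s=1.0, alpha_g=1.0))
```

Both expressions shrink as the look-ahead parameters grow, which matches the intuition that more look-ahead narrows the gap to the offline optimum.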
78.3 | CV | Apr 6
Low-Bitrate Video Compression through Semantic-Conditioned Diffusion
Lingdong Wang, Guan-Ming Su, Divya Kothandaraman et al.
Traditional video codecs optimized for pixel fidelity collapse at ultra-low bitrates and produce severe artifacts. This failure arises from a fundamental misalignment between pixel accuracy and human perception. We propose a semantic video compression framework named DiSCo that transmits only the most meaningful information while relying on generative priors for detail synthesis. The source video is decomposed into three compact modalities: a textual description, a spatiotemporally degraded video, and optional sketches or poses that respectively capture semantic, appearance, and motion cues. A conditional video diffusion model then reconstructs high-quality, temporally coherent videos from these compact representations. Temporal forward filling, token interleaving, and modality-specific codecs are proposed to improve multimodal generation and modality compactness. Experiments show that our method outperforms baseline semantic and traditional codecs by 2-10X on perceptual metrics at low bitrates.
93.4 | IV | May 18
CATRF: Codec-Adaptive TriPlane Radiance Fields for Volumetric Content Delivery
Tung-I Chen, Lingdong Wang, Subhransu Maji et al.
Volumetric media promises next-generation content delivery applications, but its bandwidth demand remains a key bottleneck. Implicit and hybrid volumetric representations reduce model sizes, yet still require careful coding to reach 2D video-like bitrates. We present CATRF, a standard-codec-in-the-loop compression framework for plane-factorized radiance fields. During training, we quantize and pack 2D feature planes into codec-friendly canvases, run a standard codec roundtrip (JPEG/VP9/HEVC/AV1), then unpack and dequantize the decoded features before volume rendering. We use a straight-through estimator (STE) to insert the non-differentiable, standard codec pipeline into the training loop, allowing radiance-field features to adapt directly to the real, client-side codec distortions without introducing any learned codec parameters. On both static and dynamic benchmarks, CATRF consistently achieves a better rate-distortion trade-off over codec-agnostic and learned-codec-in-the-loop baselines, and also outperforms recent compressed 3DGS methods in both compression efficiency and decoding speed. These results highlight a practical path toward low-bitrate, compression-resilient volumetric representations for free-viewpoint video streaming.
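The straight-through estimator mentioned above can be illustrated with a toy example. The sketch below is not the CATRF pipeline: a uniform quantizer stands in for the real codec roundtrip, a single scalar stands in for a feature plane, and the gradient is written out by hand instead of using an autodiff framework. All names are hypothetical.

```python
def codec_roundtrip(x: float, step: float = 0.25) -> float:
    """Stand-in for a lossy, non-differentiable codec: uniform quantization."""
    return round(x / step) * step


def ste_train(x: float, target: float, lr: float = 0.1, steps: int = 20) -> float:
    """Minimize (codec_roundtrip(x) - target)^2 with a straight-through gradient."""
    for _ in range(steps):
        y = codec_roundtrip(x)       # forward pass sees the decoded value
        grad_y = 2.0 * (y - target)  # exact derivative of the loss w.r.t. y
        grad_x = grad_y              # STE: pretend d(codec_roundtrip)/dx == 1
        x -= lr * grad_x
    return x


if __name__ == "__main__":
    x_final = ste_train(x=0.0, target=0.75)
    print(x_final, codec_roundtrip(x_final))
```

The true derivative of the quantizer is zero almost everywhere, so plain gradient descent would leave `x` stuck at its initial value. Passing the gradient straight through lets the parameter move until its decoded value matches the target. In an autodiff framework the same idea is usually written as `x + stop_gradient(codec(x) - x)`.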
LG | Sep 7, 2025
Smoothed Online Optimization for Target Tracking: Robust and Learning-Augmented Algorithms
Ali Zeynali, Mahsa Sahebdel, Qingsong Liu et al.
We introduce the Smoothed Online Optimization for Target Tracking (SOOTT) problem, a new framework that integrates three key objectives in online decision-making under uncertainty: (1) tracking cost for following a dynamically moving target, (2) adversarial perturbation cost for withstanding unpredictable disturbances, and (3) switching cost for penalizing abrupt changes in decisions. This formulation captures real-world scenarios such as elastic and inelastic workload scheduling in AI clusters, where operators must balance long-term service-level agreements (e.g., LLM training) against sudden demand spikes (e.g., real-time inference). We first present BEST, a robust algorithm with provable competitive guarantees for SOOTT. To enhance practical performance, we introduce CoRT, a learning-augmented variant that incorporates untrusted black-box predictions (e.g., from ML models) into its decision process. Our theoretical analysis shows that CoRT strictly improves over BEST when predictions are accurate, while maintaining robustness under arbitrary prediction errors. We validate our approach through a case study on workload scheduling, demonstrating that both algorithms effectively balance trajectory tracking, decision smoothness, and resilience to external disturbances.
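The tension between tracking and switching costs can be seen in a few lines of code. The sketch below is only an illustration under assumed cost forms (squared tracking error and absolute switching distance); the abstract does not specify the functional forms used in SOOTT, and the adversarial perturbation cost is omitted entirely. The function name and parameters are hypothetical.

```python
from typing import Sequence, Tuple


def smoothed_cost(decisions: Sequence[float], targets: Sequence[float],
                  switch_weight: float = 1.0, start: float = 0.0) -> Tuple[float, float]:
    """Return (tracking_cost, switching_cost) for a decision trajectory.

    tracking:  sum of squared gaps between each decision and its target (assumed form)
    switching: switch_weight * sum of absolute changes between consecutive decisions,
               starting from `start` (assumed form)
    """
    if len(decisions) != len(targets):
        raise ValueError("decisions and targets must have the same length")
    tracking = sum((x - y) ** 2 for x, y in zip(decisions, targets))
    previous = [start] + list(decisions[:-1])
    switching = switch_weight * sum(abs(x - p) for x, p in zip(decisions, previous))
    return tracking, switching


if __name__ == "__main__":
    targets = [0.0, 2.0, 0.0]
    print(smoothed_cost([0.0, 2.0, 0.0], targets))  # follow the target exactly
    print(smoothed_cost([0.0, 1.0, 0.0], targets))  # move only halfway
```

Following the target exactly removes the tracking cost but pays the full switching cost, while the smoother trajectory accepts some tracking error in exchange for smaller moves. An online algorithm has to strike this balance without seeing future targets.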
LG | Dec 21, 2024
Towards Environmentally Equitable AI
Mohammad Hajiesmaili, Shaolei Ren, Ramesh K. Sitaraman et al.
The skyrocketing demand for artificial intelligence (AI) has created an enormous appetite for globally deployed power-hungry servers. As a result, the environmental footprint of AI systems has come under increasing scrutiny. More crucially, the current way that we exploit AI workloads' flexibility and manage AI systems can lead to wildly different environmental impacts across locations, increasingly raising environmental inequity concerns and creating unintended sociotechnical consequences. In this paper, we advocate environmental equity as a priority for the management of future AI systems, advancing the boundaries of existing resource management for sustainable AI and also adding a unique dimension to AI fairness. Concretely, we uncover the potential of equity-aware geographical load balancing to fairly re-distribute the environmental cost across different regions, followed by algorithmic challenges. We conclude by discussing a few future directions to exploit the full potential of system management approaches to mitigate AI's environmental inequity.