DC | Jul 31, 2022
Adaptive Edge Offloading for Image Classification Under Rate Limit
Jiaming Qiu, Ruiqi Wang, Ayan Chakrabarti et al.
This paper considers a setting where embedded devices are used to acquire and classify images. Because of limited computing capacity, embedded devices rely on a parsimonious classification model with uneven accuracy. When local classification is deemed inaccurate, devices can decide to offload the image to an edge server with a more accurate but resource-intensive model. Resource constraints, e.g., network bandwidth, however, require regulating such transmissions to avoid congestion and high latency. The paper investigates this offloading problem when transmissions are regulated by a token bucket, a mechanism commonly used for such purposes. The goal is to devise a lightweight, online offloading policy that optimizes an application-specific metric (e.g., classification accuracy) under the constraints of the token bucket. The paper develops a policy based on a Deep Q-Network (DQN), and demonstrates both its efficacy and the feasibility of its deployment on embedded devices. Notably, the policy can handle complex input patterns, including correlation in image arrivals and classification accuracy. The evaluation is carried out by performing image classification over a local testbed using synthetic traces generated from the ImageNet image classification benchmark. Implementation of this work is available at https://github.com/qiujiaming315/edgeml-dqn.
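To make the token bucket constraint concrete, the following is a minimal sketch of how a bucket gates offloading decisions. It is not the paper's DQN policy: the fixed confidence threshold is a deliberately simple stand-in, and the names `TokenBucket`, `threshold_policy`, `rate`, `capacity`, and `threshold` are hypothetical, as is the assumption that each offloaded image costs one token and that one image arrives per time step.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float      # tokens added per time step (assumed unit: one step per image)
    capacity: float  # maximum number of tokens the bucket can hold
    tokens: float    # current token count

    def tick(self) -> None:
        """Advance one time step, refilling tokens up to the capacity."""
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def try_consume(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return whether the send is allowed."""
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def threshold_policy(confidences, bucket, threshold=0.6):
    """Baseline (not the DQN): offload when local confidence is low and a token exists."""
    decisions = []
    for conf in confidences:
        bucket.tick()
        # Short-circuit: a token is only spent when the local result looks unreliable.
        offload = conf < threshold and bucket.try_consume()
        decisions.append(offload)
    return decisions


if __name__ == "__main__":
    bucket = TokenBucket(rate=0.5, capacity=2.0, tokens=0.0)
    print(threshold_policy([0.2, 0.3, 0.9, 0.1, 0.1], bucket))
```

A greedy rule like this spends tokens as soon as they are available, which is exactly the behavior a learned policy can improve on by saving tokens for images where offloading is expected to help more.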
DC | Oct 8, 2023
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints
Ruiqi Wang, Hanyang Liu, Jiaming Qiu et al.
IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.
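The two ideas in this abstract, dropping a random-length tail of the feature vector during training and using only the prefix received by the deadline at inference, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the uniform choice of how many features to keep, the zero-filling of missing features, and the fixed per-feature size are all hypothetical.

```python
import random


def stochastic_taildrop(features, rng):
    """Training-time sketch: keep a random-length prefix and zero the tail.

    Because later features are dropped more often, a model trained this way
    is pushed to place the most useful information in the earliest features.
    """
    keep = rng.randint(1, len(features))  # inclusive on both ends
    return features[:keep] + [0.0] * (len(features) - keep), keep


def receive_by_deadline(features, bytes_per_feature, bandwidth_bytes_per_s, deadline_s):
    """Inference-time sketch: only the features that fit in the budget arrive."""
    budget = int(bandwidth_bytes_per_s * deadline_s // bytes_per_feature)
    n = max(0, min(len(features), budget))
    # Assumption: features that missed the deadline are replaced by zeros.
    return features[:n] + [0.0] * (len(features) - n)


if __name__ == "__main__":
    rng = random.Random(0)
    encoded = [0.9, 0.7, 0.4, 0.2, 0.1]
    print(stochastic_taildrop(encoded, rng))
    print(receive_by_deadline(encoded, bytes_per_feature=2,
                              bandwidth_bytes_per_s=10, deadline_s=0.5))
```

Since features are sent in order of importance, a lower bandwidth simply shortens the prefix that arrives, and the classifier degrades gradually instead of failing on an incomplete fixed-size payload.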
2.4 | NI | Apr 27
On the Benefits of Traffic "Reprofiling" -- The Multiple Hops Case -- Part II
Jiaming Qiu, Roch Guerin
Delivering hard delay guarantees over packet networks is increasingly important to applications such as automotive systems, avionics, and industrial control. Traffic control and schedulers play an essential role in enforcing such guarantees. In this paper, we focus on "simple" static priority and FIFO schedulers, and explore how reprofiling flows entering the network, i.e., proactively shaping them to a different traffic profile, can deliver delay guarantees with less bandwidth. To that end, we formulate a joint optimization framework and develop efficient algorithms to solve it. Extensive evaluations across both realistic and synthetic topologies demonstrate that, as with more sophisticated schedulers, reprofiling flows is beneficial. They also highlight an intuitive coupling between a scheduler's capability and its ability to leverage more complex reprofiling solutions.
DC | Oct 24, 2024
Optimizing Edge Offloading Decisions for Object Detection
Jiaming Qiu, Ruiqi Wang, Brooks Hu et al.
Recent advances in machine learning and hardware have produced embedded devices capable of performing real-time object detection with commendable accuracy. We consider a scenario in which embedded devices rely on an onboard object detector, but have the option to offload detection to a more powerful edge server when local accuracy is deemed too low. Resource constraints, however, limit the number of images that can be offloaded to the edge. Our goal is to identify which images to offload to maximize overall detection accuracy under those constraints. To that end, the paper introduces a reward metric designed to quantify potential accuracy improvements from offloading individual images, and proposes an efficient approach to make offloading decisions by estimating this reward based only on local detection results. The approach is computationally frugal enough to run on embedded devices, and empirical findings indicate that it outperforms existing alternatives in improving detection accuracy even when the fraction of offloaded images is small.
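The selection step described in this abstract can be illustrated with a short sketch: given a per-image estimate of the accuracy gain from offloading, send the images with the highest estimates until the budget is exhausted. The abstract does not specify how the reward is estimated or how the budget is enforced, so `select_offloads`, the `budget_fraction` parameter, and the batch (offline) top-k rule are hypothetical simplifications.

```python
def select_offloads(estimated_rewards, budget_fraction):
    """Return the indices of images to offload under a fractional budget.

    estimated_rewards: per-image estimates of the accuracy gain from offloading
                       (assumed to be computed from local detection results).
    budget_fraction:   fraction of images allowed to be offloaded (assumed form
                       of the resource constraint).
    """
    n = len(estimated_rewards)
    k = int(n * budget_fraction)  # number of images the budget allows
    # Rank image indices by estimated reward, highest first.
    ranked = sorted(range(n), key=lambda i: estimated_rewards[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    rewards = [0.1, 0.8, 0.3, 0.9]
    print(select_offloads(rewards, budget_fraction=0.5))
```

An embedded device sees images one at a time, so an online variant would need to compare each estimate against a threshold instead of ranking a whole batch.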