Marco Levorato

h-index40

30papers

863citations

Novelty51%

AI Score44

Ranked #74,705 of 205,806 authors (top 36%)#16,755 in LG (top 40%)

30 Papers

ITJan 30, 2013

Cognitive Access Policies under a Primary ARQ process via Forward-Backward Interference Cancellation

Nicolò Michelusi, Petar Popovski, Osvaldo Simeone et al.

This paper introduces a novel technique for access by a cognitive Secondary User (SU) using best-effort transmission to a spectrum with an incumbent Primary User (PU), which uses Type-I Hybrid ARQ. The technique leverages the primary ARQ protocol to perform Interference Cancellation (IC) at the SU receiver (SUrx). Two IC mechanisms that work in concert are introduced: Forward IC, where SUrx, after decoding the PU message, cancels its interference in the (possible) following PU retransmissions of the same message, to improve the SU throughput; Backward IC, where SUrx performs IC on previous SU transmissions, whose decoding failed due to severe PU interference. Secondary access policies are designed that determine the secondary access probability in each state of the network so as to maximize the average long-term SU throughput by opportunistically leveraging IC, while causing bounded average long-term PU throughput degradation and SU power expenditure. It is proved that the optimal policy prescribes that the SU prioritizes its access in the states where SUrx knows the PU message, thus enabling IC. An algorithm is provided to optimally allocate additional secondary access opportunities in the states where the PU message is unknown. Numerical results are shown to assess the throughput gain provided by the proposed techniques.

SYSep 20, 2010

Optimization of ARQ Protocols in Interference Networks with QoS Constraints

Marco Levorato, Daniel O'Neill, Andrea Goldsmith et al.

We study optimal transmission strategies in interfering wireless networks, under Quality of Service constraints. A buffered, dynamic network with multiple sources is considered, and sources use a retransmission strategy in order to improve packet delivery probability. The optimization problem is formulated as a Markov Decision Process, where constraints and objective functions are ratios of time-averaged cost functions. The optimal strategy is found as the solution of a Linear Fractional Program, where the optimization variables are the steady-state probability of state-action pairs. Numerical results illustrate the dependence of optimal transmission/interference strategies on the constraints imposed on the network.

LGDec 2, 2022

Matching DNN Compression and Cooperative Training with Resources and Data Availability

Francesco Malandrino, Giuseppe Di Giacomo, Armin Karamzade et al.

To make machine learning (ML) sustainable and apt to run on the diverse devices where relevant data is, it is essential to compress ML models as needed, while still meeting the required learning quality and time performance. However, how much and when an ML model should be compressed, and {\em where} its training should be executed, are hard decisions to make, as they depend on the model itself, the resources of the available nodes, and the data such nodes own. Existing studies focus on each of those aspects individually, however, they do not account for how such decisions can be made jointly and adapted to one another. In this work, we model the network system focusing on the training of DNNs, formalize the above multi-dimensional problem, and, given its NP-hardness, formulate an approximate dynamic programming problem that we solve through the PACT algorithmic framework. Importantly, PACT leverages a time-expanded graph representing the learning process, and a data-driven and theoretical approach for the prediction of the loss evolution to be expected as a consequence of training decisions. We prove that PACT's solutions can get as close to the optimum as desired, at the cost of an increased time complexity, and that, in any case, such complexity is polynomial. Numerical results also show that, even under the most disadvantageous settings, PACT outperforms state-of-the-art alternatives and closely matches the optimal energy cost.

LGMar 16, 2022

SC2 Benchmark: Supervised Compression for Split Computing

Yoshitomo Matsubara, Ruihan Yang, Marco Levorato et al.

With the increasing demand for deep learning models on mobile devices, splitting neural network computation between the device and a more powerful edge server has become an attractive solution. However, existing split computing approaches often underperform compared to a naive baseline of remote computation on compressed data. Recent studies propose learning compressed representations that contain more relevant information for supervised downstream tasks, showing improved tradeoffs between compressed data size and supervised performance. However, existing evaluation metrics only provide an incomplete picture of split computing. This study introduces supervised compression for split computing (SC2) and proposes new evaluation criteria: minimizing computation on the mobile device, minimizing transmitted data size, and maximizing model accuracy. We conduct a comprehensive benchmark study using 10 baseline methods, three computer vision tasks, and over 180 trained models, and discuss various aspects of SC2. We also release sc2bench, a Python package for future research on SC2. Our proposed metrics and package will help researchers better understand the tradeoffs of supervised compression in split computing.

LGApr 28, 2023

Active Reinforcement Learning for Personalized Stress Monitoring in Everyday Settings

Ali Tazarv, Sina Labbaf, Amir Rahmani et al.

Most existing sensor-based monitoring frameworks presume that a large available labeled dataset is processed to train accurate detection models. However, in settings where personalization is necessary at deployment time to fine-tune the model, a person-specific dataset needs to be collected online by interacting with the users. Optimizing the collection of labels in such phase is instrumental to impose a tolerable burden on the users while maximizing personal improvement. In this paper, we consider a fine-grain stress detection problem based on wearable sensors targeting everyday settings, and propose a novel context-aware active learning strategy capable of jointly maximizing the meaningfulness of the signal samples we request the user to label and the response rate. We develop a multilayered sensor-edge-cloud platform to periodically capture physiological signals and process them in real-time, as well as to collect labels and retrain the detection model. We collect a large dataset and show that the context-aware active learning technique we propose achieves a desirable detection performance using 88\% and 32\% fewer queries from users compared to a randomized strategy and a traditional active learning strategy, respectively.

SYNov 23, 2015

Modeling And Control Battery Aging in Energy Harvesting Systems

Roberto Valentini, Nga Dang, Marco Levorato et al.

Energy storage is a fundamental component for the development of sustainable and environment-aware technologies. One of the critical challenges that needs to be overcome is preserving the State of Health (SoH) in energy harvesting systems, where bursty arrival of energy and load may severely degrade the battery. Tools from Markov process and Dynamic Programming theory are becoming an increasingly popular choice to control dynamics of these systems due to their ability to seamlessly incorporate heterogeneous components and support a wide range of applications. Mapping aging rate measures to fit within the boundaries of these tools is non-trivial. In this paper, a framework for modeling and controlling the aging rate of batteries based on Markov process theory is presented. Numerical results illustrate the tradeoff between battery degradation and task completion delay enabled by the proposed framework.

LGJun 22, 2023

Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems

Juliano S. Assine, J. C. S. Santos Filho, Eduardo Valle et al.

The execution of large deep neural networks (DNN) at mobile edge devices requires considerable consumption of critical resources, such as energy, while imposing demands on hardware capabilities. In approaches based on edge computing the execution of the models is offloaded to a compute-capable device positioned at the edge of 5G infrastructures. The main issue of the latter class of approaches is the need to transport information-rich signals over wireless links with limited and time-varying capacity. The recent split computing paradigm attempts to resolve this impasse by distributing the execution of DNN models across the layers of the systems to reduce the amount of data to be transmitted while imposing minimal computing load on mobile devices. In this context, we propose a novel split computing approach based on slimmable ensemble encoders. The key advantage of our design is the ability to adapt computational load and transmitted data size in real-time with minimal overhead and time. This is in contrast with existing approaches, where the same adaptation requires costly context switching and model loading. Moreover, our model outperforms existing solutions in terms of compression efficacy and execution time, especially in the context of weak mobile devices. We present a comprehensive comparison with the most advanced split computing solutions, as well as an experimental evaluation on GPU-less devices.

LGJul 11, 2024

Enhancing Performance and User Engagement in Everyday Stress Monitoring: A Context-Aware Active Reinforcement Learning Approach

Seyed Amir Hossein Aqajari, Ziyu Wang, Ali Tazarv et al.

In today's fast-paced world, accurately monitoring stress levels is crucial. Sensor-based stress monitoring systems often need large datasets for training effective models. However, individual-specific models are necessary for personalized and interactive scenarios. Traditional methods like Ecological Momentary Assessments (EMAs) assess stress but struggle with efficient data collection without burdening users. The challenge is to timely send EMAs, especially during stress, balancing monitoring efficiency and user convenience. This paper introduces a novel context-aware active reinforcement learning (RL) algorithm for enhanced stress detection using Photoplethysmography (PPG) data from smartwatches and contextual data from smartphones. Our approach dynamically selects optimal times for deploying EMAs, utilizing the user's immediate context to maximize label accuracy and minimize intrusiveness. Initially, the study was executed in an offline environment to refine the label collection process, aiming to increase accuracy while reducing user burden. Later, we integrated a real-time label collection mechanism, transitioning to an online methodology. This shift resulted in an 11% improvement in stress detection efficiency. Incorporating contextual data improved model accuracy by 4%. Personalization studies indicated a 10% enhancement in AUC-ROC scores, demonstrating better stress level differentiation. This research marks a significant move towards personalized, context-driven real-time stress monitoring methods.

CLSep 16, 2024

Improving Multi-candidate Speculative Decoding

Xiaofan Lu, Yixiao Zeng, Feiyang Ma et al.

Speculative Decoding (SD) is a technique to accelerate the inference of Large Language Models (LLMs) by using a lower complexity draft model to propose candidate tokens verified by a larger target model. To further improve efficiency, Multi-Candidate Speculative Decoding (MCSD) improves upon this by sampling multiple candidate tokens from the draft model at each step and verifying them in parallel, thus increasing the chances of accepting a token and reducing generation time. Existing MCSD methods rely on the draft model to initialize the multi-candidate sequences and use static length and tree attention structure for draft generation. However, such an approach suffers from the draft and target model's output distribution differences, especially in a dynamic generation context. In this work, we introduce a new version of MCSD that includes a target model initialized multi-candidate generation, a dynamic sliced topology-aware causal mask for dynamic length adjustment, and decision models to optimize early stopping. We experimented with our method on Llama 2-7B and its variants and observed a maximum 27.5% speedup compared to our MCSD baseline across three benchmarks with Llama 2-7B as the target model and JackFram 68M as the draft model. Additionally, we evaluate the effects of using the target model initialized multi-candidate process with different draft models on output quality.

NIOct 12, 2023

SplitBeam: Effective and Efficient Beamforming in Wi-Fi Networks Through Split Computing

Niloofar Bahadori, Yoshitomo Matsubara, Marco Levorato et al.

Modern IEEE 802.11 (Wi-Fi) networks extensively rely on multiple-input multiple-output (MIMO) to significantly improve throughput. To correctly beamform MIMO transmissions, the access point needs to frequently acquire a beamforming matrix (BM) from each connected station. However, the size of the matrix grows with the number of antennas and subcarriers, resulting in an increasing amount of airtime overhead and computational load at the station. Conventional approaches come with either excessive computational load or loss of beamforming precision. For this reason, we propose SplitBeam, a new framework where we train a split deep neural network (DNN) to directly output the BM given the channel state information (CSI) matrix as input. We formulate and solve a bottleneck optimization problem (BOP) to keep computation, airtime overhead, and bit error rate (BER) below application requirements. We perform extensive experimental CSI collection with off-the-shelf Wi-Fi devices in two distinct environments and compare the performance of SplitBeam with the standard IEEE 802.11 algorithm for BM feedback and the state-of-the-art DNN-based approach LB-SciFi. Our experimental results show that SplitBeam reduces the beamforming feedback size and computational complexity by respectively up to 81% and 84% while maintaining BER within about 10^-3 of existing approaches. We also implement the SplitBeam DNNs on FPGA hardware to estimate the end-to-end BM reporting delay, and show that the latter is less than 10 milliseconds in the most complex scenario, which is the target channel sounding frequency in realistic multi-user MIMO scenarios.

24.8ROApr 8Code

CADENCE: Context-Adaptive Depth Estimation for Navigation and Computational Efficiency

Timothy K Johnsen, Marco Levorato

Autonomous vehicles deployed in remote environments typically rely on embedded processors, compact batteries, and lightweight sensors. These hardware limitations conflict with the need to derive robust representations of the environment, which often requires executing computationally intensive deep neural networks for perception. To address this challenge, we present CADENCE, an adaptive system that dynamically scales the computational complexity of a slimmable monocular depth estimation network in response to navigation needs and environmental context. By closing the loop between perception fidelity and actuation requirements, CADENCE ensures high-precision computing is only used when mission-critical. We conduct evaluations on our released open-source testbed that integrates Microsoft AirSim with an NVIDIA Jetson Orin Nano. As compared to a state-of-the-art static approach, CADENCE decreases sensor acquisitions, power consumption, and inference latency by 9.67%, 16.1%, and 74.8%, respectively. The results demonstrate an overall reduction in energy expenditure by 75.0%, along with an increase in navigation accuracy by 7.43%.

CVJul 31, 2020Code

Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks

Yoshitomo Matsubara, Marco Levorato

The edge computing paradigm places compute-capable devices - edge servers - at the network edge to assist mobile devices in executing data analysis tasks. Intuitively, offloading compute-intense tasks to edge servers can reduce their execution time. However, poor conditions of the wireless channel connecting the mobile devices to the edge servers may degrade the overall capture-to-output delay achieved by edge offloading. Herein, we focus on edge computing supporting remote object detection by means of Deep Neural Networks (DNNs), and develop a framework to reduce the amount of data transmitted over the wireless link. The core idea we propose builds on recent approaches splitting DNNs into sections - namely head and tail models - executed by the mobile device and edge server, respectively. The wireless link, then, is used to transport the output of the last layer of the head model to the edge server, instead of the DNN input. Most prior work focuses on classification tasks and leaves the DNN structure unaltered. Herein, our focus is on DNNs for three different object detection tasks, which present a much more convoluted structure, and modify the architecture of the network to: (i) achieve in-network compression by introducing a bottleneck layer in the early layers on the head model, and (ii) prefilter pictures that do not contain objects of interest using a convolutional neural network. Results show that the proposed technique represents an effective intermediate option between local and edge computing in a parameter region where these extreme point solutions fail to provide satisfactory performance. The code and trained models are available at https://github.com/yoshitomo-matsubara/hnd-ghnd-object-detectors .

CVJul 27, 2020Code

Split Computing for Complex Object Detectors: Challenges and Preliminary Results

Yoshitomo Matsubara, Marco Levorato

Following the trends of mobile and edge computing for DNN models, an intermediate option, split computing, has been attracting attentions from the research community. Previous studies empirically showed that while mobile and edge computing often would be the best options in terms of total inference time, there are some scenarios where split computing methods can achieve shorter inference time. All the proposed split computing approaches, however, focus on image classification tasks, and most are assessed with small datasets that are far from the practical scenarios. In this paper, we discuss the challenges in developing split computing methods for powerful R-CNN object detectors trained on a large dataset, COCO 2017. We extensively analyze the object detectors in terms of layer-wise tensor size and model size, and show that naive split computing methods would not reduce inference time. To the best of our knowledge, this is the first study to inject small bottlenecks to such object detectors and unveil the potential of a split computing approach. The source code and trained models' weights used in this study are available at https://github.com/yoshitomo-matsubara/hnd-ghnd-object-detectors .

SPDec 14, 2023

Context-Aware Stress Monitoring using Wearable and Mobile Technologies in Everyday Settings

Seyed Amir Hossein Aqajari, Sina Labbaf, Phuc Hoang Tran et al.

Daily monitoring of stress is a critical component of maintaining optimal physical and mental health. Physiological signals and contextual information have recently emerged as promising indicators for detecting instances of heightened stress. Nonetheless, developing a real-time monitoring system that utilizes both physiological and contextual data to anticipate stress levels in everyday settings while also gathering stress labels from participants represents a significant challenge. We present a monitoring system that objectively tracks daily stress levels by utilizing both physiological and contextual data in a daily-life environment. Additionally, we have integrated a smart labeling approach to optimize the ecological momentary assessment (EMA) collection, which is required for building machine learning models for stress detection. We propose a three-tier Internet-of-Things-based system architecture to address the challenges. We utilized a cross-validation technique to accurately estimate the performance of our stress models. We achieved the F1-score of 70\% with a Random Forest classifier using both PPG and contextual data, which is considered an acceptable score in models built for everyday settings. Whereas using PPG data alone, the highest F1-score achieved is approximately 56\%, emphasizing the significance of incorporating both PPG and contextual data in stress detection tasks.

AIOct 22, 2024

Resource-Efficient Sensor Fusion via System-Wide Dynamic Gated Neural Networks

Chetna Singhal, Yashuo Wu, Francesco Malandrino et al.

Mobile systems will have to support multiple AI-based applications, each leveraging heterogeneous data sources through DNN architectures collaboratively executed within the network. To minimize the cost of the AI inference task subject to requirements on latency, quality, and - crucially - reliability of the inference process, it is vital to optimize (i) the set of sensors/data sources and (ii) the DNN architecture, (iii) the network nodes executing sections of the DNN, and (iv) the resources to use. To this end, we leverage dynamic gated neural networks with branches, and propose a novel algorithmic strategy called Quantile-constrained Inference (QIC), based upon quantile-Constrained policy optimization. QIC makes joint, high-quality, swift decisions on all the above aspects of the system, with the aim to minimize inference energy cost. We remark that this is the first contribution connecting gated dynamic DNNs with infrastructure-level decision making. We evaluate QIC using a dynamic gated DNN with stems and branches for optimal sensor fusion and inference, trained on the RADIATE dataset offering Radar, LiDAR, and Camera data, and real-world wireless measurements. Our results confirm that QIC matches the optimum and outperforms its alternatives by over 80%.

ROMay 16, 2024

NaviSlim: Adaptive Context-Aware Navigation and Sensing via Dynamic Slimmable Networks

Tim Johnsen, Marco Levorato

Small-scale autonomous airborne vehicles, such as micro-drones, are expected to be a central component of a broad spectrum of applications ranging from exploration to surveillance and delivery. This class of vehicles is characterized by severe constraints in computing power and energy reservoir, which impairs their ability to support the complex state-of-the-art neural models needed for autonomous operations. The main contribution of this paper is a new class of neural navigation models -- NaviSlim -- capable of adapting the amount of resources spent on computing and sensing in response to the current context (i.e., difficulty of the environment, current trajectory, and navigation goals). Specifically, NaviSlim is designed as a gated slimmable neural network architecture that, different from existing slimmable networks, can dynamically select a slimming factor to autonomously scale model complexity, which consequently optimizes execution time and energy consumption. Moreover, different from existing sensor fusion approaches, NaviSlim can dynamically select power levels of onboard sensors to autonomously reduce power and time spent during sensor acquisition, without the need to switch between different neural networks. By means of extensive training and testing on the robust simulation environment Microsoft AirSim, we evaluate our NaviSlim models on scenarios with varying difficulty and a test set that showed a dynamic reduced model complexity on average between 57-92%, and between 61-80% sensor utilization, as compared to static neural networks designed to match computing and sensing of that required by the most difficult scenario.

CVFeb 22, 2024

Distributed Radiance Fields for Edge Video Compression and Metaverse Integration in Autonomous Driving

Eugen Šlapak, Matúš Dopiriak, Mohammad Abdullah Al Faruque et al.

The metaverse is a virtual space that combines physical and digital elements, creating immersive and connected digital worlds. For autonomous mobility, it enables new possibilities with edge computing and digital twins (DTs) that offer virtual prototyping, prediction, and more. DTs can be created with 3D scene reconstruction methods that capture the real world's geometry, appearance, and dynamics. However, sending data for real-time DT updates in the metaverse, such as camera images and videos from connected autonomous vehicles (CAVs) to edge servers, can increase network congestion, costs, and latency, affecting metaverse services. Herein, a new method is proposed based on distributed radiance fields (RFs), multi-access edge computing (MEC) network for video compression and metaverse DT updates. RF-based encoder and decoder are used to create and restore representations of camera images. The method is evaluated on a dataset of camera images from the CARLA simulator. Data savings of up to 80% were achieved for H.264 I-frame - P-frame pairs by using RFs instead of I-frames, while maintaining high peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) qualitative metrics for the reconstructed images. Possible uses and challenges for the metaverse and autonomous mobility are also discussed.

CVJan 2, 2025

A Multi-task Supervised Compression Model for Split Computing

Yoshitomo Matsubara, Matteo Mendula, Marco Levorato

Split computing ($\neq$ split learning) is a promising approach to deep learning models for resource-constrained edge computing systems, where weak sensor (mobile) devices are wirelessly connected to stronger edge servers through channels with limited communication capacity. State-of-theart work on split computing presents methods for single tasks such as image classification, object detection, or semantic segmentation. The application of existing methods to multitask problems degrades model accuracy and/or significantly increase runtime latency. In this study, we propose Ladon, the first multi-task-head supervised compression model for multi-task split computing. Experimental results show that the multi-task supervised compression model either outperformed or rivaled strong lightweight baseline models in terms of predictive performance for ILSVRC 2012, COCO 2017, and PASCAL VOC 2012 datasets while learning compressed representations at its early layers. Furthermore, our models reduced end-to-end latency (by up to 95.4%) and energy consumption of mobile devices (by up to 88.2%) in multi-task split computing scenarios.

ROJun 18, 2024

NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation

Timothy K Johnsen, Ian Harshbarger, Zixia Xia et al.

Lightweight autonomous unmanned aerial vehicles (UAV) are emerging as a central component of a broad range of applications. However, autonomous navigation necessitates the implementation of perception algorithms, often deep neural networks (DNN), that process the input of sensor observations, such as that from cameras and LiDARs, for control logic. The complexity of such algorithms clashes with the severe constraints of these devices in terms of computing power, energy, memory, and execution time. In this paper, we propose NaviSplit, the first instance of a lightweight navigation framework embedding a distributed and dynamic multi-branched neural model. At its core is a DNN split at a compression point, resulting in two model parts: (1) the head model, that is executed at the vehicle, which partially processes and compacts perception from sensors; and (2) the tail model, that is executed at an interconnected compute-capable device, which processes the remainder of the compacted perception and infers navigation commands. Different from prior work, the NaviSplit framework includes a neural gate that dynamically selects a specific head model to minimize channel usage while efficiently supporting the navigation network. In our implementation, the perception model extracts a 2D depth map from a monocular RGB image captured by the drone using the robust simulator Microsoft AirSim. Our results demonstrate that the NaviSplit depth model achieves an extraction accuracy of 72-81% while transmitting an extremely small amount of data (1.2-18 KB) to the edge server. When using the neural gate, as utilized by NaviSplit, we obtain a slightly higher navigation accuracy as compared to a larger static network by 0.3% while significantly reducing the data rate by 95%. To the best of our knowledge, this is the first exemplar of dynamic multi-branched model based on split DNNs for autonomous navigation.

LGFeb 22, 2024

Dependable Distributed Training of Compressed Machine Learning Models

Francesco Malandrino, Giuseppe Di Giacomo, Marco Levorato et al.

The existing work on the distributed training of machine learning (ML) models has consistently overlooked the distribution of the achieved learning quality, focusing instead on its average value. This leads to a poor dependability}of the resulting ML models, whose performance may be much worse than expected. We fill this gap by proposing DepL, a framework for dependable learning orchestration, able to make high-quality, efficient decisions on (i) the data to leverage for learning, (ii) the models to use and when to switch among them, and (iii) the clusters of nodes, and the resources thereof, to exploit. For concreteness, we consider as possible available models a full DNN and its compressed versions. Unlike previous studies, DepL guarantees that a target learning quality is reached with a target probability, while keeping the training cost at a minimum. We prove that DepL has constant competitive ratio and polynomial complexity, and show that it outperforms the state-of-the-art by over 27% and closely matches the optimum.

DCJan 11, 2022

SmartDet: Context-Aware Dynamic Control of Edge Task Offloading for Mobile Object Detection

Davide Callegaro, Francesco Restuccia, Marco Levorato

Mobile devices increasingly rely on object detection (OD) through deep neural networks (DNNs) to perform critical tasks. Due to their high complexity, the execution of these DNNs requires excessive time and energy. Low-complexity object tracking (OT) can be used with OD, where the latter is periodically applied to generate "fresh" references for tracking. However, the frames processed with OD incur large delays, which may make the reference outdated and degrade tracking quality. Herein, we propose to use edge computing in this context, and establish parallel OT (at the mobile device) and OD (at the edge server) processes that are resilient to large OD latency. We propose Katch-Up, a novel tracking mechanism that improves the system resilience to excessive OD delay. However, while Katch-Up significantly improves performance, it also increases the computing load of the mobile device. Hence, we design SmartDet, a low-complexity controller based on deep reinforcement learning (DRL) that learns controlling the trade-off between resource utilization and OD performance. SmartDet takes as input context-related information related to the current video content and the current network conditions to optimize frequency and type of OD offloading, as well as Katch-Up utilization. We extensively evaluate SmartDet on a real-world testbed composed of a JetSon Nano as mobile device and a GTX 980 Ti as edge server, connected through a Wi-Fi link. Experimental results show that SmartDet achieves an optimal balance between tracking performance - mean Average Recall (mAR) and resource usage. With respect to a baseline with full Katch-Upusage and maximum channel usage, we still increase mAR by 4% while using 50% less of the channel and 30% power resources associated with Katch-Up. With respect to a fixed strategy using minimal resources, we increase mAR by 20% while using Katch-Up on 1/3 of the frames.

LGJan 7, 2022

BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing

Yoshitomo Matsubara, Davide Callegaro, Sameer Singh et al.

Although mission-critical applications require the use of deep neural networks (DNNs), their continuous execution at mobile devices results in a significant increase in energy consumption. While edge offloading can decrease energy consumption, erratic patterns in channel quality, network and edge server load can lead to severe disruption of the system's key operations. An alternative approach, called split computing, generates compressed representations within the model (called "bottlenecks"), to reduce bandwidth usage and energy consumption. Prior work has proposed approaches that introduce additional layers, to the detriment of energy consumption and latency. For this reason, we propose a new framework called BottleFit, which, in addition to targeted DNN architecture modifications, includes a novel training strategy to achieve high accuracy even with strong compression rates. We apply BottleFit on cutting-edge DNN models in image classification, and show that BottleFit achieves 77.1% data compression with up to 0.6% accuracy loss on ImageNet dataset, while state of the art such as SPINN loses up to 6% in accuracy. We experimentally measure the power consumption and latency of an image classification application running on an NVIDIA Jetson Nano board (GPU-based) and a Raspberry PI board (GPU-less). We show that BottleFit decreases power consumption and latency respectively by up to 49% and 89% with respect to (w.r.t.) local computing and by 37% and 55% w.r.t. edge offloading. We also compare BottleFit with state-of-the-art autoencoders-based approaches, and show that (i) BottleFit reduces power consumption and execution time respectively by up to 54% and 44% on the Jetson and 40% and 62% on Raspberry PI; (ii) the size of the head model executed on the mobile device is 83 times smaller. We publish the code repository for reproducibility of the results in this study.

LGDec 7, 2021

Federated Deep Reinforcement Learning for the Distributed Control of NextG Wireless Networks

Peyman Tehrani, Francesco Restuccia, Marco Levorato

Next Generation (NextG) networks are expected to support demanding tactile internet applications such as augmented reality and connected autonomous vehicles. Whereas recent innovations bring the promise of larger link capacity, their sensitivity to the environment and erratic performance defy traditional model-based control rationales. Zero-touch data-driven approaches can improve the ability of the network to adapt to the current operating conditions. Tools such as reinforcement learning (RL) algorithms can build optimal control policy solely based on a history of observations. Specifically, deep RL (DRL), which uses a deep neural network (DNN) as a predictor, has been shown to achieve good performance even in complex environments and with high dimensional inputs. However, the training of DRL models require a large amount of data, which may limit its adaptability to ever-evolving statistics of the underlying environment. Moreover, wireless networks are inherently distributed systems, where centralized DRL approaches would require excessive data exchange, while fully distributed approaches may result in slower convergence rates and performance degradation. In this paper, to address these challenges, we propose a federated learning (FL) approach to DRL, which we refer to federated DRL (F-DRL), where base stations (BS) collaboratively train the embedded DNN by only sharing models' weights rather than training data. We evaluate two distinct versions of F-DRL, value and policy based, and show the superior performance they achieve compared to distributed and centralized DRL.

CVNov 15, 2021

Spatio-Temporal Split Learning for Autonomous Aerial Surveillance using Urban Air Mobility (UAM) Networks

Yoo Jeong Ha, Soyi Jung, Jae-Hyun Kim et al.

Autonomous surveillance unmanned aerial vehicles (UAVs) are deployed to observe the streets of the city for any suspicious activities. This paper utilizes surveillance UAVs for the purpose of detecting the presence of a fire in the streets. An extensive database is collected from UAV surveillance drones. With the aid of artificial intelligence (AI), fire stations can swiftly identify the presence of a fire emerging in the neighborhood. Spatio-temporal split learning is applied to this scenario to preserve privacy and globally train a fire classification model. Fires are hazardous natural disasters that can spread very quickly. Swift identification of fire is required to deploy firefighters to the scene. In order to do this, strong communication between the UAV and the central server where the deep learning process occurs is required. Improving communication resilience is integral to enhancing a safe experience on the roads. Therefore, this paper explores the adequate number of clients and data ratios for split learning in this UAV setting, as well as the required network infrastructure.

CVAug 21, 2021

Supervised Compression for Resource-Constrained Edge Computing Systems

Yoshitomo Matsubara, Ruihan Yang, Marco Levorato et al.

There has been much interest in deploying deep learning algorithms on low-powered devices, including smartphones, drones, and medical sensors. However, full-scale deep neural networks are often too resource-intensive in terms of energy and storage. As a result, the bulk part of the machine learning operation is therefore often carried out on an edge server, where the data is compressed and transmitted. However, compressing data (such as images) leads to transmitting information irrelevant to the supervised task. Another popular approach is to split the deep network between the device and the server while compressing intermediate features. To date, however, such split computing strategies have barely outperformed the aforementioned naive data compression baselines due to their inefficient approaches to feature compression. This paper adopts ideas from knowledge distillation and neural image compression to compress intermediate feature representations more efficiently. Our supervised compression approach uses a teacher model and a student model with a stochastic bottleneck and learnable prior for entropy coding (Entropic Student). We compare our approach to various neural image and feature compression baselines in three vision tasks and found that it achieves better supervised rate-distortion performance while maintaining smaller end-to-end latency. We furthermore show that the learned feature representations can be tuned to serve multiple downstream tasks.

LGJul 31, 2021

Personalized Stress Monitoring using Wearable Sensors in Everyday Settings

Ali Tazarv, Sina Labbaf, Stephanie M. Reich et al.

Since stress contributes to a broad range of mental and physical health problems, the objective assessment of stress is essential for behavioral and physiological studies. Although several studies have evaluated stress levels in controlled settings, objective stress assessment in everyday settings is still largely under-explored due to challenges arising from confounding contextual factors and limited adherence for self-reports. In this paper, we explore the objective prediction of stress levels in everyday settings based on heart rate (HR) and heart rate variability (HRV) captured via low-cost and easy-to-wear photoplethysmography (PPG) sensors that are widely available on newer smart wearable devices. We present a layered system architecture for personalized stress monitoring that supports a tunable collection of data samples for labeling, and present a method for selecting informative samples from the stream of real-time data for labeling. We captured the stress levels of fourteen volunteers through self-reported questionnaires over periods of between 1-3 months, and explored binary stress detection based on HR and HRV using Machine Learning Methods. We observe promising preliminary results given that the dataset is collected in the challenging environments of everyday settings. The binary stress detector is fairly accurate and can detect stressful vs non-stressful samples with a macro-F1 score of up to \%76. Our study lays the groundwork for more sophisticated labeling strategies that generate context-aware, personalized models that will empower health professionals to provide personalized interventions.

LGJul 30, 2021

A Deep Learning Approach to Predict Blood Pressure from PPG Signals

Ali Tazarv, Marco Levorato

Blood Pressure (BP) is one of the four primary vital signs indicating the status of the body's vital (life-sustaining) functions. BP is difficult to continuously monitor using a sphygmomanometer (i.e. a blood pressure cuff), especially in everyday-setting. However, other health signals which can be easily and continuously acquired, such as photoplethysmography (PPG), show some similarities with the Aortic Pressure waveform. Based on these similarities, in recent years several methods were proposed to predict BP from the PPG signal. Building on these results, we propose an advanced personalized data-driven approach that uses a three-layer deep neural network to estimate BP based on PPG signals. Different from previous work, the proposed model analyzes the PPG signal in time-domain and automatically extracts the most critical features for this specific application, then uses a variation of recurrent neural networks called Long-Short-Term-Memory (LSTM) to map the extracted features to the BP value associated with that time window. Experimental results on two separate standard hospital datasets, yielded absolute errors mean and absolute error standard deviation for systolic and diastolic BP values outperforming prior works.

SPMar 8, 2021

Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia

Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others. However, continuously executing the entire DNN on mobile devices can quickly deplete their battery. Although task offloading to cloud/edge servers may decrease the mobile device's computational burden, erratic patterns in channel quality, network, and edge server load can lead to a significant delay in task execution. Recently, approaches based on split computing (SC) have been proposed, where the DNN is split into a head and a tail model, executed respectively on the mobile device and on the edge server. Ultimately, this may reduce bandwidth usage as well as energy consumption. Another approach, called early exiting (EE), trains models to embed multiple "exits" earlier in the architecture, each providing increasingly higher target accuracy. Therefore, the trade-off between accuracy and delay can be tuned according to the current conditions or application demands. In this paper, we provide a comprehensive survey of the state of the art in SC and EE strategies by presenting a comparison of the most relevant approaches. We conclude the paper by providing a set of compelling research challenges.

LGSep 15, 2020

Frequency-based Multi Task learning With Attention Mechanism for Fault Detection In Power Systems

Peyman Tehrani, Marco Levorato

The prompt and accurate detection of faults and abnormalities in electric transmission lines is a critical challenge in smart grid systems. Existing methods mostly rely on model-based approaches, which may not capture all the aspects of these complex temporal series. Recently, the availability of data sets collected using advanced metering devices, such as Micro-Phasor Measurement units ($μ$ PMU), which provide measurements at microsecond timescale, boosted the development of data-driven methodologies. In this paper, we introduce a novel deep learning-based approach for fault detection and test it on a real data set, namely, the Kaggle platform for a partial discharge detection task. Our solution adopts a Long-Short Term Memory architecture with attention mechanism to extract time series features, and uses a 1D-Convolutional Neural Network structure to exploit frequency information of the signal for prediction. Additionally, we propose an unsupervised method to cluster signals based on their frequency components, and apply multi task learning on different clusters. The method we propose outperforms the winner solutions in the Kaggle competition and other state of the art methods in many performance metrics, and improves the interpretability of analysis.

SPJul 27, 2019

Optimizing Energy Efficiency of Wearable Sensors Using Fog-assisted Control

Delaram Amiri, Arman Anzanpour, Iman Azimi et al.

Recent advances in the Internet of Things (IoT) technologies have enabled the use of wearables for remote patient monitoring. Wearable sensors capture the patient's vital signs, and provide alerts or diagnosis based on the collected data. Unfortunately, wearables typically have limited energy and computational capacity, making their use challenging for healthcare applications where monitoring must continue uninterrupted long time, without the need to charge or change the battery. Fog computing can alleviate this problem by offloading computationally intensive tasks from the sensor layer to higher layers, thereby not only meeting the sensors' limited computational capacity but also enabling the use of local closed-loop energy optimization algorithms to increase the battery life.