Chenyang Lu

CV
h-index: 29
Papers: 26
Citations: 471
Novelty: 49%
AI Score: 49

26 Papers

DC · Jul 31, 2022 · Code
Adaptive Edge Offloading for Image Classification Under Rate Limit

Jiaming Qiu, Ruiqi Wang, Ayan Chakrabarti et al.

This paper considers a setting where embedded devices are used to acquire and classify images. Because of limited computing capacity, embedded devices rely on a parsimonious classification model with uneven accuracy. When local classification is deemed inaccurate, devices can decide to offload the image to an edge server with a more accurate but resource-intensive model. Resource constraints, e.g., network bandwidth, however, require regulating such transmissions to avoid congestion and high latency. The paper investigates this offloading problem when transmissions are regulated by a token bucket, a mechanism commonly used for such purposes. The goal is to devise a lightweight, online offloading policy that optimizes an application-specific metric (e.g., classification accuracy) under the constraints of the token bucket. The paper develops a policy based on a Deep Q-Network (DQN), and demonstrates both its efficacy and the feasibility of its deployment on embedded devices. Of note is the fact that the policy can handle complex input patterns, including correlation in image arrivals and classification accuracy. The evaluation is carried out by performing image classification over a local testbed using synthetic traces generated from the ImageNet image classification benchmark. Implementation of this work is available at https://github.com/qiujiaming315/edgeml-dqn.

SY · May 4, 2017
Holistic Cyber-Physical Management for Dependable Wireless Control Systems

Yehan Ma, Dolvara Gunatilaka, Bo Li et al.

Wireless sensor-actuator networks (WSANs) are gaining momentum in industrial process automation as a communication infrastructure for lowering deployment and maintenance costs. In traditional wireless control systems the plant controller and the network manager operate in isolation, which ignores the significant influence of network reliability on plant control performance. To enhance the dependability of industrial wireless control, we propose a holistic cyber-physical management framework that employs run-time coordination between the plant control and network management. Our design includes a holistic controller that generates actuation signals to physical plants and reconfigures the WSAN to maintain desired control performance while saving wireless resources. As a concrete example of holistic control, we design a holistic manager that dynamically reconfigures the number of transmissions in the WSAN based on online observations of physical and cyber variables. We have implemented the holistic management framework in the Wireless Cyber-Physical Simulator (WCPS). A systematic case study has been presented based on two 5-state plants sharing a 16-node WSAN. Simulation results show that the holistic management design has significantly enhanced the resilience of the system against both wireless interference and physical disturbances, while effectively reducing the number of wireless transmissions.

LG · May 24, 2022
HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records

Hanyang Liu, Sunny S. Lou, Benjamin C. Warner et al.

Burnout is a significant public health concern affecting nearly half of the healthcare workforce. This paper presents the first end-to-end deep learning framework for predicting physician burnout based on electronic health record (EHR) activity logs, digital traces of physician work activities that are available in any EHR system. In contrast to prior approaches that exclusively relied on surveys for burnout measurement, our framework directly learns deep representations of physician behaviors from large-scale clinician activity logs to predict burnout. We propose the Hierarchical burnout Prediction based on Activity Logs (HiPAL), featuring a pre-trained time-dependent activity embedding mechanism tailored for activity logs and a hierarchical predictive model, which mirrors the natural hierarchical structure of clinician activity logs and captures physicians' evolving burnout risk at both short-term and long-term levels. To utilize the large amount of unlabeled activity logs, we propose a semi-supervised framework that learns to transfer knowledge extracted from unlabeled clinician activities to the HiPAL-based prediction model. The experiment on over 15 million clinician activity logs collected from the EHR at a large academic medical center demonstrates the advantages of our proposed framework in predictive performance of physician burnout and training efficiency over state-of-the-art approaches.

CV · Jun 3, 2023
Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers

Chenyang Lu, Daan de Geus, Gijs Dubbelman

This paper introduces Content-aware Token Sharing (CTS), a token reduction approach that improves the computational efficiency of semantic segmentation networks that use Vision Transformers (ViTs). Existing works have proposed token reduction approaches to improve the efficiency of ViT-based image classification networks, but these methods are not directly applicable to semantic segmentation, which we address in this work. We observe that, for semantic segmentation, multiple image patches can share a token if they contain the same semantic class, as they contain redundant information. Our approach leverages this by employing an efficient, class-agnostic policy network that predicts if image patches contain the same semantic class, and lets them share a token if they do. With experiments, we explore the critical design choices of CTS and show its effectiveness on the ADE20K, Pascal Context and Cityscapes datasets, various ViT backbones, and different segmentation decoders. With Content-aware Token Sharing, we are able to reduce the number of processed tokens by up to 44%, without diminishing the segmentation quality.

CV · Sep 9, 2024
Real-Time Human Action Recognition on Embedded Platforms

Ruiqi Wang, Zichen Wang, Peiqi Gao et al.

With advancements in computer vision and deep learning, video-based human action recognition (HAR) has become practical. However, due to the complexity of the computation pipeline, running HAR on live video streams incurs excessive delays on embedded platforms. This work tackles the real-time performance challenges of HAR with four contributions: 1) an experimental study identifying a standard Optical Flow (OF) extraction technique as the latency bottleneck in a state-of-the-art HAR pipeline, 2) an exploration of the latency-accuracy tradeoff between the standard and deep learning approaches to OF extraction, which highlights the need for a novel, efficient motion feature extractor, 3) the design of Integrated Motion Feature Extractor (IMFE), a novel single-shot neural network architecture for motion feature extraction with drastic improvement in latency, 4) the development of RT-HARE, a real-time HAR system tailored for embedded platforms. Experimental results on an Nvidia Jetson Xavier NX platform demonstrated that RT-HARE realizes real-time HAR at a video frame rate of 30 frames per second while delivering high levels of recognition accuracy.

LG · Oct 10, 2022
Self-explaining Hierarchical Model for Intraoperative Time Series

Dingwen Li, Bing Xue, Christopher King et al.

Major postoperative complications are devastating to surgical patients. Some of these complications are potentially preventable via early predictions based on intraoperative data. However, intraoperative data comprise long and fine-grained multivariate time series, prohibiting the effective learning of accurate models. The large gaps associated with clinical events and protocols are usually ignored. Moreover, deep models generally lack transparency. Nevertheless, the interpretability is crucial to assist clinicians in planning for and delivering postoperative care and timely interventions. Towards this end, we propose a hierarchical model combining the strength of both attention and recurrent models for intraoperative time series. We further develop an explanation module for the hierarchical model to interpret the predictions by providing contributions of intraoperative data in a fine-grained manner. Experiments on a large dataset of 111,888 surgeries with multiple outcomes and an external high-resolution ICU dataset show that our model can achieve strong predictive performance (i.e., high accuracy) and offer robust interpretations (i.e., high transparency) for predicted outcomes based on intraoperative time series.

CV · Dec 7, 2025 · Code
SparseCoop: Cooperative Perception with Kinematic-Grounded Queries

Jiahao Wang, Zhongwei Jiang, Wenchao Sun et al.

Cooperative perception is critical for autonomous driving, overcoming the inherent limitations of a single vehicle, such as occlusions and constrained fields-of-view. However, current approaches sharing dense Bird's-Eye-View (BEV) features are constrained by quadratically-scaling communication costs and the lack of flexibility and interpretability for precise alignment across asynchronous or disparate viewpoints. While emerging sparse query-based methods offer an alternative, they often suffer from inadequate geometric representations, suboptimal fusion strategies, and training instability. In this paper, we propose SparseCoop, a fully sparse cooperative perception framework for 3D detection and tracking that completely discards intermediate BEV representations. Our framework features a trio of innovations: a kinematic-grounded instance query that uses an explicit state vector with 3D geometry and velocity for precise spatio-temporal alignment; a coarse-to-fine aggregation module for robust fusion; and a cooperative instance denoising task to accelerate and stabilize training. Experiments on V2X-Seq and Griffin datasets show SparseCoop achieves state-of-the-art performance. Notably, it delivers this with superior computational efficiency, low transmission cost, and strong robustness to communication latency. Code is available at https://github.com/wang-jh18-SVM/SparseCoop.

CL · Aug 19, 2023
Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection

Benjamin C. Warner, Ziqi Xu, Simon Haroutounian et al.

Survey data can contain a high number of features while having a comparatively low quantity of examples. Machine learning models that attempt to predict outcomes from survey data under these conditions can overfit and result in poor generalizability. One remedy to this issue is feature selection, which attempts to select an optimal subset of features to learn upon. A relatively unexplored source of information in the feature selection process is the usage of textual names of features, which may be semantically indicative of which features are relevant to a target outcome. The relationships between feature names and target names can be evaluated using language models (LMs) to produce semantic textual similarity (STS) scores, which can then be used to select features. We examine the performance using STS to select features directly and in the minimal-redundancy-maximal-relevance (mRMR) algorithm. The performance of STS as a feature selection metric is evaluated against preliminary survey data collected as a part of a clinical study on persistent post-surgical pain (PPSP). The results suggest that features selected with STS can result in higher performance models compared to traditional feature selection algorithms.
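The name-based selection idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' method: the abstract uses language models to produce STS scores, whereas this sketch swaps in a simple token-overlap (Jaccard) similarity so that it runs without any model. The function names and the example feature names are hypothetical.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Stand-in for an LM-based STS score: token overlap between two names."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def select_features_by_name(feature_names, target_name, k,
                            similarity=jaccard_similarity):
    """Rank features by name similarity to the target and keep the top k."""
    ranked = sorted(feature_names,
                    key=lambda name: similarity(name, target_name),
                    reverse=True)
    return ranked[:k]


# Hypothetical survey feature names, for illustration only.
features = ["pain intensity after surgery", "favorite color",
            "chronic pain history", "shoe size"]
selected = select_features_by_name(features, "persistent pain after surgery", k=2)
```

Passing a different `similarity` callable (for example, cosine similarity over sentence embeddings) would bring the sketch closer to the LM-based scoring the abstract describes.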

LG · Jul 6, 2023
Assisting Clinical Decisions for Scarcely Available Treatment via Disentangled Latent Representation

Bing Xue, Ahmed Sameh Said, Ziqi Xu et al.

Extracorporeal membrane oxygenation (ECMO) is an essential life-supporting modality for COVID-19 patients who are refractory to conventional therapies. However, the proper treatment decision has been the subject of significant debate, and it remains controversial who benefits from this scarcely available and technically complex treatment option. To support clinical decisions, there is a critical need to predict the treatment need and the potential treatment and no-treatment responses. Targeting this clinical challenge, we propose Treatment Variational AutoEncoder (TVAE), a novel approach for individualized treatment analysis. TVAE is specifically designed to address the modeling challenges of treatments like ECMO, which have strong treatment selection bias and scarce treatment cases. TVAE conceptualizes the treatment decision as a multi-scale problem. We model a patient's potential treatment assignment and the factual and counterfactual outcomes as part of their intrinsic characteristics that can be represented by a deep latent variable model. The factual and counterfactual prediction errors are alleviated via a reconstruction regularization scheme together with semi-supervision, and the selection bias and the scarcity of treatment cases are mitigated by the disentangled and distribution-matched latent space and the label-balancing generative strategy. We evaluate TVAE on two real-world COVID-19 datasets: an international dataset collected from 1651 hospitals across 63 countries, and an institutional dataset collected from 15 hospitals. The results show that TVAE outperforms state-of-the-art treatment effect models in predicting both the propensity scores and factual outcomes on heterogeneous COVID-19 datasets. Additional experiments also show TVAE outperforms the best existing models in individual treatment effect estimation on the synthesized IHDP benchmark dataset.

CV · Mar 21, 2022
Self-Supervised Road Layout Parsing with Graph Auto-Encoding

Chenyang Lu, Gijs Dubbelman

Aiming for higher-level scene understanding, this work presents a neural network approach that takes a road-layout map in bird's-eye-view as input, and predicts a human-interpretable graph that represents the road's topological layout. Our approach elevates the understanding of road layouts from pixel level to the level of graphs. To achieve this goal, an image-graph-image auto-encoder is utilized. The network is designed to learn to regress the graph representation at its auto-encoder bottleneck. This learning is self-supervised by an image reconstruction loss, without needing any external manual annotations. We create a synthetic dataset containing common road layout patterns and use it for training of the auto-encoder in addition to the real-world Argoverse dataset. By using this additional synthetic dataset, which conceptually captures human knowledge of road layouts and makes this available to the network for training, we are able to stabilize and further improve the performance of topological road layout understanding on the real-world Argoverse dataset. The evaluation shows that our approach exhibits comparable performance to a strong fully-supervised baseline.

DC · Oct 8, 2023
Progressive Neural Compression for Adaptive Image Offloading under Timing Constraints

Ruiqi Wang, Hanyang Liu, Jiaming Qiu et al.

IoT devices are increasingly the source of data for machine learning (ML) applications running on edge servers. Data transmissions from devices to servers are often over local wireless networks whose bandwidth is not just limited but, more importantly, variable. Furthermore, in cyber-physical systems interacting with the physical environment, image offloading is also commonly subject to timing constraints. It is, therefore, important to develop an adaptive approach that maximizes the inference performance of ML applications under timing constraints and the resource constraints of IoT devices. In this paper, we use image classification as our target application and propose progressive neural compression (PNC) as an efficient solution to this problem. Although neural compression has been used to compress images for different ML applications, existing solutions often produce fixed-size outputs that are unsuitable for timing-constrained offloading over variable bandwidth. To address this limitation, we train a multi-objective rateless autoencoder that optimizes for multiple compression rates via stochastic taildrop to create a compression solution that produces features ordered according to their importance to inference performance. Features are then transmitted in that order based on available bandwidth, with classification ultimately performed using the (sub)set of features received by the deadline. We demonstrate the benefits of PNC over state-of-the-art neural compression approaches and traditional compression methods on a testbed comprising an IoT device and an edge server connected over a wireless network with varying bandwidth.
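The two mechanisms named above, tail dropping during training and prefix transmission under a deadline, can be sketched as follows. This is a simplified illustration under stated assumptions, not the PNC implementation: it assumes the tail length is drawn uniformly at random, that every feature has the same size, and that bandwidth is constant over the deadline. The function names and parameters are hypothetical.

```python
import random


def stochastic_taildrop(features, rng):
    """Training-time sketch: keep a random-length prefix and zero the tail.

    Assumption: the number of kept features is uniform over 1..len(features).
    """
    keep = rng.randint(1, len(features))  # inclusive on both ends
    return features[:keep] + [0.0] * (len(features) - keep), keep


def features_by_deadline(features, bytes_per_feature, bandwidth_bps, deadline_s):
    """Return the prefix of importance-ordered features that fits the budget."""
    budget_bytes = bandwidth_bps / 8 * deadline_s
    count = min(len(features), int(budget_bytes // bytes_per_feature))
    return features[:count]


# Features are assumed to be already ordered from most to least important.
encoded = [0.9, 0.5, 0.3, 0.1]
dropped, kept = stochastic_taildrop(encoded, random.Random(0))
received = features_by_deadline(encoded, bytes_per_feature=4,
                                bandwidth_bps=64, deadline_s=1.0)
```

Because training exposes the decoder to every prefix length, classification can proceed on whatever prefix `features_by_deadline` returns, which is the property the abstract relies on.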

CL · Apr 16
CURA: Clinical Uncertainty Risk Alignment for Language Model-Based Risk Prediction

Sizhe Wang, Ziqi Xu, Claire Najjuuko et al.

Clinical language models (LMs) are increasingly applied to support clinical risk prediction from free-text notes, yet their uncertainty estimates often remain poorly calibrated and clinically unreliable. In this work, we propose Clinical Uncertainty Risk Alignment (CURA), a framework that aligns clinical LM-based risk estimates and uncertainty with both individual error likelihoods and cohort-level ambiguities. CURA first fine-tunes domain-specific clinical LMs to obtain task-adapted patient embeddings, and then performs uncertainty fine-tuning of a multi-head classifier using a bi-level uncertainty objective. Specifically, an individual-level calibration term aligns predictive uncertainty with each patient's likelihood of error, while a cohort-aware regularizer pulls risk estimates toward event rates in their local neighborhoods in the embedding space and places extra weight on ambiguous cohorts near the decision boundary. We further show that this cohort-aware term can be interpreted as a cross-entropy loss with neighborhood-informed soft labels, providing a label-smoothing view of our method. Extensive experiments on MIMIC-IV clinical risk prediction tasks across various clinical LMs show that CURA consistently improves calibration metrics without substantially compromising discrimination. Further analysis illustrates that CURA reduces overconfident false reassurance and yields more trustworthy uncertainty estimates for downstream clinical decision support.

CV · Apr 16, 2020 · Code
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding

Panagiotis Meletis, Xiaoxiao Wen, Chenyang Lu et al.

In this technical report, we present two novel datasets for image scene understanding. Both datasets have annotations compatible with panoptic segmentation and additionally they have part-level labels for selected semantic classes. This report describes the format of the two datasets, the annotation protocols, the merging strategies, and presents the dataset statistics. The dataset labels together with code for processing and visualization will be published at https://github.com/tue-mps/panoptic_parts.

CV · Sep 10, 2019 · Code
Semantic Foreground Inpainting from Weak Supervision

Chenyang Lu, Gijs Dubbelman

Semantic scene understanding is an essential task for self-driving vehicles and mobile robots. In our work, we aim to estimate a semantic segmentation map, in which the foreground objects are removed and semantically inpainted with background classes, from a single RGB image. This semantic foreground inpainting task is performed by a single-stage convolutional neural network (CNN) that contains our novel max-pooling as inpainting (MPI) module, which is trained with weak supervision, i.e., it does not require manual background annotations for the foreground regions to be inpainted. Our approach is inherently more efficient than the previous two-stage state-of-the-art method, and outperforms it by a margin of 3% IoU for the inpainted foreground regions on Cityscapes. The performance margin increases to 6% IoU, when tested on the unseen KITTI dataset. The code and the manually annotated datasets for testing are shared with the research community at https://github.com/Chenyang-Lu/semantic-foreground-inpainting.

CR · Sep 11, 2024
SoK: Security and Privacy Risks of Healthcare AI

Yuanhaur Chang, Han Liu, Chenyang Lu et al.

The integration of artificial intelligence (AI) and machine learning (ML) into healthcare systems holds great promise for enhancing patient care and care delivery efficiency; however, it also exposes sensitive data and system integrity to potential cyberattacks. Current security and privacy (S&P) research on healthcare AI is highly unbalanced in terms of healthcare deployment scenarios and threat models, and has a disconnected focus with the biomedical research community. This hinders a comprehensive understanding of the risks that healthcare AI entails. To address this gap, this paper takes a thorough examination of existing healthcare AI S&P research, providing a unified framework that allows the identification of under-explored areas. Our survey presents a systematic overview of healthcare AI attacks and defenses, and points out challenges and research opportunities for each AI-driven healthcare application domain. Through our experimental analysis of different threat models and feasibility studies on under-explored adversarial attacks, we provide compelling insights into the pressing need for cybersecurity research in the rapidly evolving field of healthcare AI.

CL · Feb 27, 2024
The Foundational Capabilities of Large Language Models in Predicting Postoperative Risks Using Clinical Notes

Charles Alba, Bing Xue, Joanna Abraham et al.

Clinical notes recorded during a patient's perioperative journey hold immense informational value. Advances in large language models (LLMs) offer opportunities for bridging this gap. Using 84,875 pre-operative notes and their associated surgical cases from 2018 to 2021, we examine the performance of LLMs in predicting six postoperative risks using various fine-tuning strategies. Pretrained LLMs outperformed traditional word embeddings by an absolute AUROC of 38.3% and AUPRC of 33.2%. Self-supervised fine-tuning further improved performance by 3.2% and 1.5%. Incorporating labels into training further increased AUROC by 1.8% and AUPRC by 2%. The highest performance was achieved with a unified foundation model, with improvements of 3.6% for AUROC and 2.6% for AUPRC compared to self-supervision, highlighting the foundational capabilities of LLMs in predicting postoperative risks, which could be potentially beneficial when deployed for perioperative care.

LG · Dec 2, 2024
A Novel Generative Multi-Task Representation Learning Approach for Predicting Postoperative Complications in Cardiac Surgery Patients

Junbo Shen, Bing Xue, Thomas Kannampallil et al.

Early detection of surgical complications allows for timely therapy and proactive risk mitigation. Machine learning (ML) can be leveraged to identify and predict patient risks for postoperative complications. We developed and validated the effectiveness of predicting postoperative complications using a novel surgical Variational Autoencoder (surgVAE) that uncovers intrinsic patterns via cross-task and cross-cohort representation learning. This retrospective cohort study used data from the electronic health records of adult surgical patients over four years (2018 - 2021). Six key postoperative complications for cardiac surgery were assessed: acute kidney injury, atrial fibrillation, cardiac arrest, deep vein thrombosis or pulmonary embolism, blood transfusion, and other intraoperative cardiac events. We compared prediction performances of surgVAE against widely-used ML models and advanced representation learning and generative models under 5-fold cross-validation. 89,246 surgeries (49% male, median (IQR) age: 57 (45-69)) were included, with 6,502 in the targeted cardiac surgery cohort (61% male, median (IQR) age: 60 (53-70)). surgVAE demonstrated superior performance over existing ML solutions across all postoperative complications of cardiac surgery patients, achieving macro-averaged AUPRC of 0.409 and macro-averaged AUROC of 0.831, which were 3.4% and 3.7% higher, respectively, than the best alternative method (by AUPRC scores). Model interpretation using Integrated Gradients highlighted key risk factors based on preoperative variable importance. surgVAE showed excellent discriminatory performance for predicting postoperative complications and addressing the challenges of data complexity, small cohort sizes, and low-frequency positive events. surgVAE enables data-driven predictions of patient risks and prognosis while enhancing the interpretability of patient risk profiles.

CV · Apr 10
Long-SCOPE: Fully Sparse Long-Range Cooperative 3D Perception

Jiahao Wang, Zikun Xu, Yuner Zhang et al.

Cooperative 3D perception via Vehicle-to-Everything communication is a promising paradigm for enhancing autonomous driving, offering extended sensing horizons and occlusion resolution. However, the practical deployment of existing methods is hindered at long distances by two critical bottlenecks: the quadratic computational scaling of dense BEV representations and the fragility of feature association mechanisms under significant observation and alignment errors. To overcome these limitations, we introduce Long-SCOPE, a fully sparse framework designed for robust long-distance cooperative 3D perception. Our method features two novel components: a Geometry-guided Query Generation module to accurately detect small, distant objects, and a learnable Context-Aware Association module that robustly matches cooperative queries despite severe positional noise. Experiments on the V2X-Seq and Griffin datasets validate that Long-SCOPE achieves state-of-the-art performance, particularly in challenging 100-150 m long-range settings, while maintaining highly competitive computation and communication costs.

DC · Oct 24, 2024
Optimizing Edge Offloading Decisions for Object Detection

Jiaming Qiu, Ruiqi Wang, Brooks Hu et al.

Recent advances in machine learning and hardware have produced embedded devices capable of performing real-time object detection with commendable accuracy. We consider a scenario in which embedded devices rely on an onboard object detector, but have the option to offload detection to a more powerful edge server when local accuracy is deemed too low. Resource constraints, however, limit the number of images that can be offloaded to the edge. Our goal is to identify which images to offload to maximize overall detection accuracy under those constraints. To that end, the paper introduces a reward metric designed to quantify potential accuracy improvements from offloading individual images, and proposes an efficient approach to make offloading decisions by estimating this reward based only on local detection results. The approach is computationally frugal enough to run on embedded devices, and empirical findings indicate that it outperforms existing alternatives in improving detection accuracy even when the fraction of offloaded images is small.

CV · Jun 11, 2021
Part-aware Panoptic Segmentation

Daan de Geus, Panagiotis Meletis, Chenyang Lu et al.

In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing. For this novel task, we provide consistent annotations on two commonly used datasets: Cityscapes and Pascal VOC. Moreover, we present a single metric to evaluate PPS, called Part-aware Panoptic Quality (PartPQ). For this new task, using the metric and annotations, we set multiple baselines by merging results of existing state-of-the-art methods for panoptic segmentation and part segmentation. Finally, we conduct several experiments that evaluate the importance of the different levels of abstraction in this single task.

LG · Apr 30, 2021
Predicting Intraoperative Hypoxemia with Hybrid Inference Sequence Autoencoder Networks

Hanyang Liu, Michael C. Montana, Dingwen Li et al.

We present an end-to-end model using streaming physiological time series to predict near-term risk for hypoxemia, a rare, but life-threatening condition known to cause serious patient harm during surgery. Inspired by the fact that a hypoxemia event is defined based on a future sequence of low SpO2 (i.e., blood oxygen saturation) instances, we propose the hybrid inference network (hiNet) that makes hybrid inference on both future low SpO2 instances and hypoxemia outcomes. hiNet integrates 1) a joint sequence autoencoder that simultaneously optimizes a discriminative decoder for label prediction, and 2) two auxiliary decoders trained for data reconstruction and forecast, which seamlessly learn contextual latent representations that capture the transition from present states to future states. All decoders share a memory-based encoder that helps capture the global dynamics of patient measurement. For a large surgical cohort of 72,081 surgeries at a major academic medical center, our model outperforms strong baselines including the model used by the state-of-the-art hypoxemia prediction system. With its capability to make real-time predictions of near-term hypoxemia at clinically acceptable alarm rates, hiNet shows promise in improving clinical decision making and easing the burden of perioperative care.

CV · Dec 10, 2020
Image-Graph-Image Translation via Auto-Encoding

Chenyang Lu, Gijs Dubbelman

This work presents the first convolutional neural network that learns an image-to-graph translation task without needing external supervision. Obtaining graph representations of image content, where objects are represented as nodes and their relationships as edges, is an important task in scene understanding. Current approaches follow a fully-supervised approach thereby requiring meticulous annotations. To overcome this, we are the first to present a self-supervised approach based on a fully-differentiable auto-encoder in which the bottleneck encodes the graph's nodes and edges. This self-supervised approach can currently encode simple line drawings into graphs and obtains comparable results to a fully-supervised baseline in terms of F1 score on triplet matching. Besides these promising results, we provide several directions for future research on how our approach can be extended to cover more complex imagery.

LG · Oct 26, 2020
Real-Time Edge Classification: Optimal Offloading under Token Bucket Constraints

Ayan Chakrabarti, Roch Guérin, Chenyang Lu et al.

To deploy machine learning-based algorithms for real-time applications with strict latency constraints, we consider an edge-computing setting where a subset of inputs are offloaded to the edge for processing by an accurate but resource-intensive model, and the rest are processed only by a less-accurate model on the device itself. Both models have computational costs that match available compute resources, and process inputs with low latency. But offloading incurs network delays, and to manage these delays to meet application deadlines, we use a token bucket to constrain the average rate and burst length of transmissions from the device. We introduce a Markov Decision Process-based framework to make offload decisions under these constraints, based on the local model's confidence and the token bucket state, with the goal of minimizing a specified error measure for the application. Beyond isolated decisions for individual devices, we also propose approaches to allow multiple devices connected to the same access switch to share their bursting allocation. We evaluate and analyze the policies derived using our framework on the standard ImageNet image classification benchmark.
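To make the token bucket constraint concrete, the sketch below simulates a bucket whose refill rate bounds the average offload rate and whose capacity bounds the burst length. The decision rule is a deliberately simple confidence threshold, standing in for the MDP-derived policy described above; it is not that policy. The class, function, and parameter names are hypothetical, and unit-cost transmissions in discrete time steps are an assumption.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate: float      # tokens added per time step (bounds the average offload rate)
    capacity: float  # maximum stored tokens (bounds the burst length)
    tokens: float    # current token count

    def tick(self) -> None:
        """Advance one time step: refill, capped at capacity."""
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def try_consume(self, cost: float = 1.0) -> bool:
        """Spend tokens for one transmission if enough are available."""
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def offload_decisions(confidences, bucket, threshold=0.6):
    """Stand-in policy: offload when local confidence is low and a token exists."""
    decisions = []
    for confidence in confidences:
        # Short-circuit: tokens are only spent on low-confidence inputs.
        decisions.append(confidence < threshold and bucket.try_consume())
        bucket.tick()
    return decisions


bucket = TokenBucket(rate=0.25, capacity=2.0, tokens=2.0)
decisions = offload_decisions([0.2, 0.3, 0.4, 0.9, 0.1], bucket)
```

In this run the third input has low confidence but is handled locally because the bucket is empty, which is the situation a state-aware policy would try to anticipate by saving tokens for the inputs that benefit most.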

CV · Jul 23, 2019
Hallucinating Beyond Observation: Learning to Complete with Partial Observation and Unpaired Prior Knowledge

Chenyang Lu, Gijs Dubbelman

We propose a novel single-step training strategy that allows convolutional encoder-decoder networks that use skip connections, to complete partially observed data by means of hallucination. This strategy is demonstrated for the task of completing 2-D road layouts as well as 3-D vehicle shapes. As input, it takes data from a partially observed domain, for which no ground truth is available, and data from an unpaired prior knowledge domain and trains the network in an end-to-end manner. Our single-step training strategy is compared against two state-of-the-art baselines, one using a two-step auto-encoder training strategy and one using an adversarial strategy. Our novel strategy achieves an improvement up to +12.2% F-measure on the Cityscapes dataset. The learned network intrinsically generalizes better than the baselines on unseen datasets, which is demonstrated by an improvement up to +23.8% F-measure on the unseen KITTI dataset. Moreover, our approach outperforms the baselines using the same backbone network on the 3-D shape completion benchmark by a margin of 0.006 Hamming distance.

ROApr 6, 2018
Monocular Semantic Occupancy Grid Mapping with Convolutional Variational Encoder-Decoder Networks

Chenyang Lu, Marinus Jacobus Gerardus van de Molengraft, Gijs Dubbelman

In this work, we research and evaluate end-to-end learning of monocular semantic-metric occupancy grid mapping from weak binocular ground truth. The network learns to predict four classes, as well as a camera to bird's eye view mapping. At the core, it utilizes a variational encoder-decoder network that encodes the front-view visual information of the driving scene and subsequently decodes it into a 2-D top-view Cartesian coordinate system. The evaluations on Cityscapes show that the end-to-end learning of semantic-metric occupancy grids outperforms the deterministic mapping approach with flat-plane assumption by more than 12% mean IoU. Furthermore, we show that the variational sampling with a relatively small embedding vector brings robustness against vehicle dynamic perturbations, and generalizability for unseen KITTI data. Our network achieves real-time inference rates of approx. 35 Hz for an input image with a resolution of 256x512 pixels and an output map with 64x64 occupancy grid cells using a Titan V GPU.
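For readers unfamiliar with the mean IoU metric quoted in this abstract, the following is a minimal sketch of how it is commonly computed over the cells of a predicted and a reference grid. The function name `mean_iou` and the flattened-list input format are assumptions for illustration; the paper's exact evaluation protocol (for example, how unobserved cells are handled) may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are equal-length sequences of integer class labels,
    one per grid cell. Classes absent from both are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    # Return 0.0 when no class appears at all, to avoid dividing by zero.
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Class 0: IoU 1/2; class 1: IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```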

SYAug 24, 2017
Wireless Network Design for Control Systems: A Survey

Pangun Park, Sinem Coleri Ergen, Carlo Fischione et al.

Wireless networked control systems (WNCS) are composed of spatially distributed sensors, actuators, and controllers communicating through wireless networks instead of conventional point-to-point wired connections. Due to their main benefits in the reduction of deployment and maintenance costs, large flexibility and possible enhancement of safety, WNCS are becoming a fundamental infrastructure technology for critical control systems in automotive electrical systems, avionics control systems, building management systems, and industrial automation systems. The main challenge in WNCS is to jointly design the communication and control systems considering their tight interaction to improve the control performance and the network lifetime. In this survey, we make an exhaustive review of the literature on wireless network design and optimization for WNCS. First, we discuss what we call the critical interactive variables including sampling period, message delay, message dropout, and network energy consumption. The mutual effects of these communication and control variables motivate their joint tuning. We discuss the effect of controllable wireless network parameters at all layers of the communication protocols on the probability distribution of these interactive variables. We also review the current wireless network standardization for WNCS and their corresponding methodology for adapting the network parameters. Moreover, we discuss the analysis and design of control systems taking into account the effect of the interactive variables on the control system performance. Finally, we present the state-of-the-art wireless network design and optimization for WNCS, while highlighting the tradeoff between the achievable performance and complexity of various approaches. We conclude the survey by highlighting major research issues and identifying future research directions.