Brian Lee

CV
h-index: 25
12 papers
1,017 citations
Novelty: 34%
AI Score: 49

12 Papers

IV Nov 7, 2022
Realistic Bokeh Effect Rendering on Mobile GPUs, Mobile AI & AIM 2022 challenge: Report

Andrey Ignatov, Radu Timofte, Jin Zhang et al.

As mobile cameras with compact optics are unable to produce a strong bokeh effect, lots of interest is now devoted to deep learning-based solutions for this task. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based bokeh effect rendering approach that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale EBB! bokeh dataset consisting of 5K shallow / wide depth-of-field image pairs captured using the Canon 7D DSLR camera. The runtime of the resulting models was evaluated on the Kirin 9000's Mali GPU that provides excellent acceleration results for the majority of common deep learning ops. A detailed description of all models developed in this challenge is provided in this paper.

AI May 25
JobBench: Aligning Agent Work With Human Will

Yuetai Li, Yichen Feng, Zhangchen Xu et al.

Current benchmarks for occupational AI agents are scoped primarily by economic values, telling a replacement story. We introduce JobBench, which evaluates AI agents on the workflows that experts identify as high-priority for delegation, empowering humans based on their needs instead of replacing them with GDP value. JobBench covers 130 agentic tasks across 35 occupations. Each task is packaged as a workspace of heterogeneous reference files, requiring the agent to reason through the cluttered information streams of real professional work. Outputs are graded by a fact-anchored chain of rubrics, averaging 35.6 binary criteria per task. We evaluate 36 models; the strongest, Claude Opus 4.7 under Claude Code, reaches only 45.9%. We hope JobBench shifts the community's target labour-market effect from replacement to enhancement: building agents that do what humans actually want delegated, not only what is most economically valuable.

CV Apr 13
MapATM: Enhancing HD Map Construction through Actor Trajectory Modeling

Mingyang Li, Brian Lee, Rui Zuo et al.

High-definition (HD) mapping tasks, which perform lane detections and predictions, are extremely challenging due to non-ideal conditions such as view occlusions, distant lane visibility, and adverse weather conditions. Those conditions often result in compromised lane detection accuracy and reduced reliability within autonomous driving systems. To address these challenges, we introduce MapATM, a novel deep neural network that effectively leverages historical actor trajectory information to improve lane detection accuracy, where actors refer to moving vehicles. By utilizing actor trajectories as structural priors for road geometry, MapATM achieves substantial performance enhancements, notably increasing AP by 4.6 for lane dividers and mAP by 2.6 on the challenging NuScenes dataset, representing relative improvements of 10.1% and 6.1%, respectively, compared to strong baseline methods. Extensive qualitative evaluations further demonstrate MapATM's capability to consistently maintain stable and robust map reconstruction across diverse and complex driving scenarios, underscoring its practical value for autonomous driving applications.

NC Dec 28, 2025
Nonlinear Dynamical Modeling of Human Intracranial Brain Activity with Flexible Inference

Kiarash Vaziri, Lucine L. Oganesian, HyeongChan Jo et al.

Dynamical modeling of multisite human intracranial neural recordings is essential for developing neurotechnologies such as brain-computer interfaces (BCIs). Linear dynamical models are widely used for this purpose due to their interpretability and their suitability for BCIs. In particular, these models enable flexible real-time inference, even in the presence of missing neural samples, which often occur in wireless BCIs. However, neural activity can exhibit nonlinear structure that is not captured by linear models. Furthermore, while recurrent neural network models can capture nonlinearity, their inference does not directly address handling missing observations. To address this gap, recent work introduced DFINE, a deep learning framework that integrates neural networks with linear state-space models to capture nonlinearities while enabling flexible inference. However, DFINE was developed for intracortical recordings that measure localized neuronal populations. Here we extend DFINE to modeling of multisite human intracranial electroencephalography (iEEG) recordings. We find that DFINE significantly outperforms linear state-space models (LSSMs) in forecasting future neural activity. Furthermore, DFINE matches or exceeds the accuracy of a gated recurrent unit (GRU) model in neural forecasting, indicating that a linear dynamical backbone, when paired and jointly trained with nonlinear neural networks, can effectively describe the dynamics of iEEG signals while also enabling flexible inference. Additionally, DFINE handles missing observations more robustly than the baselines, demonstrating its flexible inference and utility for BCIs. Finally, DFINE's advantage over LSSM is more pronounced in high gamma spectral bands. Taken together, these findings highlight DFINE as a strong and flexible framework for modeling human iEEG dynamics, with potential applications in next-generation BCIs.

LG Oct 31, 2025
Panprediction: Optimal Predictions for Any Downstream Task and Loss

Sivaraman Balakrishnan, Nika Haghtalab, Daniel Hsu et al.

Supervised learning is classically formulated as training a model to minimize a fixed loss function over a fixed distribution, or task. However, an emerging paradigm instead views model training as extracting enough information from data so that the model can be used to minimize many losses on many downstream tasks. We formalize a mathematical framework for this paradigm, which we call panprediction, and study its statistical complexity. Formally, panprediction generalizes omniprediction and sits upstream from multi-group learning, which respectively focus on predictions that generalize to many downstream losses or many downstream tasks, but not both. Concretely, we design algorithms that learn deterministic and randomized panpredictors with $\tilde{O}(1/\varepsilon^3)$ and $\tilde{O}(1/\varepsilon^2)$ samples, respectively. Our results demonstrate that under mild assumptions, simultaneously minimizing infinitely many losses on infinitely many tasks can be as statistically easy as minimizing one loss on one task. Along the way, we improve the best known sample complexity guarantee of deterministic omniprediction by a factor of $1/\varepsilon$, and match all other known sample complexity guarantees of omniprediction and multi-group learning. Our key technical ingredient is a nearly lossless reduction from panprediction to a statistically efficient notion of calibration, called step calibration.

CV Aug 25, 2022
Bokeh-Loss GAN: Multi-Stage Adversarial Training for Realistic Edge-Aware Bokeh

Brian Lee, Fei Lei, Huaijin Chen et al.

In this paper, we tackle the problem of monocular bokeh synthesis, where we attempt to render a shallow depth-of-field image from a single all-in-focus image. Unlike in DSLR cameras, this effect cannot be captured directly in mobile cameras due to the physical constraints of the mobile aperture. We thus propose a network-based approach that is capable of rendering realistic monocular bokeh from single image inputs. To do this, we introduce three new edge-aware Bokeh Losses, based on a predicted monocular depth map, that sharpen the foreground edges while blurring the background. This model is then fine-tuned using an adversarial loss to generate a realistic Bokeh effect. Experimental results show that our approach is capable of generating a pleasing, natural Bokeh effect with sharp edges while handling complicated scenes.

LG Jun 3, 2024
Single Trajectory Conformal Prediction

Brian Lee, Nikolai Matni

We study the performance of risk-controlling prediction sets (RCPS), an empirical risk minimization-based formulation of conformal prediction, with a single trajectory of temporally correlated data from an unknown stochastic dynamical system. First, we use the blocking technique to show that RCPS attains performance guarantees similar to those enjoyed in the iid setting whenever data is generated by asymptotically stationary and contractive dynamics. Next, we use the decoupling technique to characterize the graceful degradation in RCPS guarantees when the data generating process deviates from stationarity and contractivity. We conclude by discussing how these tools could be used toward a unified analysis of online and offline conformal prediction algorithms, which are currently treated with very different tools.
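
As a rough illustration of the RCPS idea in the simpler iid setting (not the single-trajectory analysis of the paper), the sketch below picks the smallest interval half-width whose upper confidence bound on the miscoverage risk stays below a target level. The Hoeffding bound, the absolute-residual score, and the names `hoeffding_ucb` and `calibrate_rcps` are assumptions made for this example, not details taken from the abstract.

```python
import math


def hoeffding_ucb(mean_loss, n, delta):
    """Hoeffding upper confidence bound on the risk for losses in [0, 1]."""
    return mean_loss + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def calibrate_rcps(scores, alpha, delta, grid):
    """Return the smallest threshold in `grid` whose risk bound is <= alpha.

    scores[i] is a held-out nonconformity score such as |y_i - f(x_i)|.
    The prediction set at threshold lam is [f(x) - lam, f(x) + lam], and the
    loss is the miscoverage indicator 1{score > lam}, which is non-increasing
    in lam, so the first threshold that passes the check is the tightest one.
    Returns None if no threshold in the grid passes (hypothetical convention).
    """
    n = len(scores)
    for lam in sorted(grid):
        miss_rate = sum(s > lam for s in scores) / n
        if hoeffding_ucb(miss_rate, n, delta) <= alpha:
            return lam
    return None
```

With iid calibration scores, the returned threshold keeps the miscoverage risk at or below `alpha` with probability at least `1 - delta`; the abstract's contribution concerns when comparable guarantees survive temporally correlated data, which this sketch does not attempt to model.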

CV Apr 24, 2021
A Survey of Modern Deep Learning based Object Detection Models

Syed Sahil Abbas Zaidi, Mohammad Samar Ansari, Asra Aslam et al.

Object Detection is the task of classification and localization of objects in an image or video. It has gained prominence in recent years due to its widespread applications. This article surveys recent developments in deep learning based object detectors. A concise overview of benchmark datasets and evaluation metrics used in detection is also provided, along with some of the prominent backbone architectures used in recognition tasks. It also covers contemporary lightweight classification models used on edge devices. Lastly, we compare the performances of these architectures on multiple metrics.

CR Jan 13, 2021
Protecting Privacy and Transforming COVID-19 Case Surveillance Datasets for Public Use

Brian Lee, Brandi Dupervil, Nicholas P. Deputy et al.

Objectives: Federal open data initiatives that promote increased sharing of federally collected data are important for transparency, data quality, trust, and relationships with the public and state, tribal, local, and territorial (STLT) partners. These initiatives advance understanding of health conditions and diseases by providing data to more researchers, scientists, and policymakers for analysis, collaboration, and valuable use outside CDC responders. This is particularly true for emerging conditions such as COVID-19 where we have much to learn and have evolving data needs. Since the beginning of the outbreak, CDC has collected person-level, de-identified data from jurisdictions and currently has over 8 million records, increasing each day. This paper describes how CDC designed and produces two de-identified public datasets from these collected data. Materials and Methods: Data elements were included based on the usefulness, public request, and privacy implications; specific field values were suppressed to reduce risk of reidentification and exposure of confidential information. Datasets were created and verified for privacy and confidentiality using data management platform analytic tools as well as R scripts. Results: Unrestricted data are available to the public through Data.CDC.gov and restricted data, with additional fields, are available with a data use agreement through a private repository on GitHub.com. Practice Implications: Enriched understanding of the available public data, the methods used to create these data, and the algorithms used to protect privacy of de-identified individuals allow for improved data use. Automating data generation procedures allows greater and more timely sharing of data.

RO Oct 5, 2020
Blockchain for Multi-Robot Collaboration to Combat COVID-19 and Future Pandemics

S. H. Alsamhi, Brian Lee

This conceptual paper gives an overview of how blockchain technology is involved in the operation of multi-robot collaboration for combating COVID-19 and future pandemics. Robots are a promising technology for performing many tasks such as spraying, disinfection, cleaning, treatment, detection of high body temperature or mask absence, and delivery of goods and medical supplies during the COVID-19 epidemic. Combating COVID-19 requires many heterogeneous and homogeneous robots that perform different tasks for different purposes in the quarantine area. Controlling and decentralizing multi-robot systems plays a vital role in combating COVID-19 by reducing human interaction and by supporting monitoring and the delivery of goods. Blockchain technology can manage multi-robot collaboration in a decentralized fashion and improve interaction among the robots so that they can exchange information, share representations, share goals, and establish trust. We highlight the challenges and provide tactical solutions enabled by integrating blockchain and multi-robot collaboration to combat the COVID-19 pandemic. Our proposed conceptual framework can increase the intelligence, decentralization, and autonomous operation of connected multi-robot collaboration in the blockchain network. We give an overview of blockchain's potential benefits for defining a framework of multi-robot collaboration applications to combat COVID-19, such as monitoring and outdoor and hospital end-to-end (E2E) delivery systems. Furthermore, we discuss the challenges and opportunities of integrating blockchain, multi-robot collaboration, and the Internet of Things (IoT) for combating COVID-19 and future pandemics.

CR Oct 26, 2017
Situational Awareness based Risk-Adaptable Access Control in Enterprise Networks

Brian Lee, Roman Vanickis, Franklin Rogelio et al.

As the computing landscape evolves towards distributed architectures such as the Internet of Things (IoT), enterprises are moving away from traditional perimeter-based security models toward so-called zero trust networking (ZTN) models that treat both the intranet and Internet as equally untrustworthy. Such security models incorporate risk arising from dynamic and situational factors, such as device location and security risk level, into the access control decision. Researchers have developed a number of risk models such as RAdAC (Risk Adaptable Access Control) to handle dynamic contexts, and these have been applied to medical and other scenarios. In this position paper we describe our ongoing work to apply RAdAC to ZTN. We develop a policy management framework, FURZE, to facilitate fuzzy risk evaluation that also defines how to adapt to dynamically changing contexts. We also consider how enterprise security situational awareness (SSA), which describes the potential impact to an organisation's mission based on the current threats and the relative importance of the information asset under threat, can be incorporated into a RAdAC scheme.

MM Jul 25, 2017
MVP2P: Layer-Dependency-Aware Live MVC Video Streaming over Peer-to-Peer Networks

Zhao Liu, Niall Murray, Brian Lee et al.

Multiview video supports observing a scene from different viewpoints. The Joint Video Team (JVT) developed H.264/MVC to enhance the compression efficiency for multiview video; however, MVC encoded multiview video (MVC video) still requires high bitrates for transmission. This paper investigates live MVC video streaming over Peer-to-Peer (P2P) networks. The goal is to minimize the server bandwidth costs whilst ensuring high streaming quality to peers. MVC employs intra-view and inter-view prediction structures, which leads to a complicated layer dependency relationship. As the peers' outbound bandwidth is shared while supplying all the MVC video layers, the bandwidth allocation to one MVC layer affects the available outbound bandwidth of the other layers. To optimise the utilisation of the peers' outbound bandwidth for providing video layers, a maximum flow based model is proposed which considers the MVC video layer dependency and the layer supplying relationship between peers. Based on the model, a layer dependency aware live MVC video streaming method over a BitTorrent-like P2P network is proposed, named MVP2P. The key components of MVP2P include a chunk scheduling strategy and a peer selection strategy for receiving peers, and a bandwidth scheduling algorithm for supplying peers. To evaluate the efficiency of the proposed solution, MVP2P is compared with existing methods considering the constraints of peer bandwidth, peer numbers, view switching rates, and peer churn. The test results show that MVP2P significantly outperforms the existing methods.
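
To give a feel for what a maximum-flow formulation of peer bandwidth allocation can look like, here is a toy sketch. It is not the MVP2P model: the graph layout (source to supplying peers, peers to the layers they hold, layers to sink), the node names, and all capacities are hypothetical, and layer dependency is not modelled. The solver is a plain Edmonds-Karp implementation.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build residual capacities, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual.setdefault(v, {}).setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy instance (all numbers invented for illustration).
capacity = {
    "src": {"peerA": 4, "peerB": 3},        # peers' outbound bandwidth
    "peerA": {"base": 10, "enh": 10},       # peerA holds both layers
    "peerB": {"enh": 10},                   # peerB holds only the enhancement layer
    "base": {"sink": 5},                    # bitrate demanded for the base layer
    "enh": {"sink": 4},                     # bitrate demanded for the enhancement layer
}
peer_supplied = max_flow(capacity, "src", "sink")
server_shortfall = (5 + 4) - peer_supplied
```

In this toy instance the flow value is the portion of the total layer demand that peers can serve from their shared outbound bandwidth, and the shortfall is what a server would have to cover.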