Zhengwen Zhang

CV
h-index: 14
6 papers
417 citations
Novelty: 47%
AI Score: 32

6 Papers

CV · Sep 8, 2023 · Code
Towards Efficient SDRTV-to-HDRTV by Learning from Image Formation

Xiangyu Chen, Zheyuan Li, Zhengwen Zhang et al.

Modern displays can render video content with high dynamic range (HDR) and wide color gamut (WCG). However, most resources are still in standard dynamic range (SDR). Therefore, transforming existing SDR content into the HDRTV standard holds significant value. This paper defines and analyzes the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content. Our findings reveal that a naive end-to-end supervised training approach suffers from severe gamut transition errors. To address this, we propose a new three-step solution called HDRTVNet++, which includes adaptive global color mapping, local enhancement, and highlight refinement. The adaptive global color mapping step utilizes global statistics for image-adaptive color adjustments. A local enhancement network further enhances details, and the two sub-networks are combined as a generator to achieve highlight consistency through GAN-based joint training. Designed for ultra-high-definition TV content, our method is both effective and lightweight for processing 4K resolution images. We also constructed a dataset using HDR videos in the HDR10 standard, named HDRTV1K, containing 1235 training and 117 testing images, all in 4K resolution. Additionally, we employ five metrics to evaluate SDRTV-to-HDRTV performance. Our results demonstrate state-of-the-art performance both quantitatively and visually. The codes and models are available at https://github.com/xiaom233/HDRTVNet-plus.

AI · Jan 7, 2024
Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects

Yuheng Cheng, Ceyao Zhang, Zhengwen Zhang et al. · pku

Intelligent agents stand out as a potential path toward artificial general intelligence (AGI). Thus, researchers have dedicated significant effort to diverse implementations for them. Benefiting from recent progress in large language models (LLMs), LLM-based agents that use universal natural language as an interface exhibit robust generalization capabilities across various applications -- from serving as autonomous general-purpose task assistants to applications in coding, social, and economic domains, LLM-based agents offer extensive exploration opportunities. This paper surveys current research to provide an in-depth overview of LLM-based intelligent agents within single-agent and multi-agent systems. It covers their definitions, research frameworks, and foundational components such as their composition, cognitive and planning methods, tool utilization, and responses to environmental feedback. We also delve into the mechanisms of deploying LLM-based agents in multi-agent systems, including multi-role collaboration, message passing, and strategies to alleviate communication issues between agents. The discussions also shed light on popular datasets and application scenarios. We conclude by envisioning prospects for LLM-based agents, considering the evolving landscape of AI and natural language processing.

AO-PH · Oct 11, 2022
Near Real-time CO$_2$ Emissions Based on Carbon Satellite and Artificial Intelligence

Zhengwen Zhang, Jinjin Gu, Junhua Zhao et al.

To limit global warming to pre-industrial levels, global governments, industry and academia are taking aggressive efforts to reduce carbon emissions. The evaluation of anthropogenic carbon dioxide (CO$_2$) emissions, however, depends on self-reported information that is not always reliable. Society needs to develop an objective, independent, and generalized system to meter CO$_2$ emissions. Satellite CO$_2$ observation from space, which reports column-average regional CO$_2$ dry-air mole fractions, has gradually shown its potential to build such a system. Nevertheless, estimating anthropogenic CO$_2$ emissions from CO$_2$-observing satellites is bottlenecked by the influence of the highly complicated physical characteristics of atmospheric activities. Here we provide the first method that combines advanced artificial intelligence (AI) techniques and the carbon satellite monitor to quantify anthropogenic CO$_2$ emissions. We propose an integral AI-based pipeline that contains both a data retrieval algorithm and a two-step data-driven solution. First, the data retrieval algorithm can generate effective datasets from multi-modal data including carbon satellite observations, information about carbon sources, and several environmental factors. Second, the two-step data-driven solution applies the powerful representation of deep learning techniques to learn to quantify anthropogenic CO$_2$ emissions from satellite CO$_2$ observation together with other factors. Our work reveals the potential of quantifying CO$_2$ emissions based on the combination of deep learning algorithms and the carbon satellite monitor.

IV · Aug 18, 2021 · Code
A New Journey from SDRTV to HDRTV

Xiangyu Chen, Zhengwen Zhang, Jimmy S. Ren et al.

Modern displays are capable of rendering video content with high dynamic range (HDR) and wide color gamut (WCG). However, most available resources are still in standard dynamic range (SDR). Therefore, there is an urgent demand to transform existing SDR-TV content into HDR-TV versions. In this paper, we conduct an analysis of the SDRTV-to-HDRTV task by modeling the formation of SDRTV/HDRTV content. Based on the analysis, we propose a three-step solution pipeline including adaptive global color mapping, local enhancement and highlight generation. Moreover, the above analysis inspires us to present a lightweight network that utilizes global statistics as guidance to conduct image-adaptive color mapping. In addition, we construct a dataset using HDR videos in the HDR10 standard, named HDRTV1K, and select five metrics to evaluate the results of SDRTV-to-HDRTV algorithms. Furthermore, our final results achieve state-of-the-art performance in quantitative comparisons and visual quality. The code and dataset are available at https://github.com/chxy95/HDRTVNet.
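The idea of using global statistics to drive an image-adaptive, pixel-independent color mapping can be illustrated with a toy sketch. This is not the paper's network: the per-channel mean stands in for learned global features, and `condition_to_modulation` with its `target_mean` parameter is a hypothetical hand-written rule in place of a trained condition branch.

```python
def global_stats(image):
    """Per-channel mean over all pixels (a stand-in for learned global features)."""
    n = sum(len(row) for row in image)
    channels = len(image[0][0])
    return [sum(p[c] for row in image for p in row) / n for c in range(channels)]


def condition_to_modulation(stats, target_mean=0.5):
    """Hypothetical condition rule: a per-channel gain that moves each mean to target_mean."""
    return [target_mean / s if s > 0 else 1.0 for s in stats]


def adaptive_global_mapping(image):
    """Apply the same gains to every pixel; the gains depend only on whole-image statistics."""
    gains = condition_to_modulation(global_stats(image))
    return [[[min(1.0, g * v) for g, v in zip(gains, p)] for p in row]
            for row in image]
```

The mapping is global because every pixel receives the same gains, and image-adaptive because those gains are recomputed from each input's statistics.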

CV · Apr 13, 2021 · Code
Very Lightweight Photo Retouching Network with Conditional Sequential Modulation

Yihao Liu, Jingwen He, Xiangyu Chen et al.

Photo retouching aims at improving the aesthetic visual quality of images that suffer from photographic defects, especially poor contrast, over/under exposure, and inharmonious saturation. In practice, photo retouching can be accomplished by a series of image processing operations. As most commonly-used retouching operations are pixel-independent, i.e., the manipulation on one pixel is uncorrelated with its neighboring pixels, we can take advantage of this property and design a specialized algorithm for efficient global photo retouching. We analyze these global operations and find that they can be mathematically formulated by a Multi-Layer Perceptron (MLP). Based on this observation, we propose an extremely lightweight framework -- Conditional Sequential Retouching Network (CSRNet). Benefiting from the utilization of $1\times1$ convolution, CSRNet contains fewer than 37K trainable parameters, which is orders of magnitude smaller than existing learning-based methods. Experiments show that our method achieves state-of-the-art performance on the benchmark MIT-Adobe FiveK dataset quantitatively and qualitatively. In addition to achieving global photo retouching, the proposed framework can be easily extended to learn local enhancement effects. The extended model, namely CSRNet-L, also achieves competitive results in various local enhancement tasks. Codes are available at https://github.com/lyh-18/CSRNet.
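The link between pixel-independent operations, an MLP, and $1\times1$ convolution can be made concrete with a small sketch. It is a minimal illustration, not CSRNet itself: it omits the conditional modulation, and the function names and layer layout are assumptions chosen for the example. A stack of $1\times1$ convolutions is the same small MLP applied separately at every pixel.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(values, weights, bias):
    """One fully connected layer; weights holds one row per output channel."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def pixel_mlp(pixel, layers):
    """Run an MLP on one pixel; ReLU follows every layer except the last."""
    out = list(pixel)
    for i, (weights, bias) in enumerate(layers):
        out = linear(out, weights, bias)
        if i < len(layers) - 1:
            out = relu(out)
    return out


def conv1x1_stack(image, layers):
    """Stacked 1x1 convolutions: the same MLP applied independently at each pixel."""
    return [[pixel_mlp(p, layers) for p in row] for row in image]


def count_params(layers):
    """Number of trainable values (weights plus biases) in the stack."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

For instance, a single layer with weights `1.2 * I` and bias `0.05` expresses a contrast-and-brightness adjustment with only 12 parameters, and the output at a pixel does not change when its neighbors do.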

IV · May 27, 2021
HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization

Xiangyu Chen, Yihao Liu, Zhengwen Zhang et al.

Most consumer-grade digital cameras can only capture a limited range of luminance in real-world scenes due to sensor constraints. In addition, noise and quantization errors are often introduced in the imaging process. To obtain high dynamic range (HDR) images with excellent visual quality, the most common solution is to combine multiple images with different exposures. However, it is not always feasible to obtain multiple images of the same scene, and most HDR reconstruction methods ignore the noise and quantization loss. In this work, we propose a novel learning-based approach using a spatially dynamic encoder-decoder network, HDRUNet, to learn an end-to-end mapping for single image HDR reconstruction with denoising and dequantization. The network consists of a UNet-style base network to make full use of the hierarchical multi-scale information, a condition network to perform pattern-specific modulation and a weighting network for selectively retaining information. Moreover, we propose a Tanh_L1 loss function to balance the impact of over-exposed values and well-exposed values on the network learning. Our method achieves state-of-the-art performance in quantitative comparisons and visual quality. The proposed HDRUNet model won second place in the single frame track of the NTIRE 2021 High Dynamic Range Challenge.
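The abstract does not define the Tanh_L1 loss, so the following sketch assumes, from the name alone, that it is an L1 distance computed after applying tanh to both the prediction and the target. Under that assumption, tanh compresses large (over-exposed) values, so an error of a given size there contributes less than the same error at well-exposed values. The function names are illustrative.

```python
import math


def plain_l1(pred, target):
    """Mean absolute error on raw values, for comparison."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def tanh_l1(pred, target):
    """Assumed form of Tanh_L1: mean absolute error after tanh compression."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(math.tanh(p) - math.tanh(t))
               for p, t in zip(pred, target)) / len(pred)
```

With this form, an error of 0.5 between 4.0 and 4.5 is penalized far less than the same error between 0.2 and 0.7, whereas plain L1 treats the two identically.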