Siwang Zhou

IV
h-index 20
3 papers
169 citations
Novelty 55%
AI Score 37

3 Papers

IV · Aug 28, 2019 · Code
Multi-Channel Deep Networks for Block-Based Image Compressive Sensing

Siwang Zhou, Yan He, Yonghe Liu et al.

Incorporating deep neural networks into image compressive sensing (CS) has recently received intensive attention in multimedia technology and applications. Because deep network approaches learn the inverse mapping directly from the CS measurements, their reconstruction is significantly faster than that of conventional CS algorithms. However, in existing network-based approaches, each CS sampling procedure has to be mapped to a separate network model. This may degrade the performance of image CS with block-wise sampling because of blocking artifacts, especially when multiple sampling rates are assigned to different blocks within an image. In this paper, we develop a multichannel deep network for block-based image CS that exploits inter-block correlation and significantly outperforms the current state-of-the-art methods. The performance improvement is attributed to block-wise approximation combined with full-image removal of blocking artifacts. Specifically, with our multichannel structure, image blocks with a variety of sampling rates can be reconstructed in a single model. The initially reconstructed blocks are then reassembled into a full image, and the recovered image is improved by unrolling a hand-designed block-based CS recovery algorithm. Experimental results demonstrate that the proposed method outperforms the state-of-the-art CS methods by a large margin in terms of objective metrics and subjective visual image quality. Our source code is available at https://github.com/siwangzhou/DeepBCS.
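The block-wise sampling setting described in this abstract, where different blocks of one image receive different sampling rates, can be illustrated with a minimal sketch. It is not the DeepBCS implementation: the function names (`split_blocks`, `sample_block`, `sample_image`), the random Gaussian measurement matrix, and the rounding rule for the number of measurements are all assumptions made for illustration, and the sketch covers only the sampling side, not the network reconstruction.

```python
import random


def split_blocks(image, b):
    """Split an H x W image (nested lists, H and W multiples of b)
    into flattened b*b blocks in row-major block order."""
    h, w = len(image), len(image[0])
    blocks = []
    for r in range(0, h, b):
        for c in range(0, w, b):
            blocks.append([image[r + i][c + j] for i in range(b) for j in range(b)])
    return blocks


def sample_block(x, rate, rng):
    """Measure one flattened block x with y = Phi @ x.
    Phi is a random Gaussian matrix (an assumption, not the paper's choice)
    with m = round(rate * n) rows, so a higher rate yields more measurements."""
    n = len(x)
    m = max(1, round(rate * n))
    phi = [[rng.gauss(0.0, 1.0 / m ** 0.5) for _ in range(n)] for _ in range(m)]
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def sample_image(image, b, rates, seed=0):
    """Sample every block of the image with its own sampling rate."""
    rng = random.Random(seed)
    blocks = split_blocks(image, b)
    if len(rates) != len(blocks):
        raise ValueError("need exactly one sampling rate per block")
    return [sample_block(x, r, rng) for x, r in zip(blocks, rates)]


# Toy 8x8 image split into four 4x4 blocks with two different rates.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
measurements = sample_image(image, 4, [0.25, 0.5, 0.25, 0.5])
print([len(y) for y in measurements])
```

Because each block yields a measurement vector of a different length, a reconstruction model that handles all blocks must accept several input sizes, which is the situation the multichannel structure in the abstract addresses.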

CV · Jan 13, 2025
Pedestrian Trajectory Prediction Based on Social Interactions Learning With Random Weights

Jiajia Xie, Sheng Zhang, Beihao Xia et al.

Pedestrian trajectory prediction is a critical technology in the evolution of self-driving cars toward complete artificial intelligence. In recent years, modeling social interactions from pedestrian trajectories has attracted great interest as a route to more accurate trajectory prediction. However, existing methods for modeling pedestrian social interactions rely on pre-defined rules and struggle to capture non-explicit social interactions. In this work, we propose a novel framework named DTGAN, which extends Generative Adversarial Networks (GANs) to graph sequence data, with the primary objective of automatically capturing implicit social interactions and achieving precise pedestrian trajectory predictions. DTGAN incorporates random weights within each graph to eliminate the need for pre-defined interaction rules. We further enhance the performance of DTGAN by exploring diverse task loss functions during adversarial training, which yields improvements of 16.7% and 39.3% on the ADE and FDE metrics, respectively. The effectiveness and accuracy of our framework are verified on two public datasets. The experimental results show that DTGAN achieves superior performance and captures pedestrians' intentions well.
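ADE (Average Displacement Error) and FDE (Final Displacement Error), the two metrics named in this abstract, are commonly computed as the mean Euclidean distance over all predicted time steps and the Euclidean distance at the last time step. The sketch below follows those standard definitions for a single trajectory; the function name `ade_fde` is hypothetical, and the abstract does not state the paper's exact evaluation protocol (for example, how multiple sampled predictions are handled).

```python
import math


def ade_fde(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length lists of (x, y) positions,
    one entry per predicted time step.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all time steps
    fde = dists[-1]                # error at the final time step only
    return ade, fde


pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
print(ade_fde(pred, truth))  # per-step errors are 0, 1, 2 -> (1.0, 2.0)
```

Lower values are better for both metrics, so the improvements reported in the abstract correspond to reductions in these errors.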

IV · Dec 16, 2024
Block-Based Multi-Scale Image Rescaling

Jian Li, Siwang Zhou

Image rescaling (IR) seeks to determine the optimal low-resolution (LR) representation of a high-resolution (HR) image from which a high-quality super-resolution (SR) image can be reconstructed. Typically, HR images with resolutions exceeding 2K possess rich information that is unevenly distributed across the image. Traditional image rescaling methods often fall short because they focus solely on the overall scaling rate, ignoring the varying amounts of information in different parts of the image. To address this limitation, we propose a Block-Based Multi-Scale Image Rescaling Framework (BBMR), tailored for IR tasks involving HR images of 2K resolution and higher. BBMR consists of two main components: the Downscaling Module and the Upscaling Module. In the Downscaling Module, the HR image is segmented into sub-blocks of equal size, with each sub-block receiving a dynamically allocated scaling rate while the overall scaling rate is kept constant. For the Upscaling Module, we introduce the Joint Super-Resolution method (JointSR), which performs SR on these sub-blocks with varying scaling rates and effectively eliminates blocking artifacts. Experimental results demonstrate that BBMR significantly enhances SR image quality on the 2K and 4K test datasets compared to initial network image rescaling methods.
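The idea of giving each sub-block its own scaling rate while keeping the overall rate constant can be made concrete with a small sketch. The abstract does not describe how BBMR allocates rates, so everything here is an assumption: the candidate factors (x2, x4, x8), the use of a per-block information score such as pixel variance, the reading of "overall scaling rate" as the total LR pixel count, and the names `allocate_scales` and `overall_pixel_ratio`. A block downscaled by factor s keeps 1/s^2 of its pixels, so promoting one block from x4 to x2 adds 3/16 of a block's pixels, while demoting one block from x4 to x8 frees 3/64. Four demotions therefore pay for one promotion.

```python
from fractions import Fraction


def allocate_scales(scores, k):
    """Assign a downscaling factor to each block (hypothetical rule).

    scores: one information score per block (e.g. pixel variance).
    The k highest-scoring blocks get x2, the 4*k lowest-scoring get x8,
    and the rest get x4, so the total LR pixel count matches uniform x4.
    """
    n = len(scores)
    if k < 0 or 5 * k > n:
        raise ValueError("need at least 5*k blocks")
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    scales = [4] * n
    for i in order[:k]:
        scales[i] = 2          # information-rich blocks keep more pixels
    for i in order[n - 4 * k:]:
        scales[i] = 8          # flat blocks are compressed harder
    return scales


def overall_pixel_ratio(scales):
    """Fraction of HR pixels kept in the LR representation overall."""
    return sum(Fraction(1, s * s) for s in scales) / len(scales)


scores = [9, 1, 2, 3, 4, 5, 0, 7, 8, 6]
scales = allocate_scales(scores, k=1)
print(scales)                       # [2, 8, 8, 8, 4, 4, 8, 4, 4, 4]
print(overall_pixel_ratio(scales))  # 1/16, the same as uniform x4
```

Using exact fractions makes it easy to check that the pixel budget is preserved exactly rather than approximately.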