Yujia Xue

CV
3papers
66citations
Novelty62%
AI Score27

3 Papers

IVMay 7, 2022
Block Modulating Video Compression: An Ultra Low Complexity Image Compression Encoder for Resource Limited Platforms

Siming Zheng, Yujia Xue, Waleed Tahir et al.

We consider the image and video compression on resource limited platforms. An ultra low-cost image encoder, named Block Modulating Video Compression (BMVC) with an encoding complexity ${\cal O}(1)$ is proposed to be implemented on mobile platforms with low consumption of power and computation resources. We also develop two types of BMVC decoders, implemented by deep neural networks. The first BMVC decoder is based on the Plug-and-Play (PnP) algorithm, which is flexible to different compression ratios. And the second decoder is a memory efficient end-to-end convolutional neural network, which aims for real-time decoding. Extensive results on the high definition images and videos demonstrate the superior performance of the proposed codec and the robustness against bit quantization.

CVApr 27, 2018
Deep learning approach to Fourier ptychographic microscopy

Thanh Nguyen, Yujia Xue, Yunzhe Li et al.

Convolutional neural networks (CNNs) have gained tremendous success in solving complex inverse problems. The aim of this work is to develop a novel CNN framework to reconstruct video sequence of dynamic live cells captured using a computational microscopy technique, Fourier ptychographic microscopy (FPM). The unique feature of the FPM is its capability to reconstruct images with both wide field-of-view (FOV) and high resolution, i.e. a large space-bandwidth-product (SBP), by taking a series of low resolution intensity images. For live cell imaging, a single FPM frame contains thousands of cell samples with different morphological features. Our idea is to fully exploit the statistical information provided by this large spatial ensemble so as to make predictions in a sequential measurement, without using any additional temporal dataset. Specifically, we show that it is possible to reconstruct high-SBP dynamic cell videos by a CNN trained only on the first FPM dataset captured at the beginning of a time-series experiment. Our CNN approach reconstructs a 12800X10800 pixels phase image using only ~25 seconds, a 50X speedup compared to the model-based FPM algorithm. In addition, the CNN further reduces the required number of images in each time frame by ~6X. Overall, this significantly improves the imaging throughput by reducing both the acquisition and computational times. The proposed CNN is based on the conditional generative adversarial network (cGAN) framework. Additionally, we also exploit transfer learning so that our pre-trained CNN can be further optimized to image other cell types. Our technique demonstrates a promising deep learning approach to continuously monitor large live-cell populations over an extended time and gather useful spatial and temporal information with sub-cellular resolution.

CVSep 4, 2017
Hyperspectral Light Field Stereo Matching

Kang Zhu, Yujia Xue, Qiang Fu et al.

In this paper, we describe how scene depth can be extracted using a hyperspectral light field capture (H-LF) system. Our H-LF system consists of a 5 x 6 array of cameras, with each camera sampling a different narrow band in the visible spectrum. There are two parts to extracting scene depth. The first part is our novel cross-spectral pairwise matching technique, which involves a new spectral-invariant feature descriptor and its companion matching metric we call bidirectional weighted normalized cross correlation (BWNCC). The second part, namely, H-LF stereo matching, uses a combination of spectral-dependent correspondence and defocus cues that rely on BWNCC. These two new cost terms are integrated into a Markov Random Field (MRF) for disparity estimation. Experiments on synthetic and real H-LF data show that our approach can produce high-quality disparity maps. We also show that these results can be used to produce the complete plenoptic cube in addition to synthesizing all-focus and defocused color images under different sensor spectral responses.