Xinrui Zhan

CV
5 papers
23 citations
Novelty 59%
AI Score 26

5 Papers

CV · Jun 23, 2022
Warped Convolutional Networks: Bridge Homography to sl(3) algebra by Group Convolution

Xinrui Zhan, Yang Li, Wenyu Liu et al.

Homography has an essential relationship with the special linear group and the embedding Lie algebra structure. Although the Lie algebra representation is elegant, few researchers have established the connection between homography and its algebraic expression in neural networks. In this paper, we propose Warped Convolutional Networks (WCN) to effectively learn and represent the homography by the SL(3) group and sl(3) algebra with group convolution. To this end, six commutative subgroups within the SL(3) group are composed to form a homography. For each subgroup, a warping function is proposed to bridge the Lie algebra structure to its corresponding parameters in the homography. By taking advantage of the warped convolution, homography learning is formulated as several simple pseudo-translation regressions. By walking along the Lie topology, our proposed WCN is able to learn features that are invariant to homography. Moreover, it can be easily plugged into other popular CNN-based methods. Extensive experiments on the POT benchmark, S-COCO-Proj, and MNIST-Proj datasets show that our proposed method is effective for planar object tracking, homography estimation, and classification.
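The link between sl(3) and homographies mentioned in this abstract can be illustrated with the matrix exponential: a 3x3 matrix with zero trace maps to a matrix with unit determinant, which is an element of SL(3) and can act as a homography. The following is a minimal sketch in plain Python, not the WCN method itself; the helper names (`matmul`, `expm`, `det3`, `apply_homography`) and the example generator `A` are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def expm(a, terms=30):
    """Matrix exponential of a 3x3 matrix via a truncated Taylor series."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]  # holds a^n / n!, starting at n = 0
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def apply_homography(h, x, y):
    """Warp a 2D point with a homography, dividing by the homogeneous coordinate."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w

# Hypothetical sl(3) element: any 3x3 matrix whose trace is zero.
A = [[0.1, -0.2, 0.3],
     [0.2, 0.1, -0.1],
     [0.01, 0.02, -0.2]]

H = expm(A)                      # element of SL(3), since det(exp(A)) = exp(trace(A))
warped = apply_homography(H, 1.0, 2.0)
```

Because the trace of `A` is zero, `det3(H)` equals 1 up to floating-point error. In practice a library routine such as `scipy.linalg.expm` would replace the hand-written series.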

CV · Dec 16, 2022
Scattering-induced entropy boost for highly-compressed optical sensing and encryption

Xinrui Zhan, Xuyang Chang, Daoyu Li et al.

Image sensing often relies on a high-quality machine vision system with a large field of view and high resolution. It requires fine imaging optics, has high computational costs, and requires a large communication bandwidth between image sensors and computing units. In this paper, we propose a novel image-free sensing framework for resource-efficient image classification, where the required number of measurements can be reduced by up to two orders of magnitude. In the proposed framework for single-pixel detection, the optical field for a target is first scattered by an optical diffuser and then two-dimensionally modulated by a spatial light modulator. The optical diffuser simultaneously serves as a compressor and an encryptor for the target information, effectively narrowing the field of view and improving the system's security. The one-dimensional sequence of intensity values, which is measured with time-varying patterns on the spatial light modulator, is then used to extract semantic information based on end-to-end deep learning. The proposed sensing framework is shown to achieve over 95% accuracy at sampling rates of 1% and 5% for classification on the MNIST dataset and the recognition of Chinese license plates, respectively, and the framework is up to 24% more efficient than the approach without an optical diffuser. The proposed framework represents a significant breakthrough in high-throughput machine intelligence for scene analysis with low bandwidth, low costs, and strong encryption.

IV · Oct 14, 2022
Whole-body tumor segmentation of 18F-FDG PET/CT using a cascaded and ensembled convolutional neural networks

Ludovic Sibille, Xinrui Zhan, Lei Xiang

Background: A crucial initial processing step for quantitative PET/CT analysis is the segmentation of tumor lesions, which enables accurate feature extraction, tumor characterization, oncologic staging, and image-based therapy response assessment. Manual lesion segmentation, however, is associated with enormous effort and cost and is thus infeasible in clinical routine. Goal: The goal of this study was to report the performance of a deep neural network designed to automatically segment regions suspected of cancer in whole-body 18F-FDG PET/CT images in the context of the AutoPET challenge. Method: A cascaded approach was developed in which a stacked ensemble of 3D U-Net CNNs processed the PET/CT images at a fixed 6 mm resolution. A refiner network composed of residual layers enhanced the 6 mm segmentation mask to the original resolution. Results: 930 cases were used to train the model. 50% were histologically proven cancer patients and 50% were healthy controls. We obtained a Dice score of 0.68 on 84 stratified test cases. Manual and automatic Metabolic Tumor Volume (MTV) were highly correlated (R² = 0.969, slope = 0.947). Inference time was 89.7 seconds on average. Conclusion: The proposed algorithm accurately segmented regions suspicious for cancer in whole-body 18F-FDG PET/CT images.
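The Dice score reported in this abstract measures overlap between a predicted and a reference segmentation mask. The sketch below shows the standard definition, 2|P ∩ T| / (|P| + |T|), on flattened binary masks. It is an illustration only, not the challenge's evaluation code; the function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice(pred, truth):
    """Dice coefficient between two flattened binary masks (lists of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total

# Toy example: 5 voxels, 2 of which are labeled as tumor in both masks.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
score = dice(pred, truth)  # 2 * 2 / (3 + 3)
```

For real 3D volumes the same formula applies after flattening the voxel grid; healthy-control cases with no lesion are the situation where the empty-mask convention matters.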

IV · Jan 8, 2022
Weighted Encoding Optimization for Dynamic Single-pixel Imaging and Sensing

Xinrui Zhan, Liheng Bian, Chunli Zhu et al.

Using single-pixel detection, an end-to-end neural network that jointly optimizes both encoding and decoding enables high-precision imaging and high-level semantic sensing. However, for varied sampling rates, the large-scale network requires retraining, which is laborious and computationally expensive. In this letter, we report a weighted optimization technique for dynamic rate-adaptive single-pixel imaging and sensing, which needs to train the network only once for use at any sampling rate. Specifically, we introduce a novel weighting scheme in the encoding process to characterize the modulation efficiency of different patterns. While the network is trained at a high sampling rate, the modulation patterns and corresponding weights are updated iteratively, which produces an optimally ranked encoding series upon convergence. In the experimental implementation, the patterns with the highest weights are employed for light modulation, thus achieving highly efficient imaging and sensing. The reported strategy avoids the additional training of another low-rate network required by existing dynamic single-pixel networks, which doubles training efficiency. Experiments on the MNIST dataset validated that once the network is trained with a sampling rate of 1, the average imaging PSNR reaches 23.50 dB at a sampling rate of 0.1, and the image-free classification accuracy reaches up to 95.00% at a sampling rate of 0.03 and 97.91% at a sampling rate of 0.1.
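The rate-adaptive idea in this abstract, keeping only the highest-weighted patterns for a given sampling rate, can be sketched as follows. This is a toy illustration under stated assumptions: the sampling rate is taken as the number of patterns divided by the number of pixels, the count is rounded to the nearest integer with a minimum of one, and the names `select_patterns` and `measure` as well as all numeric values are hypothetical rather than taken from the paper.

```python
def select_patterns(weights, sampling_rate, n_pixels):
    """Return indices of the top-weighted patterns for a given sampling rate."""
    # Assumed definition: sampling rate = number of patterns / number of pixels.
    k = max(1, round(sampling_rate * n_pixels))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return ranked[:k]

def measure(patterns, scene, indices):
    """Simulate single-pixel measurements: one inner product per chosen pattern."""
    return [sum(p * s for p, s in zip(patterns[i], scene)) for i in indices]

# Toy 4-pixel scene with 4 candidate patterns and their learned weights.
scene = [1, 0, 2, 1]
patterns = [[1, 0, 0, 0],
            [0, 1, 1, 0],
            [1, 1, 0, 0],
            [0, 0, 1, 1]]
weights = [0.2, 0.9, 0.5, 0.7]

chosen = select_patterns(weights, sampling_rate=0.5, n_pixels=4)
measurements = measure(patterns, scene, chosen)
```

Lowering `sampling_rate` shortens the list of chosen patterns without touching the weights, which is why a single trained ranking can serve several rates in this sketch.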

CV · Dec 15, 2021
Homography Decomposition Networks for Planar Object Tracking

Xinrui Zhan, Yueran Liu, Jianke Zhu et al.

Planar object tracking plays an important role in AI applications such as robotics, visual servoing, and visual SLAM. Although previous planar trackers work well in most scenarios, tracking remains challenging under rapid motion and large transformations between two consecutive frames. The essential reason behind this problem is that the condition number of such a non-linear system changes unstably when the search range of the homography parameter space becomes larger. To this end, we propose a novel Homography Decomposition Networks (HDN) approach that drastically reduces and stabilizes the condition number by decomposing the homography transformation into two groups. Specifically, a similarity transformation estimator is designed to predict the first group robustly with a deep convolutional equivariant network. By taking advantage of the scale and rotation estimation with high confidence, a residual transformation is estimated by a simple regression model. Furthermore, the proposed end-to-end network is trained in a semi-supervised fashion. Extensive experiments show that our proposed approach outperforms the state-of-the-art planar tracking methods by a large margin on the challenging POT, UCSB, and POIC datasets.
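The decomposition described in this abstract can be illustrated by splitting a homography into a similarity part (scale, rotation, translation) and a residual. The sketch below assumes the composition order H = H_residual · H_similarity, which the abstract does not specify; the function names and all numeric values are hypothetical, and the paper's actual parameterization may differ.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def similarity(scale, theta, tx, ty):
    """3x3 similarity transform: uniform scale, rotation, then translation."""
    c, s = scale * math.cos(theta), scale * math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def similarity_inverse(scale, theta, tx, ty):
    """Closed-form inverse: linear part is (1/scale) * R^T, translation is -A^-1 t."""
    c, s = math.cos(theta) / scale, math.sin(theta) / scale
    return [[c, s, -(c * tx + s * ty)],
            [-s, c, -(-s * tx + c * ty)],
            [0.0, 0.0, 1.0]]

def residual(h, scale, theta, tx, ty):
    """Residual transform left after factoring out the similarity (assumed order)."""
    return matmul(h, similarity_inverse(scale, theta, tx, ty))

# Hypothetical full homography and a hypothetical similarity estimate.
H = [[1.2, -0.3, 5.0],
     [0.4, 1.1, -2.0],
     [0.001, 0.002, 1.0]]
params = (1.2, 0.3, 5.0, -2.0)  # scale, rotation in radians, tx, ty

H_sim = similarity(*params)
H_res = residual(H, *params)
H_back = matmul(H_res, H_sim)   # recomposes the original homography
```

When the similarity estimate is accurate, `H_res` stays close to the identity, so the second-stage regression only has to cover a small range of parameters.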