Xin Zhao

h-index26

5papers

1,123citations

Novelty38%

AI Score28

Ranked #149,680 of 194,257 authors (top 77%)#48,883 in CV (top 83%)

5 Papers

16.7CVDec 29, 2018Code

EANet: Enhancing Alignment for Cross-Domain Person Re-identification

Houjing Huang, Wenjie Yang, Xiaotang Chen et al.

Person re-identification (ReID) has achieved significant improvement under the single-domain setting. However, directly exploiting a model to new domains is always faced with huge performance drop, and adapting the model to new domains without target-domain identity labels is still challenging. In this paper, we address cross-domain ReID and make contributions for both model generalization and adaptation. First, we propose Part Aligned Pooling (PAP) that brings significant improvement for cross-domain testing. Second, we design a Part Segmentation (PS) constraint over ReID feature to enhance alignment and improve model generalization. Finally, we show that applying our PS constraint to unlabeled target domain images serves as effective domain adaptation. We conduct extensive experiments between three large datasets, Market1501, CUHK03 and DukeMTMC-reID. Our model achieves state-of-the-art performance under both source-domain and cross-domain settings. For completeness, we also demonstrate the complementarity of our model to existing domain adaptation methods. The code is available at https://github.com/huanghoujing/EANet.

1.4CVJun 22, 2021

A Comparison for Patch-level Classification of Deep Learning Methods on Transparent Environmental Microorganism Images: from Convolutional Neural Networks to Visual Transformers

Hechen Yang, Chen Li, Jinghua Zhang et al.

Nowadays, analysis of Transparent Environmental Microorganism Images (T-EM images) in the field of computer vision has gradually become a new and interesting spot. This paper compares different deep learning classification performance for the problem that T-EM images are challenging to analyze. We crop the T-EM images into 8 * 8 and 224 * 224 pixel patches in the same proportion and then divide the two different pixel patches into foreground and background according to ground truth. We also use four convolutional neural networks and a novel ViT network model to compare the foreground and background classification experiments. We conclude that ViT performs the worst in classifying 8 * 8 pixel patches, but it outperforms most convolutional neural networks in classifying 224 * 224 pixel patches.

30.6CLSep 12, 2019Code

UER: An Open-Source Toolkit for Pre-training Models

Zhe Zhao, Hui Chen, Jinbin Zhang et al.

Existing works, including ELMO and BERT, have revealed the importance of pre-training for NLP tasks. While there does not exist a single pre-training model that works best in all cases, it is of necessity to develop a framework that is able to deploy various pre-training models efficiently. For this purpose, we propose an assemble-on-demand pre-training toolkit, namely Universal Encoder Representations (UER). UER is loosely coupled, and encapsulated with rich modules. By assembling modules on demand, users can either reproduce a state-of-the-art pre-training model or develop a pre-training model that remains unexplored. With UER, we have built a model zoo, which contains pre-trained models based on different corpora, encoders, and targets (objectives). With proper pre-trained models, we could achieve new state-of-the-art results on a range of downstream datasets.

2.6CVMar 4, 2019

Hyperspectral Image Classification with Deep Metric Learning and Conditional Random Field

Yi Liang, Xin Zhao, Alan J. X. Guo et al.

To improve the classification performance in the context of hyperspectral image processing, many works have been developed based on two common strategies, namely the spatial-spectral information integration and the utilization of neural networks. However, both strategies typically require more training data than the classical algorithms, aggregating the shortage of labeled samples. In this letter, we propose a novel framework that organically combines the spectrum-based deep metric learning model and the conditional random field algorithm. The deep metric learning model is supervised by the center loss to produce spectrum-based features that gather more tightly in Euclidean space within classes. The conditional random field with Gaussian edge potentials, which is firstly proposed for image segmentation tasks, is introduced to give the pixel-wise classification over the hyperspectral image by utilizing both the geographical distances between pixels and the Euclidean distances between the features produced by the deep metric learning model. The proposed framework is trained by spectral pixels at the deep metric learning stage and utilizes the half handcrafted spatial features at the conditional random field stage. This settlement alleviates the shortage of training data to some extent. Experiments on two real hyperspectral images demonstrate the advantages of the proposed method in terms of both classification accuracy and computation cost.

0.9CVJan 28, 2019

An End-to-End Solution for Effectively Demoting Watermarked Images in Image Search

Ning Ma, Xin Zhao, Mark Bolin

We propose an end-to-end solution, from watermark feature generation to metric design, for effectively demoting watermarked images surfed by a real world image search engine. We use a few fundamental techniques to obtain effective watermark features of images in the image search index, and utilize the signals in a commercial search engine to improve the image search quality. We collect a diverse and large set (about 1M) of images with human labels indicating whether the image contains visible watermark. We train a few deep convolutional neural networks to extract watermark information from the raw images. The deep CNN classifiers we trained can achieve high accuracy on the watermark test data set. We also analyze the images based on their domains to get watermark information from a domain-based watermark classifier. We design a new novel hybrid metric which includes the relevance, image attractiveness and watermark information all together. We demonstrate that using these watermark signals together with the new metric in image search ranker can significantly demote the watermarked images during the online image ranking.