Yuehu Liu

CV
8papers
88citations
Novelty41%
AI Score25

8 Papers

CVOct 7, 2022
Dual Clustering Co-teaching with Consistent Sample Mining for Unsupervised Person Re-Identification

Zeqi Chen, Zhichao Cui, Chi Zhang et al. · pku

In unsupervised person Re-ID, peer-teaching strategy leveraging two networks to facilitate training has been proven to be an effective method to deal with the pseudo label noise. However, training two networks with a set of noisy pseudo labels reduces the complementarity of the two networks and results in label noise accumulation. To handle this issue, this paper proposes a novel Dual Clustering Co-teaching (DCCT) approach. DCCT mainly exploits the features extracted by two networks to generate two sets of pseudo labels separately by clustering with different parameters. Each network is trained with the pseudo labels generated by its peer network, which can increase the complementarity of the two networks to reduce the impact of noises. Furthermore, we propose dual clustering with dynamic parameters (DCDP) to make the network adaptive and robust to dynamically changing clustering parameters. Moreover, Consistent Sample Mining (CSM) is proposed to find the samples with unchanged pseudo labels during training for potential noisy sample removal. Extensive experiments demonstrate the effectiveness of the proposed method, which outperforms the state-of-the-art unsupervised person Re-ID methods by a considerable margin and surpasses most methods utilizing camera information.

CVSep 6, 2021Code
GeneAnnotator: A Semi-automatic Annotation Tool for Visual Scene Graph

Zhixuan Zhang, Chi Zhang, Zhenning Niu et al.

In this manuscript, we introduce a semi-automatic scene graph annotation tool for images, the GeneAnnotator. This software allows human annotators to describe the existing relationships between participators in the visual scene in the form of directed graphs, hence enabling the learning and reasoning on visual relationships, e.g., image captioning, VQA and scene graph generation, etc. The annotations for certain image datasets could either be merged in a single VG150 data-format file to support most existing models for scene graph learning or transformed into a separated annotation file for each single image to build customized datasets. Moreover, GeneAnnotator provides a rule-based relationship recommending algorithm to reduce the heavy annotation workload. With GeneAnnotator, we propose Traffic Genome, a comprehensive scene graph dataset with 1000 diverse traffic images, which in return validates the effectiveness of the proposed software for scene graph annotation. The project source code, with usage examples and sample data is available at https://github.com/Milomilo0320/A-Semi-automatic-Annotation-Software-for-Scene-Graph, under the Apache open-source license.

IRMay 2, 2020Code
PyRetri: A PyTorch-based Library for Unsupervised Image Retrieval by Deep Convolutional Neural Networks

Benyi Hu, Ren-Jie Song, Xiu-Shen Wei et al.

Despite significant progress of applying deep learning methods to the field of content-based image retrieval, there has not been a software library that covers these methods in a unified manner. In order to fill this gap, we introduce PyRetri, an open source library for deep learning based unsupervised image retrieval. The library encapsulates the retrieval process in several stages and provides functionality that covers various prominent methods for each stage. The idea underlying its design is to provide a unified platform for deep learning based image retrieval research, with high usability and extensibility. To the best of our knowledge, this is the first open-source library for unsupervised image retrieval by deep learning.

LGAug 9, 2021
Unified Regularity Measures for Sample-wise Learning and Generalization

Chi Zhang, Xiaoning Ma, Yu Liu et al.

Fundamental machine learning theory shows that different samples contribute unequally both in learning and testing processes. Contemporary studies on DNN imply that such sample difference is rooted on the distribution of intrinsic pattern information, namely sample regularity. Motivated by the recent discovery on network memorization and generalization, we proposed a pair of sample regularity measures for both processes with a formulation-consistent representation. Specifically, cumulative binary training/generalizing loss (CBTL/CBGL), the cumulative number of correct classiffcations of the training/testing sample within training stage, is proposed to quantize the stability in memorization-generalization process; while forgetting/mal-generalizing events, i.e., the mis-classification of previously learned or generalized sample, are utilized to represent the uncertainty of sample regularity with respect to optimization dynamics. Experiments validated the effectiveness and robustness of the proposed approaches for mini-batch SGD optimization. Further applications on training/testing sample selection show the proposed measures sharing the unified computing procedure could benefit for both tasks.

CVMar 11, 2021
Density-aware Haze Image Synthesis by Self-Supervised Content-Style Disentanglement

Chi Zhang, Zihang Lin, Liheng Xu et al.

The key procedure of haze image translation through adversarial training lies in the disentanglement between the feature only involved in haze synthesis, i.e.style feature, and the feature representing the invariant semantic content, i.e. content feature. Previous methods separate content feature apart by utilizing it to classify haze image during the training process. However, in this paper we recognize the incompleteness of the content-style disentanglement in such technical routine. The flawed style feature entangled with content information inevitably leads the ill-rendering of the haze images. To address, we propose a self-supervised style regression via stochastic linear interpolation to reduce the content information in style feature. The ablative experiments demonstrate the disentangling completeness and its superiority in level-aware haze image synthesis. Moreover, the generated haze data are applied in the testing generalization of vehicle detectors. Further study between haze-level and detection performance shows that haze has obvious impact on the generalization of the vehicle detectors and such performance degrading level is linearly correlated to the haze-level, which, in turn, validates the effectiveness of the proposed method.

CVMay 16, 2019
Bimodal Stereo: Joint Shape and Pose Estimation from Color-Depth Image Pair

Chi Zhang, Yuehu Liu, Ying Wu et al.

Mutual calibration between color and depth cameras is a challenging topic in multi-modal data registration. In this paper, we are confronted with a "Bimodal Stereo" problem, which aims to solve camera pose from a pair of an uncalibrated color image and a depth map from different views automatically. To address this problem, an iterative Shape-from-Shading (SfS) based framework is proposed to estimate shape and pose simultaneously. In the pipeline, the estimated shape is refined by the shape prior from the given depth map under the estimated pose. Meanwhile, the estimated pose is improved by the registration of estimated shape and shape from given depth map. We also introduce a shading based refinement in the pipeline to address noisy depth map with holes. Extensive experiments showed that through our method, both the depth map, the recovered shape as well as its pose can be desirably refined and recovered.

CVMay 15, 2019
Joint haze image synthesis and dehazing with mmd-vae losses

Zongliang Li, Chi Zhang, Gaofeng Meng et al.

Fog and haze are weathers with low visibility which are adversarial to the driving safety of intelligent vehicles equipped with optical sensors like cameras and LiDARs. Therefore image dehazing for perception enhancement and haze image synthesis for testing perception abilities are equivalently important in the development of such autonomous driving systems. From the view of image translation, these two problems are essentially dual with each other, which have the potentiality to be solved jointly. In this paper, we propose an unsupervised Image-to-Image Translation framework based on Variational Autoencoders (VAE) and Generative Adversarial Nets (GAN) to handle haze image synthesis and haze removal simultaneously. Since the KL divergence in the VAE objectives could not guarantee the optimal mapping under imbalanced and unpaired training samples with limited size, Maximum mean discrepancy (MMD) based VAE is utilized to ensure the translating consistency in both directions. The comprehensive analysis on both synthesis and dehazing performance of our method demonstrate the feasibility and practicability of the proposed method.

CVNov 24, 2018
Spatio-Temporal Road Scene Reconstruction using Superpixel Markov Random Field

Yaochen Li, Yuehu Liu, Jihua Zhu et al.

Scene model construction based on image rendering is an indispensable but challenging technique in computer vision and intelligent transportation systems. In this paper, we propose a framework for constructing 3D corridor-based road scene models. This consists of two successive stages: road detection and scene construction. The road detection is realized by a new superpixel Markov random field (MRF) algorithm. The data fidelity term in the MRF's energy function is jointly computed according to the superpixel features of color, texture and location. The smoothness term is established on the basis of the interaction of spatio-temporally adjacent superpixels. In the subsequent scene construction, the foreground and background regions are modeled independently. Experiments for road detection demonstrate the proposed method outperforms the state-of-the-art in both accuracy and speed. The scene construction experiments confirm that the proposed scene models show better correctness ratios, and have the potential to support a range of applications.