Meng Yang

h-index37

11papers

2,397citations

Novelty51%

AI Score31

Ranked #133,310 of 194,257 authors (top 69%)#43,999 in CV (top 74%)

11 Papers

1.4CVMar 26, 2021Code

Leaning Compact and Representative Features for Cross-Modality Person Re-Identification

Guangwei Gao, Hao Shao, Fei Wu et al.

This paper pays close attention to the cross-modality visible-infrared person re-identification (VI Re-ID) task, which aims to match pedestrian samples between visible and infrared modes. In order to reduce the modality-discrepancy between samples from different cameras, most existing works usually use constraints based on Euclidean metric. Because of the Euclidean based distance metric strategy cannot effectively measure the internal angles between the embedded vectors, the existing solutions cannot learn the angularly discriminative feature embedding. Since the most important factor affecting the classification task based on embedding vector is whether there is an angularly discriminative feature space, in this paper, we present a new loss function called Enumerate Angular Triplet (EAT) loss. Also, motivated by the knowledge distillation, to narrow down the features between different modalities before feature embedding, we further present a novel Cross-Modality Knowledge Distillation (CMKD) loss. Benefit from the above two considerations, the embedded features are discriminative enough in a way to tackle modality-discrepancy problem. The experimental results on RegDB and SYSU-MM01 datasets have demonstrated that the proposed method is superior to the other most advanced methods in terms of impressive performance. Code is available at https://github.com/IVIPLab/LCCRF.

10.0CVMar 25, 2021

Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition

Guangwei Gao, Yi Yu, Jian Yang et al.

Cross-resolution face recognition (CRFR), which is important in intelligent surveillance and biometric forensics, refers to the problem of matching a low-resolution (LR) probe face image against high-resolution (HR) gallery face images. Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space where the resolution discrepancy is mitigated. However, little works consider how to extract and utilize the intermediate discriminative features from the noisy LR query faces to further mitigate the resolution discrepancy due to the resolution limitations. In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR. In particular, our contributions are threefold. (i) To learn more robust and discriminative features, we desire to adaptively fuse the contextual features from different layers. (ii) To fully exploit these contextual features, we design a feature set-based representation learning (FSRL) scheme to collaboratively represent the hierarchical features for more accurate recognition. Moreover, FSRL utilizes the primitive form of feature maps to keep the latent structural information, especially in noisy cases. (iii) To further promote the recognition performance, we desire to fuse the hierarchical recognition outputs from different stages. Meanwhile, the discriminability from different scales can also be fully integrated. By exploiting these advantages, the efficiency of the proposed method can be delivered. Experimental results on several face datasets have verified the superiority of the presented algorithm to the other competitive CRFR approaches.

3.4CVMar 31, 2019

Pedestrian re-identification based on Tree branch network with local and global learning

Hui Li, Meng Yang, Zhihui Lai et al.

Deep part-based methods in recent literature have revealed the great potential of learning local part-level representation for pedestrian image in the task of person re-identification. However, global features that capture discriminative holistic information of human body are usually ignored or not well exploited. This motivates us to investigate joint learning global and local features from pedestrian images. Specifically, in this work, we propose a novel framework termed tree branch network (TBN) for person re-identification. Given a pedestrain image, the feature maps generated by the backbone CNN, are partitioned recursively into several pieces, each of which is followed by a bottleneck structure that learns finer-grained features for each level in the hierarchical tree-like framework. In this way, representations are learned in a coarse-to-fine manner and finally assembled to produce more discriminative image descriptions. Experimental results demonstrate the effectiveness of the global and local feature learning method in the proposed TBN framework. We also show significant improvement in performance over state-of-the-art methods on three public benchmarks: Market-1501, CUHK-03 and DukeMTMC.

31.0CVJul 20, 2018Code

Toward Characteristic-Preserving Image-based Virtual Try-On Network

Bochao Wang, Huabin Zheng, Xiaodan Liang et al.

Image-based virtual try-on systems for fitting new in-shop clothes into a person image have attracted increasing research attention, yet is still challenging. A desirable pipeline should not only transform the target clothes into the most fitting shape seamlessly but also preserve well the clothes identity in the generated image, that is, the key characteristics (e.g. texture, logo, embroidery) that depict the original clothes. However, previous image-conditioned generation works fail to meet these critical requirements towards the plausible virtual try-on performance since they fail to handle large spatial misalignment between the input image and target clothes. Prior work explicitly tackled spatial deformation using shape context matching, but failed to preserve clothing details due to its coarse-to-fine strategy. In this work, we propose a new fully-learnable Characteristic-Preserving Virtual Try-On Network(CP-VTON) for addressing all real-world challenges in this task. First, CP-VTON learns a thin-plate spline transformation for transforming the in-shop clothes into fitting the body shape of the target person via a new Geometric Matching Module (GMM) rather than computing correspondences of interest points as prior works did. Second, to alleviate boundary artifacts of warped clothes and make the results more realistic, we employ a Try-On Module that learns a composition mask to integrate the warped clothes and the rendered image to ensure smoothness. Extensive experiments on a fashion dataset demonstrate our CP-VTON achieves the state-of-the-art virtual try-on performance both qualitatively and quantitatively.

5.2OCDec 16, 2016

An MM Algorithm for Split Feasibility Problems

Jason Xu, Eric C. Chi, Meng Yang et al.

The classical multi-set split feasibility problem seeks a point in the intersection of finitely many closed convex domain constraints, whose image under a linear mapping also lies in the intersection of finitely many closed convex range constraints. Split feasibility generalizes important inverse problems including convex feasibility, linear complementarity, and regression with constraint sets. When a feasible point does not exist, solution methods that proceed by minimizing a proximity function can be used to obtain optimal approximate solutions to the problem. We present an extension of the proximity function approach that generalizes the linear split feasibility problem to allow for non-linear mappings. Our algorithm is based on the principle of majorization-minimization, is amenable to quasi-Newton acceleration, and comes complete with convergence guarantees under mild assumptions. Furthermore, we show that the Euclidean norm appearing in the proximity function of the non-linear split feasibility problem can be replaced by arbitrary Bregman divergences. We explore several examples illustrating the merits of non-linear formulations over the linear case, with a focus on optimization for intensity-modulated radiation therapy.

44.4MLDec 7, 2016Code

Large-Margin Softmax Loss for Convolutional Neural Networks

Weiyang Liu, Yandong Wen, Zhiding Yu et al.

Cross-entropy loss together with softmax is arguably one of the most common used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax not only can adjust the desired margin but also can avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply-learned features with L-softmax loss become more discriminative, hence significantly boosting the performance on a variety of visual classification and verification tasks.

1.1LGNov 15, 2015

Robust Elastic Net Regression

Weiyang Liu, Rongmei Lin, Meng Yang

We propose a robust elastic net (REN) model for high-dimensional sparse regression and give its performance guarantees (both the statistical error bound and the optimization bound). A simple idea of trimming the inner product is applied to the elastic net model. Specifically, we robustify the covariance matrix by trimming the inner product based on the intuition that the trimmed inner product can not be significant affected by a bounded number of arbitrarily corrupted points (outliers). The REN model can also derive two interesting special cases: robust Lasso and robust soft thresholding. Comprehensive experimental results show that the robustness of the proposed model consistently outperforms the original elastic net and matches the performance guarantees nicely.

4.5CVNov 14, 2015

Jointly Learning Non-negative Projection and Dictionary with Discriminative Graph Constraints for Classification

Weiyang Liu, Zhiding Yu, Yandong Wen et al.

Sparse coding with dictionary learning (DL) has shown excellent classification performance. Despite the considerable number of existing works, how to obtain features on top of which dictionaries can be better learned remains an open and interesting question. Many current prevailing DL methods directly adopt well-performing crafted features. While such strategy may empirically work well, it ignores certain intrinsic relationship between dictionaries and features. We propose a framework where features and dictionaries are jointly learned and optimized. The framework, named joint non-negative projection and dictionary learning (JNPDL), enables interaction between the input features and the dictionaries. The non-negative projection leads to discriminative parts-based object features while DL seeks a more suitable representation. Discriminative graph constraints are further imposed to simultaneously maximize intra-class compactness and inter-class separability. Experiments on both image and image set classification show the excellent performance of JNPDL by outperforming several state-of-the-art approaches.

1.3CVJul 25, 2015

Efficient Face Alignment via Locality-constrained Representation for Robust Recognition

Yandong Wen, Weiyang Liu, Meng Yang et al.

Practical face recognition has been studied in the past decades, but still remains an open challenge. Current prevailing approaches have already achieved substantial breakthroughs in recognition accuracy. However, their performance usually drops dramatically if face samples are severely misaligned. To address this problem, we propose a highly efficient misalignment-robust locality-constrained representation (MRLR) algorithm for practical real-time face recognition. Specifically, the locality constraint that activates the most correlated atoms and suppresses the uncorrelated ones, is applied to construct the dictionary for face alignment. Then we simultaneously align the warped face and update the locality-constrained dictionary, eventually obtaining the final alignment. Moreover, we make use of the block structure to accelerate the derived analytical solution. Experimental results on public data sets show that MRLR significantly outperforms several state-of-the-art approaches in terms of efficiency and scalability with even better performance.

4.5CVFeb 2, 2015

Structured Occlusion Coding for Robust Face Recognition

Yandong Wen, Weiyang Liu, Meng Yang et al.

Occlusion in face recognition is a common yet challenging problem. While sparse representation based classification (SRC) has been shown promising performance in laboratory conditions (i.e. noiseless or random pixel corrupted), it performs much worse in practical scenarios. In this paper, we consider the practical face recognition problem, where the occlusions are predictable and available for sampling. We propose the structured occlusion coding (SOC) to address occlusion problems. The structured coding here lies in two folds. On one hand, we employ a structured dictionary for recognition. On the other hand, we propose to use the structured sparsity in this formulation. Specifically, SOC simultaneously separates the occlusion and classifies the image. In this way, the problem of recognizing an occluded image is turned into seeking a structured sparse solution on occlusion-appended dictionary. In order to construct a well-performing occlusion dictionary, we propose an occlusion mask estimating technique via locality constrained dictionary (LCD), showing striking improvement in occlusion sample. On a category-specific occlusion dictionary, we replace norm sparsity with the structured sparsity which is shown more robust, further enhancing the robustness of our approach. Moreover, SOC achieves significant improvement in handling large occlusion in real world. Extensive experiments are conducted on public data sets to validate the superiority of the proposed algorithm.

14.6CVApr 11, 2012

Collaborative Representation based Classification for Face Recognition

Lei Zhang, Meng Yang, Xiangchu Feng et al.

By coding a query sample as a sparse linear combination of all training samples and then classifying it by evaluating which class leads to the minimal coding residual, sparse representation based classification (SRC) leads to interesting results for robust face recognition. It is widely believed that the l1- norm sparsity constraint on coding coefficients plays a key role in the success of SRC, while its use of all training samples to collaboratively represent the query sample is rather ignored. In this paper we discuss how SRC works, and show that the collaborative representation mechanism used in SRC is much more crucial to its success of face classification. The SRC is a special case of collaborative representation based classification (CRC), which has various instantiations by applying different norms to the coding residual and coding coefficient. More specifically, the l1 or l2 norm characterization of coding residual is related to the robustness of CRC to outlier facial pixels, while the l1 or l2 norm characterization of coding coefficient is related to the degree of discrimination of facial features. Extensive experiments were conducted to verify the face recognition accuracy and efficiency of CRC with different instantiations.