Hua Yang

h-index16

6papers

451citations

Novelty53%

AI Score27

Ranked #157,251 of 194,257 authors (top 81%)#51,001 in CV (top 86%)

6 Papers

11.1CVMay 6, 2020

Attentive Semantic Exploring for Manipulated Face Detection

Zehao Chen, Hua Yang

Face manipulation methods develop rapidly in recent years, whose potential risk to society accounts for the emerging of researches on detection methods. However, due to the diversity of manipulation methods and the high quality of fake images, detection methods suffer from a lack of generalization ability. To solve the problem, we find that segmenting images into semantic fragments could be effective, as discriminative defects and distortions are closely related to such fragments. Besides, to highlight discriminative regions in fragments and to measure contribution to the final prediction of each fragment is efficient for the improvement of generalization ability. Therefore, we propose a novel manipulated face detection method based on Multilevel Facial Semantic Segmentation and Cascade Attention Mechanism. To evaluate our method, we reconstruct two datasets: GGFI and FFMI, and also collect two open-source datasets. Experiments on four datasets verify the advantages of our approach against other state-of-the-arts, especially its generalization ability.

1.4CVJan 21, 2021

MPASNET: Motion Prior-Aware Siamese Network for Unsupervised Deep Crowd Segmentation in Video Scenes

Jinhai Yang, Hua Yang

Crowd segmentation is a fundamental task serving as the basis of crowded scene analysis, and it is highly desirable to obtain refined pixel-level segmentation maps. However, it remains a challenging problem, as existing approaches either require dense pixel-level annotations to train deep learning models or merely produce rough segmentation maps from optical or particle flows with physical models. In this paper, we propose the Motion Prior-Aware Siamese Network (MPASNET) for unsupervised crowd semantic segmentation. This model not only eliminates the need for annotation but also yields high-quality segmentation maps. Specially, we first analyze the coherent motion patterns across the frames and then apply a circular region merging strategy on the collective particles to generate pseudo-labels. Moreover, we equip MPASNET with siamese branches for augmentation-invariant regularization and siamese feature aggregation. Experiments over benchmark datasets indicate that our model outperforms the state-of-the-arts by more than 12% in terms of mIoU.

7.9CVJul 11, 2020

Towards Cross-Granularity Few-Shot Learning: Coarse-to-Fine Pseudo-Labeling with Visual-Semantic Meta-Embedding

Jinhai Yang, Hua Yang, Lin Chen

Few-shot learning aims at rapidly adapting to novel categories with only a handful of samples at test time, which has been predominantly tackled with the idea of meta-learning. However, meta-learning approaches essentially learn across a variety of few-shot tasks and thus still require large-scale training data with fine-grained supervision to derive a generalized model, thereby involving prohibitive annotation cost. In this paper, we advance the few-shot classification paradigm towards a more challenging scenario, i.e., cross-granularity few-shot classification, where the model observes only coarse labels during training while is expected to perform fine-grained classification during testing. This task largely relieves the annotation cost since fine-grained labeling usually requires strong domain-specific expertise. To bridge the cross-granularity gap, we approximate the fine-grained data distribution by greedy clustering of each coarse-class into pseudo-fine-classes according to the similarity of image embeddings. We then propose a meta-embedder that jointly optimizes the visual- and semantic-discrimination, in both instance-wise and coarse class-wise, to obtain a good feature space for this coarse-to-fine pseudo-labeling process. Extensive experiments and ablation studies are conducted to demonstrate the effectiveness and robustness of our approach on three representative datasets.

4.1CVNov 17, 2019

Distribution Context Aware Loss for Person Re-identification

Zhigang Chang, Qin Zhou, Mingyang Yu et al.

To learn the optimal similarity function between probe and gallery images in Person re-identification, effective deep metric learning methods have been extensively explored to obtain discriminative feature embedding. However, existing metric loss like triplet loss and its variants always emphasize pair-wise relations but ignore the distribution context in feature space, leading to inconsistency and sub-optimal. In fact, the similarity of one pair not only decides the match of this pair, but also has potential impacts on other sample pairs. In this paper, we propose a novel Distribution Context Aware (DCA) loss based on triplet loss to combine both numerical similarity and relation similarity in feature space for better clustering. Extensive experiments on three benchmarks including Market-1501, DukeMTMC-reID and MSMT17, evidence the favorable performance of our method against the corresponding baseline and other state-of-the-art methods.

31.0CVFeb 2, 2019

Online Multi-Object Tracking with Dual Matching Attention Networks

Ji Zhu, Hua Yang, Nian Liu et al.

In this paper, we propose an online Multi-Object Tracking (MOT) approach which integrates the merits of single object tracking and data association methods in a unified framework to handle noisy detections and frequent interactions between targets. Specifically, for applying single object tracking in MOT, we introduce a cost-sensitive tracking loss based on the state-of-the-art visual tracker, which encourages the model to focus on hard negative distractors during online learning. For data association, we propose Dual Matching Attention Networks (DMAN) with both spatial and temporal attention mechanisms. The spatial attention module generates dual attention maps which enable the network to focus on the matching patterns of the input image pair, while the temporal attention module adaptively allocates different levels of attention to different samples in the tracklet to suppress noisy observations. Experimental results on the MOT benchmark datasets show that the proposed algorithm performs favorably against both online and offline trackers in terms of identity-preserving metrics.

6.3CVMar 22, 2018

Weighted Bilinear Coding over Salient Body Parts for Person Re-identification

Zhigang Chang, Qin Zhou, Heng Fan et al.

Deep convolutional neural networks (CNNs) have demonstrated dominant performance in person re-identification (Re-ID). Existing CNN based methods utilize global average pooling (GAP) to aggregate intermediate convolutional features for Re-ID. However, this strategy only considers the first-order statistics of local features and treats local features at different locations equally important, leading to sub-optimal feature representation. To deal with these issues, we propose a novel weighted bilinear coding (WBC) framework for local feature aggregation in CNN networks to pursue more representative and discriminative feature representations, which can adapt to other state-of-the-art methods and improve their performance. In specific, bilinear coding is used to encode the channel-wise feature correlations to capture richer feature interactions. Meanwhile, a weighting scheme is applied on the bilinear coding to adaptively adjust the weights of local features at different locations based on their importance in recognition, further improving the discriminability of feature aggregation. To handle the spatial misalignment issue, we use a salient part net (spatial attention module) to derive salient body parts, and apply the WBC model on each part. The final representation, formed by concatenating the WBC encoded features of each part, is both discriminative and resistant to spatial misalignment. Experiments on three benchmarks including Market-1501, DukeMTMC-reID and CUHK03 evidence the favorable performance of our method against other outstanding methods.