Feiyun Zhang

CV
3 papers
3 citations
Novelty: 32%
AI Score: 33

3 Papers

47.8, LG, May 22
Multi-Gate Residuals

Zhizhan Zheng, Feiyun Zhang, Shuchun Liu et al.

While Attention Residuals has shown some effectiveness in addressing the widespread issue of unbounded activation growth across deep residual layers, it inevitably incurs significant communication overhead. To circumvent this bottleneck, we propose Multi-Gate Residuals (MGR), which stabilizes activation scales without additional communication burden. MGR uses a straightforward scoring and gating mechanism to maintain multi-stream context, coupled with Attention Pooling to extract hidden states from the stream states. Experiments demonstrate that MGR is practical for large-scale training and deployment, offering tangible performance improvements over existing architectures.
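The abstract does not spell out how Attention Pooling is computed, so the following is only a minimal sketch of generic softmax attention pooling over a set of stream vectors, not the MGR implementation. The names `attention_pool`, `streams`, and `query`, and the scaled dot-product scoring, are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(streams, query):
    """Collapse several stream vectors into one hidden vector.

    Assumption: each stream is scored by a scaled dot product with a
    query vector; the paper's actual scoring function may differ.
    """
    d = len(query)
    scores = [
        sum(q * x for q, x in zip(query, s)) / math.sqrt(d) for s in streams
    ]
    weights = softmax(scores)
    # Weighted sum of the streams, one coordinate at a time.
    pooled = [sum(w * s[i] for w, s in zip(weights, streams)) for i in range(d)]
    return pooled, weights


# Toy input: three 2-dimensional streams and an all-zero query,
# which gives every stream the same weight.
streams = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]
pooled, weights = attention_pool(streams, query)
```

Because the softmax weights are non-negative and sum to one, each pooled coordinate is a convex combination of the corresponding stream coordinates and cannot exceed the largest of them.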

CV, Dec 11, 2018
Coconditional Autoencoding Adversarial Networks for Chinese Font Feature Learning

Zhizhan Zheng, Feiyun Zhang

In this work, we propose a novel framework named Coconditional Autoencoding Adversarial Networks (CocoAAN) for Chinese font learning, which jointly learns a generation network and two encoding networks of different feature domains using an adversarial process. The encoding networks map the glyph images into style and content features, respectively, via the pairwise substitution optimization strategy, and the generation network maps these two kinds of features to glyph samples. Together with a discriminative network conditioned on the extracted features, our framework flexibly produces realistic-looking Chinese glyph images. Unlike previous models that rely on complex segmentation of Chinese components or strokes, our model can "parse" structures in an unsupervised way, through which the content feature representation of each character is captured. Experiments demonstrate that our framework generalizes well to unseen fonts and characters.

CV, Jun 12, 2018
Qiniu Submission to ActivityNet Challenge 2018

Xiaoteng Zhang, Yixin Bao, Feiyun Zhang et al.

In this paper, we introduce our submissions for the tasks of trimmed activity recognition (Kinetics) and trimmed event recognition (Moments in Time) for the ActivityNet Challenge 2018. In both tasks, non-local neural networks and temporal segment networks serve as our base models. Multi-modal cues such as RGB images, optical flow, and acoustic signals are also used in our method. We also propose new non-local-based models to further improve recognition accuracy. After ensembling the models, the final submissions achieve 83.5% top-1 accuracy and 96.8% top-5 accuracy on the Kinetics validation set, and 35.81% top-1 accuracy and 62.59% top-5 accuracy on the MIT validation set.
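The abstract reports top-1 and top-5 accuracy after ensembling but does not say how the models were combined. As a rough illustration only, the sketch below assumes the simplest scheme, averaging per-class probabilities across models, and then computes top-k accuracy. The function names and the toy scores are hypothetical and are not taken from the submission.

```python
def ensemble_average(model_probs):
    """Average per-class probabilities across models for every sample.

    model_probs[m][i][c] is model m's probability of class c on sample i.
    Assumption: plain unweighted averaging; the actual submission may
    have used weights or a different fusion rule.
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for p, y in zip(probs, labels):
        top = sorted(range(len(p)), key=lambda c: p[c], reverse=True)[:k]
        hits += y in top
    return hits / len(labels)


# Toy scores from two hypothetical models (say, RGB and optical flow)
# on three samples with three classes.
rgb = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]]
flow = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6], [0.2, 0.7, 0.1]]
labels = [0, 1, 1]

avg = ensemble_average([rgb, flow])
top1 = top_k_accuracy(avg, labels, k=1)
top2 = top_k_accuracy(avg, labels, k=2)
```

On this toy data the ensemble ranks the true class first for two of the three samples and within the top two for all three, which shows why top-5 figures are always at least as high as top-1 figures.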