RuJie Zhao

2 Papers

CV Jan 13, 2022
Automatic Sparse Connectivity Learning for Neural Networks

Zhimin Tang, Linkai Luo, Bike Xie et al.

Since sparse neural networks usually contain many zero weights, these unnecessary network connections can potentially be eliminated without degrading network performance. Therefore, well-designed sparse neural networks have the potential to significantly reduce FLOPs and computational resources. In this work, we propose a new automatic pruning method, Sparse Connectivity Learning (SCL). Specifically, a weight is re-parameterized as an element-wise multiplication of a trainable weight variable and a binary mask. Thus, network connectivity is fully described by the binary mask, which is modulated by a unit step function. We theoretically prove the fundamental principle of using a straight-through estimator (STE) for network pruning: the proxy gradients of the STE should be positive, ensuring that mask variables converge at their minima. After finding that the Leaky ReLU, Softplus, and Identity STEs satisfy this principle, we propose to adopt the Identity STE in SCL for discrete mask relaxation. We find that the mask gradients of different features are highly unbalanced; hence, we propose to normalize the mask gradients of each feature to optimize mask variable training. To train sparse masks automatically, we include the total number of network connections as a regularization term in our objective function. As SCL does not require pruning criteria or hyper-parameters defined by designers for network layers, the network is explored in a larger hypothesis space to achieve optimized sparse connectivity for the best performance. SCL overcomes the limitations of existing automatic pruning methods. Experimental results demonstrate that SCL can automatically learn and select important network connections for various baseline network structures. Deep learning models trained by SCL outperform the SOTA human-designed and automatic pruning methods in sparsity, accuracy, and FLOPs reduction.
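The re-parameterization described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single linear unit with a squared-error loss, and every name in it (`unit_step`, `effective_weights`, `forward`, `gradients`, `lam`) is hypothetical. It shows a weight multiplied by a step-function mask, the Identity STE replacing the step's derivative with 1 in the backward pass, and a connection-count penalty weighted by `lam`. The per-feature gradient normalization mentioned in the abstract is omitted.

```python
# Minimal sketch of masked-weight re-parameterization with an Identity STE.
# Assumptions (not from the paper): one linear unit, squared-error loss,
# and hypothetical names throughout.

def unit_step(m):
    """Binary mask value: 1.0 if the mask variable is positive, else 0.0."""
    return 1.0 if m > 0 else 0.0


def effective_weights(w, m):
    """Element-wise product of trainable weights and their binary masks."""
    return [wi * unit_step(mi) for wi, mi in zip(w, m)]


def forward(w, m, x):
    """Output of a single linear unit that uses the masked weights."""
    return sum(we * xi for we, xi in zip(effective_weights(w, m), x))


def gradients(w, m, x, y_true, lam):
    """Gradients of 0.5 * (y - y_true)**2 + lam * sum(unit_step(m)).

    The unit step has zero derivative almost everywhere, so the Identity STE
    treats d unit_step(m) / dm as 1 in the backward pass.
    """
    err = forward(w, m, x) - y_true
    # Weight gradient: only connections with an active mask receive signal.
    grad_w = [err * unit_step(mi) * xi for mi, xi in zip(m, x)]
    # Mask gradient: loss term through the STE, plus the connection-count penalty.
    grad_m = [err * wi * xi + lam for wi, xi in zip(w, x)]
    return grad_w, grad_m


# Toy values, chosen only for illustration.
w = [0.5, -2.0, 1.0]
m = [1.0, -0.5, 0.2]   # the second connection is currently pruned
x = [1.0, 1.0, 2.0]
grad_w, grad_m = gradients(w, m, x, y_true=2.0, lam=0.1)
```

Note that the pruned connection still receives a mask gradient in this sketch, so a mask variable can cross zero again and restore the connection during training.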

HC Jul 15, 2017
Automatic Identification of Non-Meaningful Body-Movements and What It Reveals About Humans

Md Iftekhar Tanveer, RuJie Zhao, Mohammed Hoque

We present a framework to identify whether a public speaker's body movements are meaningful or non-meaningful ("Mannerisms") in the context of their speeches. In a dataset of 84 public speaking videos from 28 individuals, we extract 314 unique body movement patterns (e.g., pacing, gesturing, and shifting body weight). Online workers and the speakers themselves annotated the meaningfulness of the patterns. We extracted five types of features from the audio-video recordings: disfluency, prosody, body movements, facial, and lexical. We use linear classifiers to predict the annotations with AUC up to 0.82. Analysis of the classifier weights reveals that the classifier puts larger weights on the lexical features when predicting self-annotations. In contrast, it puts larger weights on the prosody features when predicting audience annotations. This analysis might provide a subtle hint that public speakers tend to focus more on the verbal features while evaluating their own performances. The audience, on the other hand, tends to focus more on the non-verbal aspects of the speech. The dataset and code associated with this work have been released for peer review and further analysis.
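As a reading aid for the AUC figure quoted above, the following sketch shows how an AUC value can be computed from the scores of a linear classifier. It is an illustration under stated assumptions, not the authors' pipeline: the function names `linear_score` and `roc_auc` are hypothetical, and the scores and labels are toy values, not data from the paper.

```python
# Minimal sketch: AUC as the probability that a randomly chosen positive
# example scores higher than a randomly chosen negative one (ties count 0.5).
# Names and data are hypothetical, chosen only for illustration.

def linear_score(weights, bias, features):
    """Decision score of a linear classifier for one feature vector."""
    return bias + sum(w * f for w, f in zip(weights, features))


def roc_auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores and labels (1 = meaningful, 0 = non-meaningful).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a dataset of a few hundred patterns; a rank-based computation would give the same value more efficiently.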