Tieran Zheng

SD
5 papers
31 citations
Novelty 41%
AI Score 25

5 Papers

NA Sep 7, 2011
Trace Norm Regularized Tensor Classification and Its Online Learning Approaches

Ziqiang Shi, Tieran Zheng, Jiqing Han

In this paper we propose an algorithm to classify tensor data. Our methodology builds on recent studies of matrix classification with a trace norm constrained weight matrix and of the tensor trace norm. As in matrix classification, tensor classification is formulated as a convex optimization problem that can be solved with the off-the-shelf accelerated proximal gradient (APG) method. However, unlike the matrix case, the proximal gradient update of the weight tensors has no analytic solution. To tackle this problem, the Douglas-Rachford splitting technique and the alternating direction method of multipliers (ADM) used in tensor completion are adapted to update the weight tensors. Furthermore, motivated by the demands of real applications, we also propose online learning approaches. Experiments demonstrate the efficiency of the methods.
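For context, the analytic solution available in the matrix case is the standard singular value thresholding step, which is the proximal operator of the matrix trace norm. The sketch below illustrates only that matrix step; the function name `prox_trace_norm` and the toy matrix are assumptions for illustration, and it is not the tensor update described in the abstract.

```python
import numpy as np

def prox_trace_norm(W, tau):
    """Proximal operator of tau * ||W||_* via singular value thresholding."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Soft-threshold the singular values; values below tau become zero.
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# Toy weight matrix with singular values 3.0, 1.0 and 0.2 (hypothetical data).
W = np.diag([3.0, 1.0, 0.2])
W_new = prox_trace_norm(W, tau=0.5)
# Singular values of W_new are 2.5, 0.5 and 0.0, so the rank drops to 2.
```

Because each singular value is shrunk independently, the matrix update has a closed form; the tensor trace norm couples several unfoldings, which is why a splitting method is needed there.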

CL Sep 19, 2024
Mutual Information-based Representations Disentanglement for Unaligned Multimodal Language Sequences

Fan Qian, Jiqing Han, Jianchen Li et al.

The key challenge in unaligned multimodal language sequences lies in effectively integrating information from various modalities to obtain a refined multimodal joint representation. Recently, disentangle-and-fuse methods have achieved promising performance by explicitly learning modality-agnostic and modality-specific representations and then fusing them into a multimodal joint representation. However, these methods often learn modality-agnostic representations independently for each modality and use orthogonal constraints to reduce linear correlations between modality-agnostic and modality-specific representations, neglecting to eliminate their nonlinear correlations. As a result, the obtained multimodal joint representation usually suffers from information redundancy, leading to overfitting and poor generalization of the models. In this paper, we propose a Mutual Information-based Representations Disentanglement (MIRD) method for unaligned multimodal language sequences, in which a novel disentanglement framework is designed to jointly learn a single modality-agnostic representation. In addition, a mutual information minimization constraint is employed to ensure superior disentanglement of representations, thereby eliminating information redundancy within the multimodal joint representation. Furthermore, the difficulty of estimating mutual information from limited labeled data is mitigated by introducing unlabeled data. The unlabeled data also help to characterize the underlying structure of multimodal data, further preventing overfitting and enhancing the performance of the models. Experimental results on several widely used benchmark datasets validate the effectiveness of our proposed approach.
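The gap between linear and nonlinear correlation mentioned above can be seen on a toy pair of variables. The sketch below is illustrative only: the histogram-based estimator `histogram_mi`, the bin count, and the data are assumptions, and it is not the mutual information estimator used by MIRD.

```python
import numpy as np

def histogram_mi(x, y, bins=8):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = pxy > 0                        # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

# y is a deterministic, nonlinear function of x (hypothetical toy data).
x = np.linspace(-1.0, 1.0, 2001)
y = x ** 2

linear_corr = np.corrcoef(x, y)[0, 1]   # essentially zero: no linear relation
mi = histogram_mi(x, y)                 # clearly positive: strong dependence
```

A constraint that only drives `linear_corr` toward zero would consider these two variables disentangled, while a mutual information criterion would not.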

SD Apr 17, 2019
A Multi-Task Learning Framework for Overcoming the Catastrophic Forgetting in Automatic Speech Recognition

Jiabin Xue, Jiqing Han, Tieran Zheng et al.

Recently, data-driven Automatic Speech Recognition (ASR) systems have achieved state-of-the-art results. Transfer learning, e.g., fine-tuning or retraining, is often used to adapt these existing systems to a target domain. However, in the process, the system parameters may deviate too much from the previously learned parameters. It is therefore difficult for training to learn knowledge from the target domain without forgetting knowledge from the previous learning process, a problem known as catastrophic forgetting (CF). In this paper, we attempt to solve the CF problem with lifelong learning and propose a novel multi-task learning (MTL) training framework for ASR. It treats retaining original knowledge and learning new knowledge as two independent tasks. On the one hand, we constrain the new parameters not to deviate too far from the original parameters and penalize the new system when it forgets original knowledge. On the other hand, we force the new system to learn new knowledge quickly. Then, an MTL mechanism is employed to balance the two tasks. We applied our method to an End2End ASR task and obtained the best performance on both the target and original datasets.
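One common way to express "do not deviate too far from the original parameters" is a squared-distance penalty added to the loss on the original data, balanced against the new-task loss by a weight. The sketch below is a minimal illustration under that assumption; the function names, the penalty form, and the fixed `weight` are hypothetical and are not taken from the paper.

```python
import numpy as np

def retention_loss(theta_new, theta_old, old_task_loss, lam=0.1):
    """Loss on original data plus a penalty on parameter drift (assumed L2 form)."""
    drift = np.sum((theta_new - theta_old) ** 2)
    return old_task_loss + lam * drift

def combined_loss(new_task_loss, retention, weight=0.5):
    """Weighted balance between learning new knowledge and retaining old knowledge."""
    return weight * new_task_loss + (1.0 - weight) * retention

# Hypothetical parameter vectors and loss values.
theta_old = np.array([1.0, -2.0, 0.5])
theta_new = np.array([1.5, -2.0, 0.0])

ret = retention_loss(theta_new, theta_old, old_task_loss=0.4, lam=0.1)
total = combined_loss(new_task_loss=1.2, retention=ret, weight=0.5)
# drift = 0.5, so ret = 0.45 and total = 0.825
```

Raising `lam` or lowering `weight` favors retention of the original system; the abstract's MTL mechanism plays the role of choosing that balance.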

SD Apr 17, 2019
Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition

Jiabin Xue, Jiqing Han, Tieran Zheng et al.

Retraining with more and more new training data from the target domain is an effective way to improve the performance of existing Automatic Speech Recognition (ASR) systems. Recently, the Deep Neural Network (DNN) has become a successful model in the ASR field. In the training of DNN-based methods, the error between the transcription and the corresponding annotated text is back-propagated to update and optimize the parameters. Thus, the parameters are influenced more by training samples with a large propagation error than by samples with a small one. In this paper, we define the samples with significant error as hard samples and try to improve the performance of the ASR system by adding many of them. Unfortunately, hard samples are sparse in the training data of the target domain, and labeling them manually is expensive. Therefore, we propose a hard sample mining method based on enhanced deep multiple instance learning, which can find hard samples in unlabeled training data by using a small, manually labeled subset of the target-domain dataset. We applied our method to an End2End ASR task and obtained the best performance.
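The definition of a hard sample as one with a large error can be illustrated by ranking samples by their per-sample error. This sketch assumes the errors are already known, which requires labels; it only illustrates the definition and is not the multiple-instance-learning mining method of the paper, which works on unlabeled data. The function name and the error values are hypothetical.

```python
import numpy as np

def select_hard_samples(per_sample_error, k):
    """Return indices of the k samples with the largest error, largest first."""
    order = np.argsort(per_sample_error)[::-1]
    return order[:k]

# Hypothetical per-utterance errors for five labeled samples.
errors = np.array([0.1, 2.3, 0.4, 1.7, 0.05])
hard_idx = select_hard_samples(errors, k=2)
# hard_idx holds indices 1 and 3, the two largest errors.
```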

IT Jul 23, 2012
Guarantees of Augmented Trace Norm Models in Tensor Recovery

Ziqiang Shi, Jiqing Han, Tieran Zheng et al.

This paper studies the recovery guarantees of models that minimize $\|\mathcal{X}\|_*+\frac{1}{2\alpha}\|\mathcal{X}\|_F^2$, where $\mathcal{X}$ is a tensor and $\|\mathcal{X}\|_*$ and $\|\mathcal{X}\|_F$ are the trace norm and Frobenius norm of $\mathcal{X}$, respectively. We show that they can efficiently recover low-rank tensors. In particular, they enjoy exact guarantees similar to those known for minimizing $\|\mathcal{X}\|_*$ under conditions on the sensing operator such as its null-space property, restricted isometry property, or spherical section property. To recover a low-rank tensor $\mathcal{X}^0$, minimizing $\|\mathcal{X}\|_*+\frac{1}{2\alpha}\|\mathcal{X}\|_F^2$ returns the same solution as minimizing $\|\mathcal{X}\|_*$ almost whenever $\alpha\geq 10\max_{i}\|X^0_{(i)}\|_2$.
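The threshold on $\alpha$ can be computed directly for a given tensor. The sketch below assumes that $X^0_{(i)}$ denotes the mode-$i$ unfolding and that $\|\cdot\|_2$ is the spectral norm (largest singular value); the helper names `unfold` and `alpha_threshold` and the toy tensor are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: move that axis to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def alpha_threshold(tensor):
    """Compute 10 * max_i ||X_(i)||_2 over all mode unfoldings."""
    return 10.0 * max(
        np.linalg.norm(unfold(tensor, i), 2) for i in range(tensor.ndim)
    )

# Rank-1 toy tensor with a single nonzero entry of value 6 (hypothetical data).
a = np.array([1.0, 0.0])
b = np.array([0.0, 2.0, 0.0])
c = np.array([3.0, 0.0])
X0 = np.einsum("i,j,k->ijk", a, b, c)

alpha_min = alpha_threshold(X0)   # 60.0, since every unfolding has spectral norm 6
```

Unfolding conventions differ in how they order the columns, but the spectral norm is unchanged by column permutations, so the threshold does not depend on that choice.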