CL AI May 10, 2024

HC$^2$L: Hybrid and Cooperative Contrastive Learning for Cross-lingual Spoken Language Understanding

arXiv:2405.06204v1
1 citation
h-index: 8
IEEE Trans Pattern Anal Mach Intell
Originality: Incremental advance
AI Analysis

This work addresses cross-lingual spoken language understanding for multilingual applications. It is an incremental improvement that enhances existing contrastive learning methods with label-aware semantics.

The paper tackles zero-shot cross-lingual spoken language understanding by proposing a hybrid and cooperative contrastive learning approach that integrates unsupervised and supervised contrastive learning mechanisms. It achieves new state-of-the-art performance, with consistent improvements over 9 languages.

The state-of-the-art model for zero-shot cross-lingual spoken language understanding performs cross-lingual unsupervised contrastive learning to achieve label-agnostic semantic alignment between each utterance and its code-switched data. However, it ignores the valuable intent/slot labels, whose information can help capture the label-aware semantic structure and can then be leveraged through supervised contrastive learning to improve the semantics of both source and target languages. In this paper, we propose Hybrid and Cooperative Contrastive Learning to address this problem. Apart from cross-lingual unsupervised contrastive learning, we design a holistic approach that exploits source language supervised contrastive learning, cross-lingual supervised contrastive learning and multilingual supervised contrastive learning to perform label-aware semantic alignment in a comprehensive manner. Each kind of supervised contrastive learning mechanism includes both single-task and joint-task scenarios. In our model, the input of each contrastive learning mechanism is enhanced by the others. Thus, the four contrastive learning mechanisms cooperate to learn more consistent and discriminative representations in a virtuous cycle during training. Experiments show that our model obtains consistent improvements over 9 languages, achieving new state-of-the-art performance.
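
To make the idea of label-aware alignment concrete, the following is a minimal sketch of a generic supervised contrastive loss, in which utterances sharing a label are pulled together and all others are pushed apart. It is not the paper's objective: the abstract does not give the loss formulas, so the function name `supervised_contrastive_loss`, the `temperature` default, and the toy embeddings are all assumptions made for illustration, and the single-task/joint-task and cross-lingual variants are not modelled here.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's exact loss).

    embeddings: (n, d) array of utterance representations, n >= 2.
    labels:     (n,) array of class ids (e.g. hypothetical intent ids).
    """
    z = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)
    n = z.shape[0]

    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    # Exclude each anchor's similarity with itself from the softmax.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(
        np.exp(sim - row_max).sum(axis=1, keepdims=True)
    )

    # Positives are other samples that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive in this batch

    # Average log-probability of the positives, per anchor with positives.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())


# Toy batch: two tight clusters of 2-d embeddings.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
aligned_loss = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
misaligned_loss = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch, `aligned_loss` is lower than `misaligned_loss` because the first labelling agrees with the geometric clusters, which is the behaviour a label-aware objective is meant to reward. An unsupervised variant of the kind the abstract attributes to prior work would instead treat an utterance and its code-switched counterpart as the only positive pair.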
