AI | Dec 12, 2024

Structural Entropy Guided Probabilistic Coding

arXiv:2412.08841v2 | 4 citations | h-index: 6 | Has Code | AAAI
Originality: Highly original
AI Analysis

This paper addresses a limitation in probabilistic representation learning for natural language understanding tasks, offering improved performance over existing methods.

The paper tackles the problem that existing probabilistic embedding methods ignore structural relationships between latent variables, proposing a structural entropy-guided probabilistic coding model (SEPC) that incorporates structural information through a novel regularization loss and probabilistic encoding tree. Experimental results on 12 natural language understanding tasks show SEPC outperforms state-of-the-art models in effectiveness, generalization, and robustness to label noise.

Probabilistic embeddings have several advantages over deterministic embeddings as they map each data point to a distribution, which better describes the uncertainty and complexity of data. Many works focus on adjusting the distribution constraint under the Information Bottleneck (IB) principle to enhance representation learning. However, these proposed regularization terms only consider the constraint on each latent variable, omitting the structural information between latent variables. In this paper, we propose a novel structural entropy-guided probabilistic coding model, named SEPC. Specifically, we incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss. In addition, as traditional structural information theory is not well-suited for regression tasks, we propose a probabilistic encoding tree, converting regression tasks to classification tasks while diminishing the influence of the transformation. Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC compared to other state-of-the-art models in terms of effectiveness, generalization capability, and robustness to label noise. The code and datasets are available at https://github.com/SELGroup/SEPC.
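To make the notion of structural entropy more concrete, the sketch below computes the standard two-level (partition-based) structural entropy of a weighted undirected graph, as defined in structural information theory. It is not the SEPC regularization loss itself: the abstract does not specify how the graph over latent variables is built or how the loss is formed, so the function name `structural_entropy_2d`, the adjacency-matrix input, and the toy graph are all illustrative assumptions.

```python
import math


def structural_entropy_2d(adj, partition):
    """Two-level structural entropy of a weighted undirected graph.

    adj: symmetric adjacency matrix (list of lists), non-negative weights,
         zero diagonal. Hypothetical input format chosen for this sketch.
    partition: list with one cluster id per node.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    total_vol = sum(degree)  # equals twice the total edge weight
    if total_vol == 0:
        return 0.0

    # Group node indices by cluster id.
    clusters = {}
    for node, cid in enumerate(partition):
        clusters.setdefault(cid, []).append(node)

    entropy = 0.0
    for members in clusters.values():
        member_set = set(members)
        vol = sum(degree[i] for i in members)
        if vol == 0:
            continue
        # Weight of edges with exactly one endpoint inside the cluster.
        cut = sum(adj[i][j] for i in members
                  for j in range(n) if j not in member_set)
        # Cluster-level term: cost of locating the cluster from the root.
        entropy -= (cut / total_vol) * math.log2(vol / total_vol)
        # Node-level terms: cost of locating each node inside its cluster.
        for i in members:
            if degree[i] > 0:
                entropy -= (degree[i] / total_vol) * math.log2(degree[i] / vol)
    return entropy


# Toy graph (assumed for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

aligned = structural_entropy_2d(adj, [0, 0, 0, 1, 1, 1])
scrambled = structural_entropy_2d(adj, [0, 1, 0, 1, 0, 1])
print(aligned < scrambled)
```

On this toy graph, the partition that follows the two triangles yields a lower structural entropy than the scrambled one. This is the property that makes the quantity usable as a signal for how well a grouping reflects the graph's structure.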

Code Implementations: 1 repo