LG · Sep 21, 2023

Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling

arXiv:2309.11983v3 · 5 citations · h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses the problem of handling data variability in sequence modeling tasks like speech recognition, offering an incremental improvement by extending CTC to variational frameworks.

Connectionist Temporal Classification (CTC) has so far been applied only to deterministic sequence models. The paper integrates CTC with a variational model and derives two novel loss functions for training more generalizable order-preserving sequence models: one assumes the per-time-step latent variables are conditionally independent, the other assumes they are Markovian.

Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences. However, CTC is only applied to deterministic sequence models, where the latent space is discontinuous and sparse, which in turn makes them less capable of handling data variability when compared to variational models. In this paper, we integrate CTC with a variational model and derive loss functions that can be used to train more generalizable sequence models that preserve order. Specifically, we derive two versions of the novel variational CTC based on two reasonable assumptions, the first being that the variational latent variables at each time step are conditionally independent; and the second being that these latent variables are Markovian. We show that both loss functions allow direct optimization of the variational lower bound for the model log-likelihood, and present computationally tractable forms for implementing them.
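To make the conditionally independent case concrete, the following is a minimal sketch of a single-sample estimate of a variational lower bound that uses the CTC log-likelihood as its reconstruction term. It is not the paper's derived loss. The function names, the diagonal Gaussian posterior, and the standard normal prior at each time step are all assumptions made for illustration, and `log_probs` is assumed to come from a hypothetical decoder (not shown) applied to one sample of the latent variables.

```python
import math

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_log_likelihood(log_probs, target, blank=0):
    """CTC forward algorithm: log p(target | frames).

    log_probs[t][k] is the log-probability of symbol k at frame t.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank].
    ext = [blank]
    for symbol in target:
        ext += [symbol, blank]
    size = len(ext)

    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new_alpha = [NEG_INF] * size
        for s in range(size):
            terms = [alpha[s]]  # stay on the same extended symbol
            if s >= 1:
                terms.append(alpha[s - 1])  # advance by one position
            # Skip over a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new_alpha[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new_alpha

    if size == 1:
        return alpha[0]
    return logsumexp(alpha[-1], alpha[-2])


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one time step."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def elbo_estimate(log_probs, target, mus, logvars, blank=0):
    """Single-sample lower-bound estimate (illustrative assumption):
    CTC log-likelihood minus the per-time-step KL terms, summed over time.
    """
    kl_total = sum(gaussian_kl(m, lv) for m, lv in zip(mus, logvars))
    return ctc_log_likelihood(log_probs, target, blank) - kl_total
```

For two frames with uniform probabilities over {blank, a} and target "a", three of the four alignments are valid, so `ctc_log_likelihood` returns log(0.75). When the posterior equals the prior, the KL terms vanish and `elbo_estimate` returns the same value. The Markovian case would replace the fixed prior with one conditioned on the previous latent variable, which this sketch does not cover.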

