LG CV
Jun 26, 2016

Training LDCRF model on unsegmented sequences using Connectionist Temporal Classification

Amir Ahooye Atashin, Kamaledin Ghiasi-Shirazi, Ahad Harati

arXiv:1606.08051v3

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck for researchers working on sequence labeling tasks such as gesture recognition, though the advance is incremental because it combines existing methods.

The paper tackles the problem of training latent-dynamic conditional random fields (LDCRFs) on unsegmented sequences by using connectionist temporal classification (CTC); LDCRF training was previously limited to pre-segmented data. On two gesture recognition tasks, the proposed method outperformed LDCRFs, hidden Markov models, and conditional random fields.

Many machine learning problems, such as speech recognition, gesture recognition, and handwriting recognition, are concerned with simultaneous segmentation and labeling of sequence data. The latent-dynamic conditional random field (LDCRF) is a well-known discriminative method that has been successfully used for this task. However, LDCRF can only be trained with pre-segmented data sequences in which the label of each frame is available a priori. In the realm of neural networks, the invention of connectionist temporal classification (CTC) made it possible to train recurrent neural networks on unsegmented sequences with great success. In this paper, we use CTC to train an LDCRF model on unsegmented sequences. Experimental results on two gesture recognition tasks show that the proposed method outperforms LDCRFs, hidden Markov models, and conditional random fields.
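
To make the idea of training on unsegmented sequences concrete, the following is a minimal sketch of the generic CTC forward recursion, not the LDCRF-specific formulation of the paper. It assumes per-frame symbol probabilities are already given (in the paper's setting these would come from the model) and that index 0 is the blank symbol. The function names `ctc_collapse`, `ctc_likelihood`, and `brute_force_likelihood` are illustrative and do not come from the paper. The recursion sums the probability of every frame-level path that collapses to the target label sequence, so no per-frame labels are needed.

```python
import itertools


def ctc_collapse(path, blank=0):
    """Map a frame-level path to a label sequence: merge repeats, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return out


def ctc_likelihood(probs, target, blank=0):
    """Probability of `target` summed over all alignments (CTC forward recursion).

    probs[t][k] is the assumed probability of symbol k at frame t.
    target is the label sequence without blanks.
    """
    # Extended target: a blank before, between, and after the labels.
    ext = [blank]
    for s in target:
        ext += [s, blank]
    T, S = len(probs), len(ext)

    # A valid path starts in the leading blank or in the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]              # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]     # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid path ends in the last label or in the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


def brute_force_likelihood(probs, target, blank=0):
    """Reference implementation: enumerate every path (small inputs only)."""
    num_symbols = len(probs[0])
    total = 0.0
    for path in itertools.product(range(num_symbols), repeat=len(probs)):
        if ctc_collapse(path, blank) == list(target):
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total
```

In training, the negative logarithm of this quantity would serve as the loss for a sequence whose label order is known but whose frame boundaries are not. The brute-force version is included only as a check that the recursion sums over exactly the intended set of alignments.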
