LG | ML | Sep 24, 2020

Semi-supervised sequence classification through change point detection

arXiv:2009.11829v2 | 8 citations
Originality: Incremental advance
AI Analysis

This paper addresses the cost of labeling sequential data in applications such as human activity recognition. It offers an incremental improvement over existing semi-supervised methods.

The paper tackles learning classifiers for sequential sensor data when labeled data is limited. It proposes a semi-supervised framework that uses change point detection to identify likely class changes and to generate similar/dissimilar sequence pairs. The authors report improved representations and improved performance on synthetic and real-world activity recognition datasets.

Sequential sensor data is generated in a wide variety of practical applications. A fundamental challenge involves learning effective classifiers for such sequential data. While deep learning has led to impressive performance gains in recent years in domains such as speech, this has relied on the availability of large datasets of sequences with high-quality labels. In many applications, however, the associated class labels are extremely limited, because precise labelling/segmentation is too expensive to perform at a high volume. Large amounts of unlabeled data may still be available. In this paper we propose a novel framework for semi-supervised learning in such contexts. In an unsupervised manner, change point detection methods can be used to identify points within a sequence corresponding to likely class changes. We show that change points provide examples of similar/dissimilar pairs of sequences which, when coupled with labeled data, can be used in a semi-supervised classification setting. Leveraging the change points and labeled data, we form examples of similar/dissimilar sequences to train a neural network to learn improved representations for classification. We provide extensive synthetic simulations and show that the learned representations are superior to those learned through an autoencoder and obtain improved results on both simulated and real-world human activity recognition datasets.
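
The pairing idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the detector is a simple sliding-window mean-difference score standing in for a real change point detection method, the function names (`detect_change_points`, `make_pairs`, `contrastive_loss`) are hypothetical, and the loss is a standard margin-based contrastive loss that may differ from the objective the authors use.

```python
import numpy as np


def detect_change_points(x, window=20, threshold=1.0):
    """Toy detector (assumption): score each index by the absolute difference
    between the means of the windows before and after it, then keep local
    maxima of that score that exceed the threshold."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    scores = np.zeros(n)
    for t in range(window, n - window + 1):
        scores[t] = abs(x[t - window:t].mean() - x[t:t + window].mean())
    change_points = []
    for t in range(window, n - window + 1):
        lo, hi = max(0, t - window), min(n, t + window + 1)
        if scores[t] > threshold and scores[t] == scores[lo:hi].max():
            change_points.append(t)
    return change_points


def make_pairs(x, change_points, window=20):
    """Return (window_a, window_b, label) triples; label 1 = similar, 0 = dissimilar."""
    x = np.asarray(x, dtype=float)
    bounds = [0] + sorted(change_points) + [len(x)]
    pairs = []
    # Dissimilar: the windows just before and just after a change point,
    # used only when neither window crosses a neighbouring boundary.
    for i in range(1, len(bounds) - 1):
        prev, cp, nxt = bounds[i - 1], bounds[i], bounds[i + 1]
        if cp - prev >= window and nxt - cp >= window:
            pairs.append((x[cp - window:cp], x[cp:cp + window], 0))
    # Similar: two consecutive windows inside one segment between change points.
    for start, end in zip(bounds[:-1], bounds[1:]):
        if end - start >= 2 * window:
            a = x[start:start + window]
            b = x[start + window:start + 2 * window]
            pairs.append((a, b, 1))
    return pairs


def contrastive_loss(za, zb, label, margin=1.0):
    """Margin-based contrastive loss on two embeddings (assumed objective)."""
    d = float(np.linalg.norm(np.asarray(za) - np.asarray(zb)))
    return label * d ** 2 + (1 - label) * max(0.0, margin - d) ** 2


# Synthetic signal with class changes at indices 100 and 200.
rng = np.random.default_rng(0)
signal = np.concatenate([np.zeros(100), 5.0 * np.ones(100), np.zeros(100)])
signal = signal + 0.1 * rng.standard_normal(signal.size)

cps = detect_change_points(signal)
pairs = make_pairs(signal, cps)
```

In a full pipeline, each window in a pair would pass through a shared encoder network and the loss would be applied to the resulting embeddings, alongside a supervised loss on the labeled sequences. The sketch stops at pair construction and does not train a network.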

