CV · Nov 5, 2023

CycleCL: Self-supervised Learning for Periodic Videos

arXiv:2311.03402v2 · 5 citations · h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses the analysis of periodic video sequences for applications such as automatic production systems and medical monitoring, offering a domain-specific solution.

The paper tackles self-supervised learning for periodic videos, where existing methods neither capture cycle progression nor ignore unrelated noise. It proposes CycleCL, a contrastive learning method based on a triplet loss that optimizes for phase sensitivity and repetition invariance. The authors report that it significantly outperforms previous video-based self-supervised methods on an industrial dataset and multiple human action datasets.

Analyzing periodic video sequences is a key topic in applications such as automatic production systems, remote sensing, medical applications, or physical training. An example is counting repetitions of a physical exercise. Due to the distinct characteristics of periodic data, self-supervised methods designed for standard image datasets do not capture changes relevant to the progression of the cycle and fail to ignore unrelated noise. They thus do not work well on periodic data. In this paper, we propose CycleCL, a self-supervised learning method specifically designed to work with periodic data. We start from the insight that a good visual representation for periodic data should be sensitive to the phase of a cycle, but be invariant to the exact repetition, i.e. it should generate identical representations for a specific phase throughout all repetitions. We exploit the repetitions in videos to design a novel contrastive learning method based on a triplet loss that optimizes for these desired properties. Our method uses pre-trained features to sample pairs of frames from approximately the same phase and negative pairs of frames from different phases. Then, we iterate between optimizing a feature encoder and resampling triplets, until convergence. By optimizing a model this way, we are able to learn features that have the mentioned desired properties. We evaluate CycleCL on an industrial and multiple human actions datasets, where it significantly outperforms previous video-based self-supervised learning methods on all tasks.
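The sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`toy_features`, `sample_triplet`, `triplet_loss`), the `min_gap` temporal exclusion window, the nearest/farthest-neighbour selection rule, and the margin value are all assumptions made for illustration. The toy features place each frame on a unit circle according to its phase, standing in for the pre-trained features the abstract mentions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet margin loss: pull the positive closer than the negative."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def toy_features(num_frames, period):
    """Hypothetical stand-in for pre-trained features: one point on the unit circle per frame."""
    return [
        (math.cos(2 * math.pi * t / period), math.sin(2 * math.pi * t / period))
        for t in range(num_frames)
    ]


def sample_triplet(features, anchor_idx, min_gap):
    """Assumed sampling rule (not taken from the paper):
    positive = most similar frame at least `min_gap` frames away in time,
               so it likely comes from the same phase of another repetition;
    negative = least similar frame, so it likely comes from a different phase.
    """
    anchor = features[anchor_idx]
    far_in_time = [i for i in range(len(features)) if abs(i - anchor_idx) >= min_gap]
    pos = min(far_in_time, key=lambda i: euclidean(anchor, features[i]))
    neg = max(range(len(features)), key=lambda i: euclidean(anchor, features[i]))
    return pos, neg


# Four repetitions of a 10-frame cycle.
features = toy_features(num_frames=40, period=10)
pos, neg = sample_triplet(features, anchor_idx=3, min_gap=5)
loss = triplet_loss(features[3], features[pos], features[neg])
print(pos % 10, neg % 10, loss)
```

On this toy input the positive lands on the same phase as the anchor in a later repetition and the negative on the opposite phase, so the loss is already zero. In the method described above, the encoder would be trained on such triplets and the triplets resampled from the updated features, repeating until convergence; that training loop is omitted here.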

