Learning Penalty for Optimal Partitioning via Automatic Feature Extraction
This work addresses a key problem in changepoint detection for domains like genomics: choosing the penalty parameter. It offers an automated solution that generally improves accuracy over methods based on manual feature extraction.
The study addresses the challenge of choosing the penalty parameter for changepoint detection with Optimal Partitioning algorithms. It proposes using recurrent networks to learn the penalty directly from raw sequences through automatic feature extraction. On 20 benchmark genomic datasets, this approach generally achieves better accuracy than traditional methods.
Changepoint detection identifies significant shifts in data sequences, making it important in areas like finance, genetics, and healthcare. Optimal Partitioning algorithms detect these changes efficiently, using a penalty parameter to limit the number of changepoints. Determining the optimal value for this penalty can be challenging. Traditionally, the penalty has been predicted from manually extracted statistical features, such as sequence length or variance. This study proposes a novel approach that uses recurrent neural networks to learn the penalty directly from raw sequences by automatically extracting features. Experiments conducted on 20 benchmark genomic datasets show that this method generally outperforms traditional ones in changepoint detection accuracy.
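To make the role of the penalty concrete, the following is a minimal sketch of Optimal Partitioning, not the implementation used in the study. It assumes a squared-error segment cost (change in mean) and a plain quadratic-time dynamic program. The function name `optimal_partitioning` and its arguments are hypothetical names chosen for illustration.

```python
import numpy as np


def optimal_partitioning(y, penalty):
    """Return changepoint positions minimizing
    (sum of per-segment squared errors) + penalty * (number of changepoints).

    Assumption: squared-error cost around each segment mean.
    A changepoint at position s means a new segment starts at index s.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Cumulative sums give each segment cost in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))

    def seg_cost(i, j):
        # Squared error of y[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[t] = optimal penalized cost of y[:t]; best[0] = -penalty
    # so that the first segment is not charged a penalty.
    best = np.full(n + 1, np.inf)
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + seg_cost(s, t) + penalty
            if cost < best[t]:
                best[t] = cost
                last[t] = s

    # Backtrack through the stored segment starts.
    changepoints = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            changepoints.append(int(s))
        t = s
    return sorted(changepoints)


signal = [0.0] * 10 + [5.0] * 10
print(optimal_partitioning(signal, penalty=1.0))     # [10]
print(optimal_partitioning(signal, penalty=1000.0))  # []
```

In this toy signal a small penalty recovers the single mean shift at index 10, while a very large penalty suppresses it entirely. This sensitivity is why the penalty has to be chosen per sequence, whether from hand-crafted features or, as proposed here, from the raw sequence itself.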