ML | Nov 24, 2023 | Code
FRUITS: Feature Extraction Using Iterated Sums for Time Series Classification
Joscha Diehl, Richard Krieg
We introduce a pipeline for time series classification that extracts features based on the iterated-sums signature (ISS) and then applies a linear classifier. These features are intrinsically nonlinear, capture chronological information, and, under certain settings, are invariant to time-warping. We are competitive with state-of-the-art methods on the UCR archive, both in terms of accuracy and speed. We make our code available at https://github.com/irkri/fruits.
LG | Aug 27, 2025 | Code
Global Permutation Entropy
Abhijeet Avhale, Joscha Diehl, Niraj Velankar et al.
Permutation Entropy, introduced by Bandt and Pompe, is a widely used complexity measure for real-valued time series that is based on the relative order of values within consecutive segments of fixed length. After standardizing each segment to a permutation and computing the frequency distribution of these permutations, Shannon Entropy is then applied to quantify the series' complexity. We introduce Global Permutation Entropy (GPE), a novel index that considers all possible patterns of a given length, including non-consecutive ones. Its computation relies on recently developed algorithms that enable the efficient extraction of full permutation profiles. We illustrate some properties of GPE and demonstrate its effectiveness through experiments on synthetic datasets, showing that it reveals structural information not accessible through standard permutation entropy. We provide a Julia package for the calculation of GPE at https://github.com/AThreeH1/Global-Permutation-Entropy.
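To make the difference between the two indices concrete, the following is a minimal Python sketch, not the authors' Julia package. It computes standard permutation entropy over consecutive windows and a brute-force variant over all index subsets of a given length. The function names are hypothetical, the brute-force enumeration costs on the order of "n choose k" and stands in for the efficient profile algorithms mentioned in the abstract, and no normalization is applied because the abstract does not specify one.

```python
from collections import Counter
from itertools import combinations
from math import log


def pattern(values):
    """Ordinal pattern of a segment: positions sorted by value (ties by position)."""
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))


def shannon(counts):
    """Shannon entropy (natural log) of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum((c / total) * log(c / total) for c in counts.values())


def permutation_entropy(x, k):
    """Entropy of ordinal patterns over consecutive windows of length k."""
    windows = (x[i:i + k] for i in range(len(x) - k + 1))
    return shannon(Counter(pattern(w) for w in windows))


def global_permutation_entropy_bruteforce(x, k):
    """Entropy of ordinal patterns over ALL increasing index tuples of length k.

    Assumption: plain enumeration, feasible only for short series and small k.
    """
    subsequences = (
        tuple(x[i] for i in idx) for idx in combinations(range(len(x)), k)
    )
    return shannon(Counter(pattern(s) for s in subsequences))
```

For a strictly increasing series both functions return zero, since only one pattern occurs. For other series the two values generally differ, because the brute-force variant also counts patterns formed by non-adjacent entries.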
CT | Apr 7, 2025
Aggregating time-series and image data: functors and double functors
Joscha Diehl
Aggregation of time-series or image data over subsets of the domain is a fundamental task in data science. We show that many known aggregation operations can be interpreted as (double) functors on appropriate (double) categories. Such functorial aggregations are amenable to parallel implementation via straightforward extensions of Blelloch's parallel scan algorithm. In addition to providing a unified viewpoint on existing operations, it allows us to propose new aggregation operations for time-series and image data.
RA | Dec 8, 2020
Generalized iterated-sums signatures
Joscha Diehl, Kurusch Ebrahimi-Fard, Nikolas Tapia
We explore the algebraic properties of a generalized version of the iterated-sums signature, inspired by previous work of F. Király and H. Oberhauser. In particular, we show how to recover the character property of the associated linear map over the tensor algebra by considering a deformed quasi-shuffle product of words on the latter. We introduce three non-linear transformations on iterated-sums signatures, close in spirit to Machine Learning applications, and show some of their properties.
RA | Sep 17, 2020
Tropical time series, iterated-sums signatures and quasisymmetric functions
Joscha Diehl, Kurusch Ebrahimi-Fard, Nikolas Tapia
Aiming for systematic feature extraction from time series, we introduce the iterated-sums signature over arbitrary commutative semirings. The tropical semiring is a central case and our motivating example. It leads to features of (real-valued) time series that are not easily available using existing signature-type objects. We demonstrate how the signature extracts chronological aspects of a time series, and that its calculation is possible in linear time. We identify quasisymmetric expressions over semirings as the appropriate framework for iterated-sums signatures over semiring-valued time series.
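The linear-time claim can be illustrated with a small dynamic-programming sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the sum runs over strictly increasing index tuples, the entries of the series are used directly, and the tropical semiring is taken in its min-plus form (semiring addition is `min`, semiring multiplication is `+`). One pass over the series updates one accumulator per letter of the word, so the cost is proportional to the series length times the word length.

```python
import math


def iterated_sum(x, exponents, add, mul, zero, one, power):
    """Semiring iterated sum over strictly increasing indices i_1 < ... < i_m.

    Computes  (+)_{i_1<...<i_m}  x[i_1]^(a_1) (*) ... (*) x[i_m]^(a_m)
    where (+), (*) and the power are the semiring operations passed in.
    """
    # dp[j] aggregates all index tuples of length j seen so far.
    dp = [one] + [zero] * len(exponents)
    for value in x:
        # Go from the last letter to the first so that `value` is used
        # at most once per tuple (strict inequalities).
        for j in range(len(exponents), 0, -1):
            dp[j] = add(dp[j], mul(dp[j - 1], power(value, exponents[j - 1])))
    return dp[-1]


def tropical_iterated_sum(x, exponents):
    """Min-plus case: min over i_1<...<i_m of a_1*x[i_1] + ... + a_m*x[i_m]."""
    return iterated_sum(
        x, exponents,
        add=min, mul=lambda u, v: u + v,
        zero=math.inf, one=0.0,
        power=lambda v, a: a * v,
    )


def classical_iterated_sum(x, exponents):
    """Usual (+, *) case: sum over i_1<...<i_m of x[i_1]^a_1 * ... * x[i_m]^a_m."""
    return iterated_sum(
        x, exponents,
        add=lambda u, v: u + v, mul=lambda u, v: u * v,
        zero=0.0, one=1.0,
        power=lambda v, a: v ** a,
    )
```

With exponents `(1, 1)`, the tropical version returns the smallest sum of two entries taken in chronological order, while the classical version returns the sum of all pairwise products of entries at distinct positions. Only the semiring operations change between the two.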
RA | Jun 13, 2019
Time-warping invariants of multidimensional time series
Joscha Diehl, Kurusch Ebrahimi-Fard, Nikolas Tapia
In data science, one is often confronted with a time series representing measurements of some quantity of interest. Usually, as a first step, features of the time series need to be extracted. These are numerical quantities that aim to succinctly describe the data and to dampen the influence of noise. In some applications, these features are also required to satisfy some invariance properties. In this paper, we concentrate on time-warping invariants. We show that these correspond to a certain family of iterated sums of the increments of the time series, known as quasisymmetric functions in the mathematics literature. We present these invariant features in an algebraic framework, and we develop some of their basic properties.
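The invariance can be checked numerically in the one-dimensional case. The sketch below assumes that time-warping means repeating observations (so that zero increments are inserted) and that the features are sums over strictly increasing indices of products of powers of increments, with positive integer exponents. The function names are hypothetical and NumPy is used for brevity.

```python
import numpy as np


def iterated_sum_of_increments(x, exponents):
    """Sum over i_1 < ... < i_m of d[i_1]**a_1 * ... * d[i_m]**a_m,
    where d are the increments of the one-dimensional series x."""
    d = np.diff(np.asarray(x, dtype=float))
    # acc[j] holds the partial iterated sum using only increments d[0..j-1].
    acc = np.ones(len(d) + 1)
    for a in exponents:
        terms = acc[:-1] * d ** a
        # Shift by one so that the next letter only sees strictly earlier indices.
        acc = np.concatenate(([0.0], np.cumsum(terms)))
    return acc[-1]


def warp(x, repeats):
    """Time-warp by repeating the i-th observation repeats[i] times (each >= 1)."""
    return np.repeat(np.asarray(x, dtype=float), repeats)


if __name__ == "__main__":
    series = [0.0, 1.0, 3.0, 2.0]
    warped = warp(series, [2, 1, 3, 1])
    for word in [(1,), (1, 1), (2, 1)]:
        print(word,
              iterated_sum_of_increments(series, word),
              iterated_sum_of_increments(warped, word))
```

Repeating a value adds an increment equal to zero, and every term containing a zero increment raised to a positive power vanishes, so the two printed values agree for each word.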
CV | Jan 18, 2018
Invariants of multidimensional time series based on their iterated-integral signature
Joscha Diehl, Jeremy Reizenstein
We introduce a novel class of features for multidimensional time series that are invariant with respect to transformations of the ambient space. The general linear group, the group of rotations and the group of permutations of the axes are considered. The starting point for their construction is Chen's iterated-integral signature.
CV | May 29, 2013
Rotation invariants of two dimensional curves based on iterated integrals
Joscha Diehl
We introduce a novel class of rotation invariants of two-dimensional curves based on iterated integrals. The invariants we present are in some sense complete and we describe an algorithm to calculate them, giving explicit computations up to order six. We present an application to online (stroke-trajectory based) character recognition. This seems to be the first time in the literature that the use of iterated integrals of a curve is proposed for (invariant) feature extraction in machine learning applications.
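As a small illustration of the idea, the sketch below computes the second-order iterated integrals of a piecewise-linear planar curve and the signed area obtained from their antisymmetric part, which is a well-known rotation invariant at order two. It is only the simplest example of this kind and not the algorithm or the full set of invariants from the paper. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def level_two_signature(points):
    """Second-order iterated integrals S[a, b] of the piecewise-linear curve
    through `points` (an array of shape (n, 2))."""
    delta = np.diff(np.asarray(points, dtype=float), axis=0)
    # Sum of the increments strictly before each segment.
    before = np.cumsum(delta, axis=0) - delta
    # Cross terms between earlier and later segments, plus the contribution
    # of each straight segment with itself.
    return before.T @ delta + 0.5 * (delta.T @ delta)


def signed_area(points):
    """Antisymmetric part of the second level: 0.5 * (S[0, 1] - S[1, 0])."""
    s = level_two_signature(points)
    return 0.5 * (s[0, 1] - s[1, 0])


def rotate(points, angle):
    """Rotate all points counter-clockwise by `angle` (radians) about the origin."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rotation.T


if __name__ == "__main__":
    curve = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.5, 2.0]]
    print(signed_area(curve), signed_area(rotate(curve, 0.7)))
```

The two printed values agree up to floating-point error, because the signed area is built from cross products of increments and a rotation has determinant one.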