Understanding and Compressing Music with Maximal Transformable Patterns
This work addresses music analysis and compression for researchers in computational musicology. It is incremental in that it builds on existing pattern discovery and compression techniques.
The authors address the problem of compressing and understanding music by developing a polynomial-time algorithm that discovers maximal patterns in point sets under a user-specified class of transformations, together with a second algorithm that uses these patterns for lossless compression. They evaluated the compression algorithm on the task of classifying folk-song melodies into tune families and found that broader transformation classes improved classification performance but did not, on average, improve compression factor. The compression factors themselves are not explicitly quantified.
We present a polynomial-time algorithm that discovers all maximal patterns in a point set, $D\subset\mathbb{R}^k$, that are related by transformations in a user-specified class, $F$, of bijections over $\mathbb{R}^k$. We also present a second algorithm that discovers the set of occurrences for each of these maximal patterns and then uses compact encodings of these occurrence sets to compute a losslessly compressed encoding of the input point set. This encoding takes the form of a set of pairs, $E=\left\lbrace\left\langle P_1, T_1\right\rangle,\left\langle P_2, T_2\right\rangle,\ldots,\left\langle P_{\ell}, T_{\ell}\right\rangle\right\rbrace$, where each $\langle P_i,T_i\rangle$ consists of a maximal pattern, $P_i\subseteq D$, and a set, $T_i\subset F$, of transformations that map $P_i$ onto other subsets of $D$. Each transformation is encoded by a vector of real values that uniquely identifies it within $F$, and the length of this vector is used as a measure of the complexity of $F$. We evaluate the new compression algorithm with three transformation classes of differing complexity, on the task of classifying folk-song melodies into tune families. The most complex of the classes tested includes all combinations of the musical transformations of transposition, inversion, retrograde, augmentation and diminution. We found that broadening the transformation class improved performance on this task. However, it did not, on average, improve compression factor, which may be due to the datasets (in this case, folk-song melodies) being too short and simple to benefit from the potentially greater number of pattern relationships that are discoverable with larger transformation classes.
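To make the idea of maximal patterns and the pair encoding $\langle P_i, T_i\rangle$ more concrete, the following is a minimal sketch restricted to the simplest case one might consider, where $F$ contains only translations, so each transformation is identified by a single $k$-vector. It is not the algorithm presented in this work, which handles a general user-specified class $F$. The function names, the (onset, pitch) point representation, and the convention that the pattern itself is always part of the decoded output are assumptions made for illustration.

```python
from collections import defaultdict


def maximal_translatable_patterns(points):
    """Illustrative only: for each difference vector v between two points,
    collect every point p such that p + v is also in the dataset.

    Points are tuples of numbers. Sorting lexicographically means each
    unordered pair is visited once, with v pointing "forwards".
    """
    pts = sorted(set(points))
    patterns = defaultdict(list)
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            v = tuple(b - a for a, b in zip(p, q))
            patterns[v].append(p)
    return dict(patterns)


def decode(encoding):
    """Rebuild a point set from (pattern, translations) pairs.

    Assumption: each pattern is emitted as-is, plus one translated copy
    per vector in its translation set.
    """
    out = set()
    for pattern, translations in encoding:
        for p in pattern:
            out.add(p)
            for t in translations:
                out.add(tuple(a + b for a, b in zip(p, t)))
    return out


# Hypothetical melody fragment as (onset, MIDI pitch) points: a three-note
# figure followed by the same figure 4 time units later, 5 semitones higher.
melody = [(0, 60), (1, 62), (2, 64), (4, 65), (5, 67), (6, 69)]
mtps = maximal_translatable_patterns(melody)
encoding = [(mtps[(4, 5)], [(4, 5)])]
restored = decode(encoding)
```

In this toy case the six input points are represented by one three-point pattern and one translation vector, and decoding the pair recovers the original point set exactly. Broader transformation classes would need longer parameter vectors per transformation, which is the sense in which vector length measures the complexity of $F$.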