Fred Jalbert-Desforges

2papers

2 Papers

28.8SDMay 15Code
vega-mir: An information-theoretic Python toolkit for symbolic music, with applications to harmonic graphs and rubato spectra

Fred Jalbert-Desforges

We present vega-mir, an open-source Python library that bundles nine information-theoretic and statistical metrics for the analysis of symbolic music corpora behind a small, tested, citable API, and demonstrates two of them at corpus scale in case studies not addressed by the upstream Cygnus paper. Of the nine metrics, three (Shannon entropy, Kullback-Leibler divergence, Zipfian fits) were deployed in the companion Cygnus arXiv preprint; two (network analysis on chord-transition graphs and spectral analysis of rubato curves) are deployed in full case studies here; the four remaining (multi-dimensional Gini, chi-squared stationarity, Higuchi fractal dimension, interval distribution) are validated against analytic anchors and exercised as sanity checks on a bundled 8-composer dataset. The two case studies yield two main observations. First, on the fourteen MAESTRO composers with N >= 10 pieces, the PageRank value of the gravity-centre node correlates with the marginal Kullback-Leibler distance at rho = 0.61 (Spearman, composer-level jackknife N = 14); the categorical gravity-centre identity takes five distinct values across the corpus but is not itself correlated with marginal KL (rho = 0.13, p = 0.21). Second, on the 247-piece Bach multi-master corpus (Schiff, Gould, Richter), Gould holds the highest periodicity ratio of the three performers, not the lowest, inverting the cliché that low scalar rubato reads as "metronomic": Gould's rubato is small in amplitude but structured in time, with a median dominant period of 66 beats against Schiff's 102 and Richter's 104.

0.9SDApr 25
An audio-to-analysis pipeline with certified transcription for information-theoretic profiling of the piano repertoire

Fred Jalbert-Desforges

We present an audio-to-analysis pipeline that produces composer-level information-theoretic profiles : reflecting compositional vocabulary as it emerges from aggregated performances : from raw recordings, built on a transcription layer whose accuracy we certify on a standard benchmark (F1 = 0.9791 on the MAESTRO v3.0.0 test set). Applied to 1,238 pieces and 15 MAESTRO composers with at least ten attributed pieces, spanning the Baroque through the early twentieth century, the pipeline derives empirical distributions over harmonic scale degrees and analyzes them through Shannon entropy, asymmetric Kullback-Leibler divergence, and Zipfian rank-frequency modeling. The resulting profiles (i) order composers along an interpretable axis of harmonic predictability, with a narrow entropy range (3.33-3.86 bits) that reveals the marginal-level similarity of tonal vocabularies; (ii) recover known stylistic lineages (Haydn-Beethoven, Liszt-Rachmaninoff, Schubert-Schumann) through the smallest KL divergences in the corpus, with Mendelssohn emerging as a stable outlier within this corpus; and (iii) separate contemporary neoclassical artists (Richter, Frahm, Glass, Arnalds, Jóhannsson) from historical composers on the quality of Zipfian fit to the transition distribution, with mean $R^2 = 0.78$ for neoclassical versus 0.46 for historical (N $\geq$ 10 pieces each). This gap is larger than the spread within either group and is consistent with a minimalist compositional tendency: a compact transition vocabulary used with sharper frequency-rank regularity than historical composers. All estimates are reported with Laplace-smoothed bootstrap 95% confidence intervals.