CV · AI · IV · Nov 25, 2024

Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN

arXiv:2411.16405v1 · 2 citations · h-index: 5 · BigData
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of limited training data for OMR systems. The contribution is incremental, as it applies existing GAN methods to a specific domain.

The paper tackles the data scarcity problem for Optical Music Recognition (OMR) systems by using Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. Of the three models compared, the proposed CycleWGAN performs best, with an FID of 41.87, an IS of 2.29, and a KID of 0.05.

The generation of handwritten music sheets is a crucial step toward enhancing Optical Music Recognition (OMR) systems, which rely on large and diverse datasets for optimal performance. However, handwritten music sheets, often found in archives, present challenges for digitisation due to their fragility, varied handwriting styles, and image quality. This paper addresses the data scarcity problem by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. We provide a comprehensive evaluation of three GAN models - DCGAN, ProGAN, and CycleWGAN - comparing their ability to generate diverse and high-quality handwritten music images. The proposed CycleWGAN model, which enhances style transfer and training stability, significantly outperforms DCGAN and ProGAN in both qualitative and quantitative evaluations. CycleWGAN achieves superior performance, with an FID score of 41.87, an IS of 2.29, and a KID of 0.05, making it a promising solution for improving OMR systems.
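For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a Kernel Inception Distance (KID) style score can be computed from two sets of feature vectors. It assumes the commonly used cubic polynomial kernel and the unbiased MMD estimator. The function names `poly_kernel` and `kid` are illustrative, and the paper's exact evaluation setup (feature extractor, subset sampling) is not stated in this summary, so this should not be read as the authors' implementation. In practice the features come from a pretrained Inception network and the estimate is usually averaged over several random subsets.

```python
from itertools import combinations


def poly_kernel(x, y):
    """Cubic polynomial kernel k(x, y) = (x . y / d + 1)^3, with d the feature dimension."""
    d = len(x)
    dot = sum(a * b for a, b in zip(x, y))
    return (dot / d + 1.0) ** 3


def kid(real_feats, fake_feats):
    """Unbiased squared-MMD estimate between two lists of equal-length feature vectors."""
    m, n = len(real_feats), len(fake_feats)
    if m < 2 or n < 2:
        raise ValueError("need at least two samples per set")
    # Within-set averages skip the i == j pairs, which keeps the estimator unbiased.
    k_rr = 2.0 * sum(poly_kernel(a, b) for a, b in combinations(real_feats, 2)) / (m * (m - 1))
    k_ff = 2.0 * sum(poly_kernel(a, b) for a, b in combinations(fake_feats, 2)) / (n * (n - 1))
    # Cross-set average uses every (real, fake) pair.
    k_rf = sum(poly_kernel(a, b) for a in real_feats for b in fake_feats) / (m * n)
    return k_rr + k_ff - 2.0 * k_rf


if __name__ == "__main__":
    # Toy 2-dimensional "features"; real usage would pass Inception activations.
    real = [[0.0, 0.0], [0.0, 0.0]]
    fake = [[2.0, 0.0], [2.0, 0.0]]
    print(kid(real, fake))
```

Lower values indicate that the generated features are closer to the real ones. Because the estimator is unbiased, it can come out slightly negative when the two sets are very similar.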

