CVApr 7
A High-Accuracy Optical Music Recognition Method Based on Bottleneck Residual ConvolutionsJunwen Ma, Huhu Xue, Xingyuan Zhao et al.
Optical Music Recognition (OMR) aims to convert printed or handwritten music score images into editable symbolic representations. This paper presents an end-to-end OMR framework that combines residual bottleneck convolutions with bidirectional gated recurrent unit (BiGRU)-based sequence modeling. A convolutional neural network with ResNet-v2-style residual bottleneck blocks and multi-scale dilated convolutions is used to extract features that encode both fine-grained symbol details and global staff-line structures. The extracted feature sequences are then fed into a BiGRU network to model temporal dependencies among musical symbols. The model is trained using the Connectionist Temporal Classification loss, enabling end-to-end prediction without explicit alignment annotations. Experimental results on the Camera-PrIMuS and PrIMuS datasets demonstrate the effectiveness of the proposed framework. On Camera-PrIMuS, the proposed method achieves a sequence error rate (SeER) of $7.52\%$ and a symbol error rate (SyER) of $0.45\%$, with pitch, type, and note accuracies of $99.33\%$, $99.60\%$, and $99.28\%$, respectively. The average training time is 1.74~s per epoch, demonstrating high computational efficiency while maintaining strong recognition performance. On PrIMuS, the method achieves a SeER of $8.11\%$ and a SyER of $0.49\%$, with pitch, type, and note accuracies of $99.27\%$, $99.58\%$, and $99.21\%$, respectively. A fine-grained error analysis further confirms the effectiveness of the proposed model.
MLDec 20, 2023
Enhancing Trade-offs in Privacy, Utility, and Computational Efficiency through MUltistage Sampling Technique (MUST)Xingyuan Zhao, Ruyu Zhou, Fang Liu
Applying a randomized algorithm to a subset rather than the entire dataset amplifies privacy guarantees. We propose a class of subsampling methods ``MUltistage Sampling Technique (MUST)'' for privacy amplification (PA) in the context of differential privacy (DP). We conduct comprehensive analyses of the PA effects and utility for several 2-stage MUST procedures through newly introduced concept including strong vs weak PA effects and aligned privacy profile. We provide the privacy loss composition analysis over repeated applications of MUST via the Fourier accountant algorithm. Our theoretical and empirical results suggest that MUST offers stronger PA in $ε$ than the common one-stage sampling procedures including Poisson sampling, sampling without replacement, and sampling with replacement, while the results on $δ$ vary case by case. Our experiments show that MUST is non-inferior in the utility and stability of privacy-preserving (PP) outputs to one-stage subsampling methods at similar privacy loss while enhancing the computational efficiency of algorithms that require complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms or procedures that involve parallel or simultaneous subsampling when DP guarantees are necessary.