Congren Dai

LG
h-index: 14
Papers: 3
Citations: 3
Novelty: 52%
AI Score: 33

3 Papers

SD · Nov 24, 2025
Musical Score Understanding Benchmark: Evaluating Large Language Models' Comprehension of Complete Musical Scores

Congren Dai, Yue Yang, Krinos Li et al.

Understanding complete musical scores entails integrated reasoning over pitch, rhythm, harmony, and large-scale structure, yet the ability of Large Language Models and Vision-Language Models to interpret full musical notation remains insufficiently examined. We introduce the Musical Score Understanding Benchmark (MSU-Bench), the first large-scale, human-curated benchmark for score-level musical understanding across textual (ABC notation) and visual (PDF) modalities. MSU-Bench contains 1,800 generative Question-Answering pairs from works by Bach, Beethoven, Chopin, Debussy, and others, organised into four levels of increasing difficulty, ranging from onset information to texture and form. Evaluations of more than fifteen state-of-the-art models, in both zero-shot and fine-tuned settings, reveal pronounced modality gaps, unstable level-wise performance, and challenges in maintaining multilevel correctness. Fine-tuning substantially improves results across modalities while preserving general knowledge, positioning MSU-Bench as a robust foundation for future research in multimodal reasoning. To facilitate further research, we publicly release MSU-Bench and all associated resources.

LG · May 23, 2025
Dynamic Dual Buffer with Divide-and-Conquer Strategy for Online Continual Learning

Congren Dai, Huichi Zhou, Jiahao Huang et al.

Online Continual Learning (OCL) presents a complex learning environment in which new data arrives batch by batch in an online fashion, and the risk of catastrophic forgetting can significantly impair model efficacy. In this study, we address OCL by introducing an innovative memory framework that incorporates a short-term memory system to retain dynamic information and a long-term memory system to archive enduring knowledge. Specifically, the long-term memory system comprises a collection of sub-memory buffers, each linked to a cluster prototype and designed to retain data samples from distinct categories. We propose a novel $K$-means-based sample selection method to identify cluster prototypes for each encountered category. To safeguard critical samples, we introduce a novel memory optimisation strategy that selectively retains samples in the appropriate sub-memory buffer by evaluating each cluster prototype against incoming samples through an optimal transport mechanism. This approach encourages each sub-memory buffer to retain data samples that differ significantly from the corresponding cluster prototype, thereby ensuring the preservation of semantically rich information. In addition, we propose a novel Divide-and-Conquer (DAC) approach that formulates memory updating as an optimisation problem and divides it into several subproblems. As a result, the proposed DAC approach solves these subproblems separately and thus significantly reduces the computational cost of the memory updating process. We conduct a series of experiments across standard and imbalanced learning settings, and the empirical findings indicate that the proposed memory framework achieves state-of-the-art performance in both learning contexts.
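
The prototype-and-buffer idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract scores samples against prototypes with optimal transport, whereas this sketch swaps in plain squared Euclidean distance as a stand-in, and the names `kmeans` and `update_buffer`, the deterministic initialisation, and the fixed `capacity` are all assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance (stand-in for the optimal transport cost)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=10):
    """Plain K-means; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def update_buffer(buffer, sample, prototype, capacity):
    """Hypothetical sub-buffer update: keep the samples farthest from the prototype."""
    candidates = buffer + [sample]
    candidates.sort(key=lambda s: sq_dist(s, prototype), reverse=True)
    return candidates[:capacity]


prototypes = kmeans([(0, 0), (10, 10), (0, 1), (10, 11)], k=2)
buffer = update_buffer([(0, 0), (3, 0)], (1, 0), prototype=(0, 0), capacity=2)
```

Each sub-buffer is updated independently of the others here, which loosely mirrors the divide-and-conquer framing: one update per prototype rather than one joint optimisation over the whole memory.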

IV · Mar 7, 2025
Lightweight Hypercomplex MRI Reconstruction: A Generalized Kronecker-Parameterized Approach

Haosen Zhang, Jiahao Huang, Yinzhe Wu et al.

Magnetic Resonance Imaging (MRI) is crucial for clinical diagnostics but is hindered by prolonged scan times. Current deep learning models enhance MRI reconstruction but are often memory-intensive and unsuitable for resource-limited systems. This paper introduces a lightweight MRI reconstruction model leveraging Kronecker-Parameterized Hypercomplex Neural Networks to achieve high performance with reduced parameters. By integrating Kronecker-based modules, including Kronecker MLP, Kronecker Window Attention, and Kronecker Convolution, the proposed model efficiently extracts spatial features while preserving representational power. We introduce Kronecker U-Net and Kronecker SwinMR, which maintain high reconstruction quality with approximately 50% fewer parameters compared to existing models. Experimental evaluation on the FastMRI dataset demonstrates competitive PSNR, SSIM, and LPIPS metrics, even at high acceleration factors (8x and 16x), with no significant performance drop. Additionally, Kronecker variants exhibit superior generalization and reduced overfitting on limited datasets, facilitating efficient MRI reconstruction on hardware-constrained systems. This approach sets a new benchmark for parameter-efficient medical imaging models.
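
To make the parameter saving concrete, here is a minimal sketch of a Kronecker-parameterized weight matrix. It assumes the common formulation W = sum over i of kron(A_i, S_i) with n factor pairs, where each A_i is n x n and each S_i is (out/n) x (in/n); the abstract does not spell out the exact parameterization or the value of n, so the function names and the choice n = 2 are illustrative assumptions.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kronecker_weight(factors):
    """Build W = sum_i kron(A_i, S_i) from a list of (A_i, S_i) pairs."""
    total = None
    for a, s in factors:
        block = kron(a, s)
        if total is None:
            total = block
        else:
            total = [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(total, block)]
    return total


def param_counts(n, out_dim, in_dim):
    """Return (dense, kronecker) parameter counts for an out_dim x in_dim weight."""
    dense = out_dim * in_dim
    kronecker = n * (n * n + (out_dim // n) * (in_dim // n))
    return dense, kronecker


dense, kronecker = param_counts(n=2, out_dim=64, in_dim=64)
print(dense, kronecker, round(kronecker / dense, 3))
```

Under these assumptions, n = 2 gives a layer with roughly half the parameters of its dense counterpart, which is in line with the approximately 50% reduction the abstract reports.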