AINov 20, 2025Code
Multidimensional Rubric-oriented Reward Model Learning via Geometric Projection Reference ConstraintsYongnan Jin, Xurui Li, Feng Cao et al.
The integration of large language models (LLMs) into medical practice holds transformative potential, yet their real-world clinical utility remains limited by critical alignment challenges: (1) a disconnect between static evaluation benchmarks and dynamic clinical cognitive needs, (2) difficulties in adapting to evolving, multi-source medical standards, and (3) the inability of conventional reward models to capture nuanced, multi-dimensional medical quality criteria. To address these gaps, we propose MR-RML (Multidimensional Rubric-oriented Reward Model Learning) via GPRC (Geometric Projection Reference Constraints), a novel alignment framework that integrates medical standards into a structured "Dimensions-Scenarios-Disciplines" matrix to guide data generation and model optimization. MR-RML introduces three core innovations: (1) a "Dimensions-Scenarios-Disciplines" medical standard system that embeds domain standards into the full training pipeline; (2) an independent multi-dimensional reward model that decomposes evaluation criteria, shifting from real-time rubric-based scoring to internalized reward modeling for improved consistency and cost-efficiency; (3) geometric projection reference constraints that transform medical cognitive logic into mathematical regularization, aligning scoring gradients with clinical reasoning and enabling synthetic data-driven training. Through extensive evaluations on the authoritative medical benchmark Healthbench, our method yields substantial performance gains over the base LLM Qwen-32B (45% on the full subset and 85% on Hard subset, respectively). It achieves a SOTA among open-source LLMs with scores of 62.7 (full subset) and 44.7 (hard subset), while also outperforming the majority of closed-source models.
SPJul 23, 2025Code
HuiduRep: A Robust Self-Supervised Framework for Learning Neural Representations from Extracellular RecordingsFeng Cao, Zishuo Feng, Wei Shi et al.
Extracellular recordings are transient voltage fluctuations in the vicinity of neurons, serving as a fundamental modality in neuroscience for decoding brain activity at single-neuron resolution. Spike sorting, the process of attributing each detected spike to its corresponding neuron, is a pivotal step in brain sensing pipelines. However, it remains challenging under low signal-to-noise ratio (SNR), electrode drift, and cross-session variability. In this paper, we propose HuiduRep, a robust self-supervised representation learning framework that extracts discriminative and generalizable features from extracellular recordings. By integrating contrastive learning with a denoising autoencoder, HuiduRep learns latent representations robust to noise and drift. With HuiduRep, we develop a spike sorting pipeline that clusters spike representations without ground truth labels. Experiments on hybrid and real-world datasets demonstrate that HuiduRep achieves strong robustness. Furthermore, the pipeline significantly outperforms state-of-the-art tools such as KiloSort4 and MountainSort5 on accuracy and precision on diverse datasets. These findings demonstrate the potential of self-supervised spike representation learning as a foundational tool for robust and generalizable processing of extracellular recordings. Code is available at: https://github.com/IgarashiAkatuki/HuiduRep
AIOct 12, 2025
Extended Triangular Method: A Generalized Algorithm for Contradiction Separation Based Automated DeductionYang Xu, Shuwei Chen, Jun Liu et al.
Automated deduction lies at the core of Artificial Intelligence (AI), underpinning theorem proving, formal verification, and logical reasoning. Despite decades of progress, reconciling deductive completeness with computational efficiency remains an enduring challenge. Traditional reasoning calculi, grounded in binary resolution, restrict inference to pairwise clause interactions and thereby limit deductive synergy among multiple clauses. The Contradiction Separation Extension (CSE) framework, introduced in 2018, proposed a dynamic multi-clause reasoning theory that redefined logical inference as a process of contradiction separation rather than sequential resolution. While that work established the theoretical foundation, its algorithmic realization remained unformalized and unpublished. This work presents the Extended Triangular Method (ETM), a generalized contradiction-construction algorithm that formalizes and extends the internal mechanisms of contradiction separation. The ETM unifies multiple contradiction-building strategies, including the earlier Standard Extension method, within a triangular geometric framework that supports flexible clause interaction and dynamic synergy. ETM serves as the algorithmic core of several high-performance theorem provers, CSE, CSE-E, CSI-E, and CSI-Enig, whose competitive results in standard first-order benchmarks (TPTP problem sets and CASC 2018-2015) empirically validate the effectiveness and generality of the proposed approach. By bridging theoretical abstraction and operational implementation, ETM advances the contradiction separation paradigm into a generalized, scalable, and practically competitive model for automated reasoning, offering new directions for future research in logical inference and theorem proving.
CLNov 18, 2024
CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese CharactersZishuo Feng, Feng Cao
The task of converting Hanyu Pinyin abbreviations to Chinese characters is a significant branch within the domain of Chinese Spelling Correction (CSC). It plays an important role in many downstream applications such as named entity recognition and sentiment analysis. This task typically involves text-length alignment and seems easy to solve; however, due to the limited information content in pinyin abbreviations, achieving accurate conversion is challenging. In this paper, we treat this as a fill-mask task and propose CNMBERT, which stands for zh-CN Pinyin Multi-mask BERT Model, as a solution to this issue. By introducing a multi-mask strategy and Mixture of Experts (MoE) layers, CNMBERT outperforms fine-tuned GPT models and ChatGPT-4o with a 61.53% MRR score and 51.86% accuracy on a 10,373-sample test dataset.
ROMay 23, 2021
Insect-Computer Hybrid System for Autonomous Search and Rescue MissionP. Thanh Tran-Ngoc, D. Long Le, Bing Sheng Chong et al.
There is still a long way to go before artificial mini robots are really used for search and rescue missions in disaster-hit areas due to hindrance in power consumption, computation load of the locomotion, and obstacle-avoidance system. Insect-computer hybrid system, which is the fusion of living insect platform and microcontroller, emerges as an alternative solution. This study demonstrates the first-ever insect-computer hybrid system conceived for search and rescue missions, which is capable of autonomous navigation and human presence detection in an unstructured environment. Customized navigation control algorithm utilizing the insect's intrinsic navigation capability achieved exploration and negotiation of complex terrains. On-board high-accuracy human presence detection using infrared camera was achieved with a custom machine learning model. Low power consumption suggests system suitability for hour-long operations and its potential for realization in real-life missions.