Zhuo Zuo

h-index: 8

3 papers

142 citations

3 Papers

CL | Jun 26 | Code

Enhancing Numerical Prediction in LLMs via Smooth MMD Alignment

Zhuo Zuo, Li Yue, Wenhao Zheng et al.

Despite their strong general capabilities, large language models (LLMs) often remain unreliable when outputs must be numerically precise. A key reason is the training objective: standard cross-entropy treats numeric tokens as unstructured categories and ignores the metric structure of their values. We address this mismatch with Smooth Maximum Mean Discrepancy (SMMD), which builds on the classic MMD by incorporating value-distance kernels over numeric tokens and graph-based smoothness. With this kernel defined over a numeric sub-vocabulary, SMMD aligns the predicted numeric distribution to the target via kernel matching and smooths the prediction-target residual over the induced kernel graph to encourage local consistency. We evaluate SMMD on four numeric-target tasks: mathematical reasoning, arithmetic calculation, clock-time recognition, and chart question answering, across multiple open-weight LLM and VLM backbones. SMMD consistently improves accuracy over both cross-entropy and recent numeric-target losses; analyses show complementary effects between MMD and smoothness and underscore the importance of distance-based kernel design. Code is available at https://github.com/Zuozhuo/smmd-loss.
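
The two ingredients named in the abstract, kernel matching over numeric tokens and smoothing of the prediction-target residual over the kernel graph, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). The Gaussian kernel, the `bandwidth` and `smooth_weight` parameters, and the use of a graph-Laplacian quadratic form as the smoothness term are all assumptions made here for illustration.

```python
import numpy as np

def gaussian_value_kernel(values, bandwidth=1.0):
    """K[i, j] = exp(-(v_i - v_j)^2 / (2 * bandwidth^2)) over numeric token values.

    The Gaussian form is an assumption; the abstract only says the kernel
    is based on value distance.
    """
    diff = values[:, None] - values[None, :]
    return np.exp(-(diff ** 2) / (2.0 * bandwidth ** 2))

def smmd_sketch(pred, target, values, bandwidth=1.0, smooth_weight=0.1):
    """Squared MMD between two distributions over a numeric sub-vocabulary,
    plus a hypothetical Laplacian smoothness penalty on the residual."""
    K = gaussian_value_kernel(values, bandwidth)
    r = pred - target                      # prediction-target residual
    mmd_sq = r @ K @ r                     # squared MMD for discrete distributions
    W = K - np.diag(np.diag(K))            # kernel graph without self-loops
    L = np.diag(W.sum(axis=1)) - W         # graph Laplacian of that graph
    smooth = r @ L @ r                     # penalises rough residuals on the graph
    return mmd_sq + smooth_weight * smooth

# Toy sub-vocabulary: the digits 0..9, with the true answer being 7.
values = np.arange(10, dtype=float)
target = np.eye(10)[7]
loss_near = smmd_sketch(np.eye(10)[6], target, values)  # predicts 6
loss_far = smmd_sketch(np.eye(10)[1], target, values)   # predicts 1
```

Unlike cross-entropy, which scores both wrong predictions identically, this toy loss is lower for the prediction that is numerically closer to the target.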

2.3 | SP | May 26, 2023 | Code

Pulse shape discrimination based on the Tempotron: a powerful classifier on GPU

Haoran Liu, Peng Li, Ming-Zhe Liu et al.

This study applies the Tempotron, a robust classifier based on a third-generation neural network model, to pulse shape discrimination. Because it needs no manual feature extraction, the Tempotron can process pulse signals directly and generate discrimination results based on prior knowledge. Experiments with GPU acceleration ran over 500 times faster than the CPU-based model, and the study also investigated the impact of noise augmentation on Tempotron performance. The results show that the Tempotron is a strong classifier that achieves high discrimination accuracy on both AmBe and time-of-flight PuBe datasets. Furthermore, analyzing the neural activity of the Tempotron during training sheds light on its learning characteristics and aids in selecting its hyperparameters. The study also discusses the limitations of, and potential avenues for future development in, using the Tempotron for pulse shape discrimination. The dataset used in this study and the GPU-based Tempotron are publicly available on GitHub at https://github.com/HaoranLiu507/TempotronGPU.
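
For readers unfamiliar with the Tempotron, the sketch below shows the textbook form of the model: a membrane potential built from weighted postsynaptic kernels, a threshold decision, and an error-driven weight update at the time of peak potential. It is a CPU-only NumPy illustration, not the GPU implementation from the linked repository. The time constants, threshold, learning rate, and the toy spike times are illustrative assumptions, and how the paper encodes pulse signals into spikes is not covered here.

```python
import numpy as np

def psp_kernel(t, tau=15.0, tau_s=3.75):
    """Double-exponential postsynaptic kernel, zero for t < 0, peak scaled to 1."""
    t_peak = (tau * tau_s / (tau - tau_s)) * np.log(tau / tau_s)
    v0 = 1.0 / (np.exp(-t_peak / tau) - np.exp(-t_peak / tau_s))
    t = np.asarray(t, dtype=float)
    t_pos = np.maximum(t, 0.0)  # avoid overflow for negative times
    out = v0 * (np.exp(-t_pos / tau) - np.exp(-t_pos / tau_s))
    return np.where(t >= 0, out, 0.0)

def membrane_potential(weights, spike_times, t_grid):
    """V(t) = sum_i w_i * sum over spikes of afferent i of K(t - t_spike)."""
    v = np.zeros_like(t_grid, dtype=float)
    for w, times in zip(weights, spike_times):
        for ts in times:
            v += w * psp_kernel(t_grid - ts)
    return v

def tempotron_update(weights, spike_times, label, t_grid, threshold=1.0, lr=0.01):
    """One learning step: change weights only when the decision is wrong."""
    v = membrane_potential(weights, spike_times, t_grid)
    fired = bool(v.max() >= threshold)
    if fired == bool(label):
        return weights.copy()
    t_max = t_grid[np.argmax(v)]
    # Each afferent's contribution to the potential at the peak time.
    grad = np.array([sum(float(psp_kernel(t_max - ts)) for ts in times)
                     for times in spike_times])
    return weights + (lr if label else -lr) * grad

# Toy example: two afferents, one spike each, pattern labelled positive.
t_grid = np.arange(0.0, 100.0, 0.5)
weights = np.array([0.5, 0.5])
spike_times = [[10.0], [12.0]]
new_weights = tempotron_update(weights, spike_times, label=1, t_grid=t_grid)
```

In this toy case the neuron stays below threshold on a positive pattern, so both weights are increased in proportion to how much each afferent contributed at the peak.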

5.2 | CV | Mar 26, 2024

Random-coupled Neural Network

Haoran Liu, Mingzhe Liu, Peng Li et al.

Improving the efficiency of current neural networks and modeling them on biological neural systems have become popular research directions in recent years. The pulse-coupled neural network (PCNN) is a widely applied model for imitating the computational characteristics of the human brain in computer vision and neural network research. However, differences between the PCNN and biological neural systems remain: limited neural connections, high computational cost, and a lack of stochasticity. This study proposes the random-coupled neural network (RCNN), which overcomes these difficulties in the PCNN's neuromorphic computing via a random inactivation process. This process randomly closes some neural connections in the RCNN model and is realized by a random inactivation weight matrix applied to the link input. It relieves the computational burden of the PCNN, making vast neural connections affordable. Furthermore, the image and video processing mechanisms of the RCNN are studied. It encodes constant stimuli as periodic spike trains and periodic stimuli as chaotic spike trains, consistent with the information-encoding characteristics of biological neural systems. Finally, the RCNN is applied to image segmentation, fusion, and pulse shape discrimination tasks. It is shown to be robust, efficient, and highly noise-resistant, with outstanding performance in all of these applications.
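
The random inactivation idea can be pictured as a neighbourhood weight matrix in which each connection is switched off at random before the link input is computed. The NumPy sketch below illustrates only that step and is not the authors' model. The Gaussian distance weighting, the `keep_prob` parameter, the kernel size, and the function names are all hypothetical choices made for this example.

```python
import numpy as np

def random_inactivation_kernel(size=5, sigma=1.0, keep_prob=0.5, rng=None):
    """Distance-weighted neighbourhood kernel with connections randomly closed.

    The Gaussian weighting and keep_prob are assumptions for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    gauss = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    mask = rng.random((size, size)) < keep_prob  # True = connection kept
    return gauss * mask

def link_input(spikes, kernel):
    """Weighted sum of neighbouring spikes for every neuron (zero padding)."""
    half = kernel.shape[0] // 2
    padded = np.pad(spikes.astype(float), half)
    h, w = spikes.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# Toy example: every neuron fired in the previous step.
rng = np.random.default_rng(0)
spikes = np.ones((7, 7))
kernel = random_inactivation_kernel(size=5, keep_prob=0.5, rng=rng)
link = link_input(spikes, kernel)
```

Closed connections contribute nothing to the sum, so a sparse kernel can cover a large neighbourhood with far fewer multiply-adds than a dense one.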