SY Mar 26, 2017
Improving Localization Accuracy in Connected Vehicle Networks Using Rao-Blackwellized Particle Filters: Theory, Simulations, and Experiments
Macheng Shen, Ding Zhao, Jing Sun et al.
A crucial function for automated vehicle technologies is accurate localization. Lane-level accuracy is not readily available from low-cost Global Navigation Satellite System (GNSS) receivers because of factors such as multipath error and atmospheric bias. Approaches such as Differential GNSS can improve localization accuracy, but usually require investment in expensive base stations. Connected vehicle technologies provide an alternative approach to improving the localization accuracy. It will be shown in this paper that localization accuracy can be enhanced using crude GNSS measurements from a group of connected vehicles, by matching their locations to a digital map. A Rao-Blackwellized particle filter (RBPF) is used to jointly estimate the common biases of the pseudo-ranges and the vehicle positions. Multipath biases, which introduce receiver-specific (non-common) error, are mitigated by a multi-hypothesis detection-rejection approach. The temporal correlation of the estimations is exploited through the prediction-update process. The proposed approach is compared to existing methods using both simulations and experimental results. It was found that the proposed algorithm can eliminate the common biases and reduce the localization error to below 1 meter under open sky conditions.
SY Mar 2, 2018
Model Predictive Climate Control of Connected and Automated Vehicles for Improved Energy Efficiency
Hao Wang, Ilya Kolmanovsky, Mohammad Reza Amini et al.
This paper considers an application of model predictive control to the automotive air conditioning (A/C) system in future connected and automated vehicles (CAVs) with battery electric or hybrid electric powertrains. A control-oriented prediction model for the A/C system is proposed, identified, and validated against a higher-fidelity simulation model (CoolSim). Based on the developed prediction model, a nonlinear model predictive control (NMPC) problem is formulated and solved online to minimize the energy consumption of the A/C system. Simulation results illustrate the desirable characteristics of the proposed NMPC solution, such as being able to enforce physical constraints of the A/C system and maintain cabin temperature within a specified range. Moreover, it is shown that by utilizing the vehicle speed preview and through coordinated adjustment of the cabin temperature constraints, energy efficiency improvements of up to 9% can be achieved.
SY Mar 20, 2019
Sequential Optimization of Speed, Thermal Load, and Power Split in Connected HEVs
Mohammad Reza Amini, Xun Gong, Yiheng Feng et al.
The emergence of connected and automated vehicles (CAVs) provides an unprecedented opportunity to capitalize on these technologies well beyond their original designed intents. While abundant evidence has been accumulated showing substantial fuel economy improvement benefits achieved through advanced powertrain control, the implications of the CAV operation on power and thermal management have not been fully investigated. In this paper, in order to explore the opportunities for the coordination between the onboard thermal management and the power split control, we present a sequential optimization solution for eco-driving speed trajectory planning, air conditioning (A/C) thermal load planning (eco-cooling), and powertrain control in hybrid electric CAVs to evaluate the individual as well as the collective energy savings through proactive usage of traffic data for vehicle speed prediction. Simulation results over a real-world driving cycle show that compared to a baseline non-CAV, 11.9%, 14.2%, and 18.8% energy savings can be accumulated sequentially through speed, thermal load, and power split optimizations, respectively.
AI Aug 19, 2024 Code
MSDiagnosis: A Benchmark for Evaluating Large Language Models in Multi-Step Clinical Diagnosis
Ruihui Hou, Shencheng Chen, Yongqi Fan et al.
Clinical diagnosis is critical in medical practice, typically requiring a continuous and evolving process that includes primary diagnosis, differential diagnosis, and final diagnosis. However, most existing clinical diagnostic tasks are single-step processes, which do not align with the complex multi-step diagnostic procedures found in real-world clinical settings. In this paper, we propose a Chinese clinical diagnostic benchmark, called MSDiagnosis. This benchmark consists of 2,225 cases from 12 departments, covering tasks such as primary diagnosis, differential diagnosis, and final diagnosis. Additionally, we propose a novel and effective framework. This framework combines forward inference, backward inference, reflection, and refinement, enabling the large language model to self-evaluate and adjust its diagnostic results. To this end, we test open-source models, closed-source models, and our proposed framework. The experimental results demonstrate the effectiveness of the proposed method. We also provide a comprehensive experimental analysis and suggest future research directions for this task.
NA Feb 7, 2018
Algorithm implementation and numerical analysis for the two-dimensional tempered fractional Laplacian
Jing Sun, Daxin Nie, Weihua Deng
The tempered fractional Laplacian is the generator of the tempered isotropic Lévy process [W.H. Deng, B.Y. Li, W.Y. Tian, and P.W. Zhang, Multiscale Model. Simul., 16(1), 125-149, 2018]. This paper provides the finite difference discretization for the two-dimensional tempered fractional Laplacian $(\Delta+\lambda)^{\frac{\beta}{2}}$. Then we use it to solve the tempered fractional Poisson equation with Dirichlet boundary conditions and derive the error estimates. Numerical experiments verify the convergence rates and effectiveness of the schemes.
GEO-PH Sep 12, 2024
A convolutional neural network approach to deblending seismic data
Jing Sun, Sigmund Slang, Thomas Elboth et al.
For economic and efficiency reasons, blended acquisition of seismic data is becoming more and more commonplace. Seismic deblending methods are always computationally demanding and normally consist of multiple processing steps. Besides, the parameter setting is not always trivial. Machine learning-based processing has the potential to significantly reduce processing time and to change the way seismic deblending is carried out. We present a data-driven deep learning-based method for fast and efficient seismic deblending. The blended data are sorted from the common source to the common channel domain to transform the character of the blending noise from coherent events to incoherent distributions. A convolutional neural network (CNN) is designed according to the special character of seismic data, and performs deblending with comparable results to those obtained with conventional industry deblending algorithms. To ensure authenticity, the blending was done numerically and only field seismic data were employed, including more than 20000 training examples. After training and validation of the network, seismic deblending can be performed in near real time. Experiments also show that the initial signal to noise ratio (SNR) is the major factor controlling the quality of the final deblended result. The network is also demonstrated to be robust and adaptive by using the trained model to firstly deblend a new data set from a different geological area with a slightly different delay time setting, and secondly deblend shots with blending noise in the top part of the data.
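The effect of sorting from the common source domain to the common channel domain can be pictured with a toy model. The sketch below is only an illustration under strong simplifying assumptions: the dither values, the array sizes, and the helper names `common_shot_gather` and `common_channel_gather` are hypothetical, and the blending noise is reduced to a single arrival time per trace with no moveout.

```python
# Hypothetical dither (in samples) applied to the interfering source, one per shot.
dither = [12, 3, 41, 27, 8, 35]
n_shots, n_channels = len(dither), 4

# noise_time[s][c]: arrival sample of the blending noise in shot s, channel c.
# Simplification: within one shot, every channel sees the same delay.
noise_time = [[100 + dither[s] for _ in range(n_channels)] for s in range(n_shots)]

def common_shot_gather(s):
    """All channels of one shot: noise lines up across traces (coherent)."""
    return noise_time[s]

def common_channel_gather(c):
    """One channel across all shots: noise times jump with the dither (incoherent)."""
    return [noise_time[s][c] for s in range(n_shots)]

print(common_shot_gather(2))
print(common_channel_gather(1))
```

In this toy model the same samples are merely re-indexed; what changes is that neighboring traces no longer share the noise arrival time, which is the property the abstract describes as turning coherent events into incoherent distributions.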
CV Jul 5, 2023 Code
Task-Specific Alignment and Multiple Level Transformer for Few-Shot Action Recognition
Fei Guo, Li Zhu, YiWang Wang et al.
In the research field of few-shot learning, the main difference between image-based and video-based tasks is the additional temporal dimension. In recent years, some works have used the Transformer to process frames, obtaining attention features and enhanced prototypes, with competitive results. However, some video frames may relate little to the action, and using only single frame-level or segment-level features may not mine enough information. We address these problems sequentially through an end-to-end method named "Task-Specific Alignment and Multiple-level Transformer Network (TSA-MLT)". The first module (TSA) aims at filtering the action-irrelevant frames for action duration alignment. An affine transformation of the frame sequence in the time dimension is used for linear sampling. The second module (MLT) operates on features at multiple levels of the support prototype and query sample to mine more information for the alignment. We adopt a fusion loss based on a fusion distance that combines the L2 sequence distance, which focuses on temporal order alignment, and the Optimal Transport distance, which focuses on measuring the gap between the appearance and semantics of the videos. Extensive experiments show our method achieves state-of-the-art results on the HMDB51 and UCF101 datasets and competitive results on the Kinetics and Something-Something V2 benchmarks. Our code is available at the URL: https://github.com/cofly2014/tsa-mlt.git
SY May 31, 2019
Thermal Responses of Connected HEVs Engine and Aftertreatment Systems to Eco-Driving
Mohammad Reza Amini, Yiheng Feng, Hao Wang et al.
Connected and automated vehicles (CAVs) have been recognized as providing unprecedented opportunities for substantial fuel economy improvement through CAV-based vehicle speed trajectory optimization (eco-driving). At the same time, the implications of the CAV operation on thermal responses, including those of engine and exhaust aftertreatment system, have not been fully investigated. To this end, firstly, a sequential optimization framework for vehicle speed trajectory planning and powertrain control in hybrid electric CAVs is proposed in this paper. Next, the impact of eco-driving and power split optimization on the engine and catalytic converter thermal responses, as well as on the tailpipe emissions is characterized. Despite an average 16% improvement in fuel economy through sequential optimization, this study shows that eco-driving slows down the thermal responses, which could unfavorably affect the tailpipe emissions.
SY Jun 1, 2018
Evaluation of the Energy Efficiency in a Mixed Traffic with Automated Vehicles and Human Controlled Vehicles
Xun Gong, Yaohui Guo, Yiheng Feng et al.
The energy efficiency of Connected and Automated Vehicles (CAVs) is significantly influenced by surrounding road users. This paper presents an evaluation of the energy efficiency of CAVs in mixed traffic, where they interact with human-controlled vehicles. To simulate the interaction between the CAVs and the cut-in vehicles controlled by human drivers near the intersection, a lane changing model is proposed to emulate the politeness and patience characteristics of the human driver. The proposed lane changing model is then calibrated based on over 100,000 naturalistic lane changing events collected by the University of Michigan Safety Pilot Model Deployment Program. A case study simulating the cut-in scenario is carried out to demonstrate the human driver's lane changing sensitivity under different driving trajectories of a frontal CAV, and the influence of the cut-in vehicle on the energy consumption of the CAV is evaluated. The simulation results indicate that the fuel economy of the CAV can be substantially improved if its surrounding cut-in vehicles are handled well.
CV Jul 8, 2024 Code
DMSD-CDFSAR: Distillation from Mixed-Source Domain for Cross-Domain Few-shot Action Recognition
Fei Guo, YiKang Wang, Han Qi et al.
Few-shot action recognition is an emerging field in computer vision, primarily focused on meta-learning within the same domain. However, challenges arise in real-world scenario deployment, as gathering extensive labeled data within a specific domain is laborious and time-intensive. Thus, attention shifts towards cross-domain few-shot action recognition, requiring the model to generalize across domains with significant deviations. Therefore, we propose a novel approach, "Distillation from Mixed-Source Domain", tailored to address this conundrum. Our method strategically integrates insights from both labeled data of the source domain and unlabeled data of the target domain during the training. The ResNet18 is used as the backbone to extract spatial features from the source and target domains. We design two branches for meta-training: the original-source and the mixed-source branches. In the first branch, a Domain Temporal Encoder is employed to capture temporal features for both the source and target domains. Additionally, a Domain Temporal Decoder is employed to reconstruct all extracted features. In the other branch, a Domain Mixed Encoder is used to handle labeled source domain data and unlabeled target domain data, generating mixed-source domain features. We incorporate a pre-training stage before meta-training, featuring a network architecture similar to that of the first branch. Lastly, we introduce a dual distillation mechanism to refine the classification probabilities of source domain features, aligning them with those of mixed-source domain features. This iterative process enriches the insights of the original-source branch with knowledge from the mixed-source branch, thereby enhancing the model's generalization capabilities. Our code is available at URL: https://xxxx/xxxx/xxxx.git
SY Feb 5, 2018
Predictive Second Order Sliding Control of Constrained Linear Systems with Application to Automotive Control Systems
Mohammad Reza Amini, Mahdi Shahbakhti, Jing Sun
This paper presents a new predictive second order sliding controller (PSSC) formulation for setpoint tracking of constrained linear systems. The PSSC scheme is developed by combining the concepts of model predictive control (MPC) and second order discrete sliding mode control. In order to guarantee the feasibility of the PSSC during setpoint changes, a virtual reference variable is added to the PSSC cost function to calculate the closest admissible set point. The states of the system are then driven asymptotically to this admissible setpoint by the control action of the PSSC. The performance of the proposed PSSC is evaluated for an advanced automotive engine case study, where a high fidelity physics-based model of a reactivity controlled compression ignition (RCCI) engine is utilized to serve as the virtual test-bed for the simulations. Considering the hard physical constraints on the RCCI engine states and control inputs, simultaneous tracking of engine load and optimal combustion phasing is a challenging objective to achieve. The simulation results of testing the proposed PSSC on the high fidelity RCCI model show that the developed predictive controller is able to track desired engine load and combustion phasing setpoints, with minimum steady state error, and no overshoot. Moreover, the simulation results confirm the robust tracking performance of the PSSC during transient operations, in the presence of engine cyclic variability.
NA Nov 12, 2018
Numerical scheme for the Fokker-Planck equations describing anomalous diffusions with two internal states
Daxin Nie, Jing Sun, Weihua Deng
Recently, the fractional Fokker-Planck equations (FFPEs) with multiple internal states are built for the particles undergoing anomalous diffusion with different waiting time distributions for different internal states, which describe the distribution of positions of the particles [Xu and Deng, Math. Model. Nat. Phenom., $\mathbf{13}$, 10 (2018)]. In this paper, we first develop the Sobolev regularity of the FFPEs with two internal states, including the homogeneous problem with smooth and nonsmooth initial values and the inhomogeneous problem with vanishing initial value, and then we design the numerical scheme for the system of fractional partial differential equations based on the finite element method for the space derivatives and convolution quadrature for the time fractional derivatives. The optimal error estimates of the scheme under the above three different conditions are provided for both space semidiscrete and fully discrete schemes. Finally, one- and two-dimensional numerical experiments are performed to confirm our theoretical analysis and the predicted convergence order.
NA Aug 14, 2018
Numerical algorithms of the two-dimensional Feynman-Kac equation for reaction and diffusion processes
Daxin Nie, Jing Sun, Weihua Deng
This paper provides a finite difference discretization for the backward Feynman-Kac equation, governing the distribution of functionals of the path for a particle undergoing both reaction and diffusion [Hou and Deng, J. Phys. A: Math. Theor., $\mathbf{51}$, 155001 (2018)]. Numerically solving the equation with the time tempered fractional substantial derivative and tempered fractional Laplacian consists in discretizing these two non-local operators. Here, using convolution quadrature, we provide first-order and second-order schemes for discretizing the time tempered fractional substantial derivative, which do not require any assumption on the regularity of the solution in time; we use the finite difference method to approximate the two-dimensional tempered fractional Laplacian, and the accuracy of the scheme depends on the regularity of the solution on $\bar{\Omega}$ rather than the whole space. Lastly, we verify the predicted convergence orders and the effectiveness of the presented schemes by numerical examples.
NA Apr 9, 2018
A reduced finite element formulation for space fractional partial differential equation
Jing Sun, Daxin Nie, Weihua Deng
Applying proper orthogonal decomposition to a usual finite element (FE) formulation for space fractional partial differential equation, we get a reduced FE model, which greatly reduces the complexity of computation. Then, the stability analysis and error estimate for the reduced model are presented. Finally, we verify the effectiveness of the algorithm by numerical experiments.
GEO-PH Sep 13, 2024
Using Convolutional Neural Networks for Denoising and Deblending of Marine Seismic Data
Sigmund Slang, Jing Sun, Thomas Elboth et al.
Processing marine seismic data is computationally demanding and consists of multiple time-consuming steps. Neural network based processing can, in theory, significantly reduce processing time and has the potential to change the way seismic processing is done. In this paper, we use deep convolutional neural networks (CNNs) to remove seismic interference noise and to deblend seismic data. To train such networks, a significant amount of computational memory is needed, since a single shot gather consists of more than $10^6$ data samples. Preliminary results are promising both for denoising and deblending. However, we also observed that the results are affected by the signal-to-noise ratio (SNR). Moving to the common channel domain is a way of breaking the coherency of the noise while also reducing the input volume size. This makes it easier for the network to distinguish between signal and noise. It also increases the efficiency of the GPU memory usage by enabling better utilization of multi-core processing. Deblending in the common channel domain with the use of a CNN yields relatively good results and is an improvement compared to the shot domain.
SY Mar 6, 2017
The Impact of Road Configuration on V2V-based Cooperative Localization
Macheng Shen, Ding Zhao, Jing Sun
In our previous work, cooperative localization with map matching was shown to reduce Global Navigation Satellite System (GNSS) localization error from several meters to sub-meter level by fusing the GNSS measurements of four vehicles. While further error reduction is expected to be achievable by increasing the number of vehicles, the quantitative relationship between the estimation error and the number of connected vehicles has neither been systematically investigated nor analytically proved. In this work, a theoretical study is presented that analytically proves the correlation between the localization error and the number of connected vehicles in two cases of practical interest. More specifically, it is shown that, under the assumption of small non-common error, the expected square error of the GNSS common error correction is inversely proportional to the number of vehicles, if the road directions obey a uniform distribution, or inversely proportional to the logarithm of the number of vehicles, if the road directions obey a Bernoulli distribution. Numerical simulations are conducted to justify these analytic results. Moreover, the simulation results show that the aforementioned error decrement rates hold even when the assumption of small non-common error is violated.
GEO-PH Sep 13, 2024
Deep learning-based shot-domain seismic deblending
Jing Sun, Song Hou, Vetle Vinje et al.
To streamline fast-track processing of large data volumes, we have developed a deep learning approach to deblend seismic data in the shot domain based on a practical strategy for generating high-quality training data along with a list of data conditioning techniques to improve performance of the data-driven model. We make use of unblended shot gathers acquired at the end of each sail line, to which the access requires no additional time or labor costs beyond the blended acquisition. By manually blending these data we obtain training data with good control of the ground truth and fully adapted to the given survey. Furthermore, we train a deep neural network using multi-channel inputs that include adjacent blended shot gathers as additional channels. The prediction of the blending noise is added in as a related and auxiliary task with the main task of the network being the prediction of the primary-source events. Blending noise in the ground truth is scaled down during the training and validation process due to its excessively strong amplitudes. As part of the process, the to-be-deblended shot gathers are aligned by the blending noise. Implementation on field blended-by-acquisition data demonstrates that introducing the suggested data conditioning steps can considerably reduce the leakage of primary-source events in the deep part of the blended section. The complete proposed approach performs almost as well as a conventional algorithm in the shallow section and shows great advantage in efficiency. It performs slightly worse for larger traveltimes, but still removes the blending noise efficiently.
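The idea of manually blending unblended records to obtain training pairs with a known ground truth can be sketched on a single trace. This is a minimal illustration, not the authors' pipeline: the function name `blend`, the trace values, and the delay are hypothetical, and real shot gathers are 2-D with per-shot firing delays.

```python
def blend(primary, secondary, delay):
    """Add `secondary`, delayed by `delay` samples, onto `primary`.

    Samples of the delayed trace that fall outside the record are dropped.
    """
    n = len(primary)
    blended = list(primary)
    for i, value in enumerate(secondary):
        j = i + delay
        if 0 <= j < n:
            blended[j] += value
    return blended

# Two hypothetical unblended traces (e.g. from the end of a sail line).
primary = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
secondary = [0.0, 2.0, 0.0, 0.0, 0.0, 0.0]

blended = blend(primary, secondary, delay=3)

# Because the blending is done numerically, both targets are known exactly:
# the primary-source events and the blending noise.
noise = [b - p for b, p in zip(blended, primary)]
print(blended, noise)
```

A network trained on such pairs could take `blended` as input and predict `primary` as the main task, with `noise` available as the auxiliary target mentioned in the abstract.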
SY Apr 25, 2018
Interpenetrating Cooperative Localization in Dynamic Connected Vehicle Networks
Huajing Zhao, Zhaobin Mo, Macheng Shen et al.
In this paper, we propose the Interpenetrating Cooperative Localization (ICL) method to enhance the localization accuracy in dynamic connected vehicle networks. This mechanism lets the information from one group of connected vehicles interpenetrate to other groups without full communication between all nodes, thus improving the utility of information when the penetration of connected vehicles is low. We tested the approach using the dynamic traffic data collected in the Safety Pilot Model Deployment program in Ann Arbor, Michigan, USA, with dynamically changing networks due to the traveling of vehicles and packet drops of the Dedicated Short-Range Communication. Results show enhancement of localization accuracy, with errors reduced by up to 70% even in complex dynamic scenarios.
SY Sep 16, 2017
Semi-Interpenetrating Cooperative Localization in Connected Vehicle Networks
Macheng Shen, Huajing Zhao, Jing Sun et al.
We proposed a fusion mechanism for the distributed cooperative map matching (CMM) within the vehicular ad-hoc network. This mechanism makes the information from each node reachable within the network by other nodes without direct communication, thus improving the overall localization accuracy and robustness. Each node runs a Rao-Blackwellized particle filter (RBPF) that processes the Global Navigation Satellite System (GNSS) measurements of its own and its neighbors, followed by a map matching step that reduces or eliminates the GNSS atmospheric error. Then each node fuses its own filtered results with those from its neighbors for a better estimation. In this work, the complicated dynamics and fusion mechanics of these RBPFs are represented by a linear dynamical system. We proposed a distributed optimization framework that explores the model to improve both robustness and accuracy of the distributed CMM. The effectiveness of this distributed optimization framework is illustrated by simulation results on realistic vehicular networks drawn from data, compared with the centralized one and a decentralized one with random fusion weights.
NA Oct 26, 2018
Central local discontinuous Galerkin method for the space fractional diffusion equation
Jing Sun, Daxin Nie, Weihua Deng
This paper provides the semi-discrete scheme by the central local discontinuous Galerkin method for space fractional diffusion equation on two sets of overlapping cells, and then we give the stability analysis and error estimates for the scheme. Lastly, we verify the effectiveness of the proposed scheme by the one- and two-dimensional numerical experiments.
SY Apr 18
Experimental Characterization Data for Battery Modules with Parallel-Connected Cells across Diverse Module-Level State of Health and Cell-to-Cell Variations
Qinan Zhou, Daniel Stephens, Jing Sun
This experimental dataset presents both module-level and cell-level characterization data for lithium-ion battery modules composed of three parallel-connected inhomogeneous cells across a wide range of module-level state of health (M-SoH) and cell-to-cell variation (CtCV). First, 70 cells are aged to establish an inventory with cell-level state of health (C-SoH) ranging approximately from 100% to 80% (80% is considered as the end-of-life for automotive applications). From this inventory, 78 battery modules are then assembled, each exhibiting a distinct M-SoH value (from 100% to 80.98%) and a unique CtCV value (from 0% to 9.31%, defined as population standard deviation of C-SoH within each module). Module-level characterization data are collected at 25°C under 0.5C and 0.25C conditions, enabling extraction of module-level capacities and supporting diagnostic analyses such as incremental capacity analysis and differential voltage analysis. Before a module is assembled and tested, cell-level characterization tests are conducted for every individual cell within that module under 1C conditions, enabling direct quantification of CtCV and providing accurate labels for cell-level capacities and internal resistances. The dataset is organized with both raw time-series data and processed summary information such as C-SoH, M-SoH, and CtCV for all modules. With the paired module-level and cell-level characterization data, this dataset enables understanding and development of advanced degradation monitoring mechanisms for battery modules with parallel-connected cells in the presence of CtCVs.
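The CtCV definition given above (population standard deviation of C-SoH within a module) is easy to check by hand. The sketch below computes it for one hypothetical three-cell module; the C-SoH values and the function name `ctcv` are made up for illustration and are not taken from the dataset.

```python
from statistics import pstdev

def ctcv(cell_soh_percent):
    """Cell-to-cell variation: population std. dev. of cell-level SoH (in %)."""
    return pstdev(cell_soh_percent)

# Hypothetical C-SoH values (in %) for three parallel-connected cells.
module = [100.0, 90.0, 80.0]

# Mean is 90, squared deviations are 100, 0, 100, so the population
# variance is 200 / 3 and the CtCV is its square root.
print(round(ctcv(module), 2))  # 8.16
```

A module built from three identical cells has a CtCV of 0%, which matches the lower end of the range reported for the dataset.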
AI Nov 22, 2022
Decision-making with Speculative Opponent Models
Jing Sun, Shuo Chen, Cong Zhang et al.
Opponent modelling has proven effective in enhancing the decision-making of the controlled agent by constructing models of opponent agents. However, existing methods often rely on access to the observations and actions of opponents, a requirement that is infeasible when such information is either unobservable or challenging to obtain. To address this issue, we introduce Distributional Opponent-aided Multi-agent Actor-Critic (DOMAC), the first speculative opponent modelling algorithm that relies solely on local information (i.e., the controlled agent's observations, actions, and rewards). Specifically, the actor maintains a speculated belief about the opponents using the tailored speculative opponent models that predict the opponents' actions using only local information. Moreover, DOMAC features distributional critic models that estimate the return distribution of the actor's policy, yielding a more fine-grained assessment of the actor's quality. This thus more effectively guides the training of the speculative opponent models that the actor depends upon. Furthermore, we formally derive a policy gradient theorem with the proposed opponent models. Extensive experiments under eight different challenging multi-agent benchmark tasks within the MPE, Pommerman and StarCraft Multiagent Challenge (SMAC) demonstrate that our DOMAC successfully models opponents' behaviours and delivers superior performance against state-of-the-art methods with a faster convergence speed.
CL Apr 28, 2024 Code
PatentGPT: A Large Language Model for Intellectual Property
Zilong Bai, Ruiji Zhang, Linqing Chen et al.
In recent years, large language models (LLMs) have attracted significant attention due to their exceptional performance across a multitude of natural language processing tasks, and have been widely applied in various fields. However, the application of large language models in the Intellectual Property (IP) domain is challenging due to the strong need for specialized knowledge, privacy protection, and processing of extremely long text in this field. In this technical report, we present for the first time a low-cost, standardized procedure for training IP-oriented LLMs, meeting the unique requirements of the IP domain. Using this standard process, we have trained the PatentGPT series models based on open-source pretrained models. Evaluated on the open-source IP-oriented benchmark MOZIP, our domain-specific LLMs outperform GPT-4, indicating the effectiveness of the proposed training procedure and the expertise of the PatentGPT models in the IP domain. Remarkably, our model surpassed GPT-4 on the 2019 China Patent Agent Qualification Examination, scoring 65 and matching human expert levels. Additionally, the PatentGPT model, which utilizes the SMoE architecture, achieves performance comparable to that of GPT-4 in the IP domain and demonstrates a better cost-performance ratio on long-text tasks, potentially serving as an alternative to GPT-4 within the IP domain.
CR Oct 17, 2024 Code
FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs
Zhiyuan Wei, Jing Sun, Zijian Zhang et al.
The rapid growth of blockchain technology has driven the widespread adoption of smart contracts. However, their inherent vulnerabilities have led to significant financial losses. Traditional auditing methods, while essential, struggle to keep pace with the increasing complexity and scale of smart contracts. Large Language Models (LLMs) offer promising capabilities for automating vulnerability detection, but their adoption is often limited by high computational costs. Although prior work has explored leveraging large models through agents or workflows, relatively little attention has been given to improving the performance of smaller, fine-tuned models, a critical factor for achieving both efficiency and data privacy. In this paper, we introduce HKT-SmartAudit, a framework for developing lightweight models optimized for smart contract auditing. It features a multi-stage knowledge distillation pipeline that integrates classical distillation, external domain knowledge, and reward-guided learning to transfer high-quality insights from large teacher models. A single-task learning strategy is employed to train compact student models that maintain high accuracy and robustness while significantly reducing computational overhead. Experimental results show that our distilled models outperform both commercial tools and larger models in detecting complex vulnerabilities and logical flaws, offering a practical, secure, and scalable solution for smart contract auditing. The source code is available in a GitHub repository.
LG Feb 27, 2024 Code
Learning Topological Representations with Bidirectional Graph Attention Network for Solving Job Shop Scheduling Problem
Cong Zhang, Zhiguang Cao, Yaoxin Wu et al.
Existing learning-based methods for solving job shop scheduling problems (JSSP) usually use off-the-shelf GNN models tailored to undirected graphs and neglect the rich and meaningful topological structures of disjunctive graphs (DGs). This paper proposes the topology-aware bidirectional graph attention network (TBGAT), a novel GNN architecture based on the attention mechanism, to embed the DG for solving JSSP in a local search framework. Specifically, TBGAT embeds the DG from a forward and a backward view, respectively, where the messages are propagated by following the different topologies of the views and aggregated via graph attention. Then, we propose a novel operator based on the message-passing mechanism to calculate the forward and backward topological sorts of the DG, which are the features for characterizing the topological structures and exploited by our model. In addition, we theoretically and experimentally show that the computational complexity of TBGAT is linear in the number of jobs and in the number of machines, strengthening our method's practical value. Besides, extensive experiments on five synthetic datasets and seven classic benchmarks show that TBGAT achieves new SOTA results by outperforming a wide range of neural methods by a large margin. All the code and data are publicly available online at https://github.com/zcaicaros/TBGAT.
CVJan 13, 2025Code
Aligning First, Then Fusing: A Novel Weakly Supervised Multimodal Violence Detection MethodWenping Jin, Li Zhu, Jing Sun
Weakly supervised violence detection refers to the technique of training models to identify violent segments in videos using only video-level labels. Among these approaches, multimodal violence detection, which integrates modalities such as audio and optical flow, holds great potential. Existing methods in this domain primarily focus on designing multimodal fusion models to address modality discrepancies. In contrast, we take a different approach, leveraging the inherent discrepancies across modalities in violence event representation to propose a novel multimodal semantic feature alignment method. This method sparsely maps the semantic features of local, transient, and less informative modalities (such as audio and optical flow) into the more informative RGB semantic feature space. Through an iterative process, the method identifies the suitable non-zero feature matching subspace and aligns the modality-specific event representations based on this subspace, enabling the full exploitation of information from all modalities during the subsequent modality fusion stage. Building on this, we design a new weakly supervised violence detection framework that consists of unimodal multiple-instance learning for extracting unimodal semantic features, multimodal alignment, multimodal fusion, and final detection. Experimental results on benchmark datasets demonstrate the effectiveness of our method, achieving an average precision (AP) of 86.07% on the XD-Violence dataset. Our code is available at https://github.com/xjpp2016/MAVD.
SEOct 25, 2024Code
MaCTG: Multi-Agent Collaborative Thought Graph for Automatic ProgrammingZixiao Zhao, Jing Sun, Zhe Hou et al.
With the rapid advancement of Large Language Models (LLMs), LLM-based approaches have demonstrated strong problem-solving capabilities across various domains. However, in automatic programming, a single LLM is typically limited to function-level code generation, while multi-agent systems composed of multiple LLMs often suffer from inefficient task planning. This lack of structured coordination can lead to cascading hallucinations, where accumulated errors across agents result in suboptimal workflows and excessive computational costs. To overcome these challenges, we introduce MaCTG (Multi-Agent Collaborative Thought Graph), a novel multi-agent framework that employs a dynamic graph structure to facilitate precise task allocation and controlled collaboration among LLM agents. MaCTG autonomously assigns agent roles based on programming requirements, dynamically refines task distribution through context-aware adjustments, and systematically verifies and integrates project-level code, effectively reducing hallucination errors and improving overall accuracy. MaCTG enhances cost-effectiveness by implementing a hybrid LLM deployment, where proprietary models handle complex reasoning, while open-source models are used for routine coding and validation tasks. To evaluate MaCTG's effectiveness, we applied it to traditional image processing auto-programming tasks, achieving a state-of-the-art accuracy of 83.33%. Additionally, by leveraging its hybrid LLM configuration, MaCTG significantly reduced operational costs by 89.09% compared to existing multi-agent frameworks, demonstrating its efficiency, scalability, and real-world applicability.
15.3LGApr 15
Representation over Routing: Overcoming Surrogate Hacking in Multi-Timescale PPOJing Sun
Temporal credit assignment in reinforcement learning has long been a central challenge. Inspired by the multi-timescale encoding of the dopamine system in neurobiology, recent research has sought to introduce multiple discount factors into Actor-Critic architectures, such as Proximal Policy Optimization (PPO), to balance short-term responses with long-term planning. However, this paper reveals that blindly fusing multi-timescale signals in complex delayed-reward tasks can lead to severe algorithmic pathologies. We systematically demonstrate that exposing a temporal attention routing mechanism to policy gradients results in surrogate objective hacking, while adopting gradient-free uncertainty weighting triggers irreversible myopic degeneration, a phenomenon we term the Paradox of Temporal Uncertainty. To address these issues, we propose a Target Decoupling architecture: on the Critic side, we retain multi-timescale predictions to enforce auxiliary representation learning, while on the Actor side, we strictly isolate short-term signals and update the policy based solely on long-term advantages. Rigorous empirical evaluations across multiple independent random seeds in the LunarLander-v2 environment demonstrate that our proposed architecture achieves statistically significant performance improvements. Without relying on hyperparameter hacking, it consistently surpasses the "Environment Solved" threshold with minimal variance, completely eliminates policy collapse, and escapes the hovering local optima that trap single-timescale baselines.
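To make the decoupling idea concrete, the sketch below computes critic regression targets for several discount factors while deriving the actor's advantage from the longest timescale only. It is an illustrative reading of the abstract, not the paper's implementation: it uses plain Monte Carlo returns rather than whatever advantage estimator the authors use, and the function names and discount values are hypothetical.

```python
def discounted_returns(rewards, gamma):
    """Monte Carlo discounted return G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def critic_targets(rewards, gammas):
    """One regression target sequence per discount factor (the multi-timescale critic heads)."""
    return {gamma: discounted_returns(rewards, gamma) for gamma in gammas}

def actor_advantages(rewards, long_term_values, long_gamma):
    """Advantage from the long-horizon head only; short-horizon heads never reach the actor."""
    returns = discounted_returns(rewards, long_gamma)
    return [g - v for g, v in zip(returns, long_term_values)]

# Hypothetical delayed-reward episode: the only reward arrives at the last step.
rewards = [0.0, 0.0, 0.0, 1.0]
targets = critic_targets(rewards, gammas=[0.5, 0.9, 0.99])
advantages = actor_advantages(rewards, long_term_values=[0.0] * 4, long_gamma=0.99)
print(targets[0.5][0], advantages[0])
```

With the short discount factor the first-step target has nearly vanished, which illustrates why a policy driven by short-term signals alone can become myopic in delayed-reward tasks.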
CROct 8, 2025Code
Distilling Lightweight Language Models for C/C++ VulnerabilitiesZhiyuan Wei, Xiaoxuan Yang, Jing Sun et al.
The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and substantial economic loss. Consequently, robust code vulnerability detection is essential for software security. While Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing, their potential for automated code vulnerability detection remains underexplored. This paper presents FineSec, a novel framework that harnesses LLMs through knowledge distillation to enable efficient and precise vulnerability identification in C/C++ codebases. FineSec utilizes knowledge distillation to transfer expertise from large teacher models to compact student models, achieving high accuracy with minimal computational cost. By integrating data preparation, training, evaluation, and continuous learning into a unified, single-task workflow, FineSec offers a streamlined approach. Extensive evaluations on C/C++ codebases demonstrate its superiority over both base models and larger LLMs in identifying complex vulnerabilities and logical flaws, establishing FineSec as a practical and scalable solution for real-world software security. To facilitate reproducibility, the datasets, source code, and experimental results are made publicly available at: https://github.com/yangxiaoxuan123/FineSec_detect.
8.0SYMar 20
Estimation of Cell-to-Cell Variation and State of Health for Battery Modules with Parallel-Connected CellsQinan Zhou, Jing Sun
Estimating cell-to-cell variation (CtCV) and state of health (SoH) for battery modules with parallel-connected cells is challenging when only module-level signals are measurable and individual cell behaviors remain unobserved. Although progress has been made in SoH estimation, CtCV estimation remains unresolved in the literature. This paper proposes a unified framework that accurately estimates both CtCV and SoH for modules using only module-level information extracted from incremental capacity analysis (ICA) and differential voltage analysis (DVA). With the proposed framework, CtCV and SoH estimations can be decoupled into two separate tasks, allowing each to be solved with dedicated algorithms without mutual interference and providing greater design flexibility. The framework also exhibits strong versatility in accommodating different CtCV metrics, highlighting its general-purpose nature. Experimental validation on modules with three parallel-connected cells demonstrates that the proposed framework can systematically select optimal module-level features for CtCV and SoH estimations, deliver accurate CtCV and SoH estimates with high confidence and low computational complexity, remain effective across different C-rates, and be suitable for onboard implementation.
43.9GEO-PHApr 30
Parameter-Efficient Adaptation of Pre-Trained Vision Foundation Models for Active and Passive Seismic Data DenoisingJiahua Zhao, Umair bin Waheed, Jing Sun et al.
The demand for high-resolution subsurface imaging and continuous Earth monitoring has driven rapid growth in active and passive seismic data from dense geophone deployments, distributed acoustic sensing (DAS) arrays, and large-scale 2D and 3D surveys. This expansion makes complex noise suppression increasingly challenging, especially when signal fidelity must be preserved. Conventional supervised deep learning methods are often task-specific, require large paired datasets, and can suffer from domain shift under new acquisition conditions. Foundation models offer a promising alternative, but pre-training seismic foundation models from scratch requires massive domain-specific data and substantial computation. We propose an efficient framework that repurposes general-purpose Vision Foundation Models (VFMs) for geophysical tasks through Parameter-Efficient Fine-Tuning. The architecture uses a pre-trained VFM, a DINOv3 encoder, adapted with Low-Rank Adaptation (LoRA) to enable effective feature adaptation with few additional parameters. To improve robustness under unseen field conditions without ground truth, we introduce a kurtosis-guided unsupervised test-time adaptation module that updates only LoRA parameters during inference. This module self-calibrates the model to site-specific noise by identifying information-rich regions via kurtosis and performing self-training without labeled data. Experiments on public exploration seismic images and DAS vertical seismic profiling data from the Utah FORGE site show that the framework matches or outperforms domain-specific models. Tests on unseen cross-site data from a land survey in China and the Groß Schönebeck geothermal site in Germany further demonstrate strong generalization and effective signal-noise separation. These results highlight the potential of adapting pre-trained VFMs to data-intensive problems in exploration seismology.
AIDec 9, 2024
The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents: A RoadmapYedi Zhang, Yufan Cai, Xinyue Zuo et al.
Large Language Models (LLMs) have emerged as a transformative AI paradigm, profoundly influencing daily life through their exceptional language understanding and contextual generation capabilities. Despite their remarkable performance, LLMs face a critical challenge: the propensity to produce unreliable outputs due to the inherent limitations of their learning-based nature. Formal methods (FMs), on the other hand, are a well-established computation paradigm that provides mathematically rigorous techniques for modeling, specifying, and verifying the correctness of systems. FMs have been extensively applied in mission-critical software engineering, embedded systems, and cybersecurity. However, the primary challenge impeding the deployment of FMs in real-world settings lies in their steep learning curves, the absence of user-friendly interfaces, and issues with efficiency and adaptability. This position paper outlines a roadmap for advancing the next generation of trustworthy AI systems by leveraging the mutual enhancement of LLMs and FMs. First, we illustrate how FMs, including reasoning and certification techniques, can help LLMs generate more reliable and formally certified outputs. Subsequently, we highlight how the advanced learning capabilities and adaptability of LLMs can significantly enhance the usability, efficiency, and scalability of existing FM tools. Finally, we show that unifying these two computation paradigms -- integrating the flexibility and intelligence of LLMs with the rigorous reasoning abilities of FMs -- has transformative potential for the development of trustworthy AI software systems. We acknowledge that this integration has the potential to enhance both the trustworthiness and efficiency of software engineering practices while fostering the development of intelligent FM tools capable of addressing complex yet real-world challenges.
53.2CRApr 26
Rényi Pufferfish Privacy with Gaussian-based Priors: From Single Gaussian to Mixture ModelWenjin Yang, Ni Ding, Zijian Zhang et al.
Rényi Pufferfish Privacy (RPP) provides a Rényi divergence-based privacy framework for correlated data, but existing $\infty$-Wasserstein mechanisms are often conservative and sacrifice data utility. We study Gaussian mechanisms for RPP under Gaussian and Gaussian-mixture priors. For single Gaussian priors, we derive the exact Rényi divergence after Gaussian perturbation, obtain a relaxed closed-form sufficient condition for $(α,ε)$-RPP, and characterize the monotonicity of the calibrated noise with respect to the privacy budget $ε$ and the Rényi order $α$. To handle more general non-Gaussian and multimodal priors, we approximate secret-conditioned outputs with Gaussian mixture models and introduce an optimal-transport-based sufficient condition for RPP. Experiments on three UCI datasets with statistical (RAW, MEAN) and model-output (BNN, GP) queries show that our prior-aware mechanisms consistently require less noise than a recent RPP additive-noise baseline, achieving an average noise reduction of 48.9%. These results show that our mechanisms can substantially improve the privacy-utility trade-off under RPP.
COMP-PHMar 2, 2025
Insights into dendritic growth mechanisms in batteries: A combined machine learning and computational studyZirui Zhao, Junchao Xia, Si Wu et al.
In recent years, researchers have increasingly sought batteries as an efficient and cost-effective solution for energy storage and supply, owing to their high energy density, low cost, and environmental resilience. However, the issue of dendrite growth has emerged as a significant obstacle in battery development. Excessive dendrite growth during charging and discharging processes can lead to battery short-circuiting, degradation of electrochemical performance, reduced cycle life, and abnormal exothermic events. Consequently, understanding the dendrite growth process has become a key challenge for researchers. In this study, we investigated dendrite growth mechanisms in batteries using a combined machine learning approach, specifically a two-dimensional artificial convolutional neural network (CNN) model, along with computational methods. We developed two distinct computer models to predict dendrite growth in batteries. The CNN-1 model employs standard convolutional neural network techniques for dendritic growth prediction, while CNN-2 integrates additional physical parameters to enhance model robustness. Our results demonstrate that CNN-2 significantly enhances prediction accuracy, offering deeper insights into the impact of physical factors on dendritic growth. This improved model effectively captures the dynamic nature of dendrite formation, exhibiting high accuracy and sensitivity. These findings contribute to the advancement of safer and more reliable energy storage systems.
ROSep 23, 2025
Pure Vision Language Action (VLA) Models: A Comprehensive SurveyDapeng Zhang, Jing Sun, Chenghui Hu et al.
The emergence of Vision Language Action (VLA) models marks a paradigm shift from traditional policy-based control to generalized robotics, reframing Vision Language Models (VLMs) from passive sequence generators into active agents for manipulation and decision-making in complex, dynamic environments. This survey delves into advanced VLA methods, aiming to provide a clear taxonomy and a systematic, comprehensive review of existing research. It presents a comprehensive analysis of VLA applications across different scenarios and classifies VLA approaches into several paradigms (autoregression-based, diffusion-based, reinforcement-based, hybrid, and specialized methods), while examining their motivations, core strategies, and implementations in detail. In addition, foundational datasets, benchmarks, and simulation platforms are introduced. Building on the current VLA landscape, the review further proposes perspectives on key challenges and future directions to advance research in VLA models and generalizable robotics. By synthesizing insights from over three hundred recent studies, this survey maps the contours of this rapidly evolving field and highlights the opportunities and challenges that will shape the development of scalable, general-purpose VLA methods.
IVDec 13, 2024
A Single-Frame and Multi-Frame Cascaded Image Super-Resolution MethodJing Sun, Qiangqiang Yuan, Huanfeng Shen et al.
The objective of image super-resolution is to reconstruct a high-resolution (HR) image with the prior knowledge from one or several low-resolution (LR) images. However, in the real world, due to the limited complementary information, the performance of both single-frame and multi-frame super-resolution reconstruction degrades rapidly as the magnification increases. In this paper, we propose a novel two-step image super-resolution method concatenating multi-frame super-resolution (MFSR) with single-frame super-resolution (SFSR), to progressively upsample images to the desired resolution. The proposed method consists of an L0-norm constrained reconstruction scheme and an enhanced residual back-projection network, integrating the flexibility of the variational model-based method and the feature learning capacity of the deep learning-based method. To verify the effectiveness of the proposed algorithm, extensive experiments with both simulated and real-world sequences were implemented. The experimental results show that the proposed method yields superior performance in both objective and perceptual quality measurements. The average PSNRs of the cascade model on Set5 and Set14 are 33.413 dB and 29.658 dB respectively, which are 0.76 dB and 0.621 dB more than the baseline method. In addition, the experiment indicates that this cascade model can be robustly applied to different SFSR and MFSR methods.
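For readers unfamiliar with the metric quoted above, PSNR is defined as 10·log10(peak² / MSE), where MSE is the mean squared error between the reference and the reconstruction. The sketch below implements that standard definition on flat pixel lists; the 8-bit peak value of 255 is an assumption, and the paper's exact evaluation protocol (colour channel, border cropping) is not given in the abstract.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value; 255 assumes 8-bit images.
    """
    if len(reference) != len(estimate):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 4-pixel images that differ by one grey level everywhere (MSE = 1).
print(psnr([10, 20, 30, 40], [11, 21, 31, 41]))
```

Because the scale is logarithmic, a gain of 0.76 dB corresponds to roughly a 16% reduction in mean squared error (10^(-0.076) ≈ 0.84).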
CRMay 21, 2025
Adaptive Plan-Execute Framework for Smart Contract Security AuditingZhiyuan Wei, Jing Sun, Zijian Zhang et al.
Large Language Models (LLMs) have shown great promise in code analysis and auditing; however, they still struggle with hallucinations and limited context-aware reasoning. We introduce SmartAuditFlow, a novel Plan-Execute framework that enhances smart contract security analysis through dynamic audit planning and structured execution. Unlike conventional LLM-based auditing approaches that follow fixed workflows and predefined steps, SmartAuditFlow dynamically generates and refines audit plans based on the unique characteristics of each smart contract. It continuously adjusts its auditing strategy in response to intermediate LLM outputs and newly detected vulnerabilities, ensuring a more adaptive and precise security assessment. The framework then executes these plans step by step, applying a structured reasoning process to enhance vulnerability detection accuracy while minimizing hallucinations and false positives. To further improve audit precision, SmartAuditFlow integrates iterative prompt optimization and external knowledge sources, such as static analysis tools and Retrieval-Augmented Generation (RAG). This ensures audit decisions are contextually informed and backed by real-world security knowledge, producing comprehensive security reports. Extensive evaluations across multiple benchmarks demonstrate that SmartAuditFlow outperforms existing methods, achieving 100 percent accuracy on common and critical vulnerabilities, 41.2 percent accuracy for comprehensive coverage of known smart contract weaknesses in real-world projects, and successfully identifying all 13 tested CVEs. These results highlight SmartAuditFlow's scalability, cost-effectiveness, and superior adaptability over traditional static analysis tools and contemporary LLM-based approaches, establishing it as a robust solution for automated smart contract auditing.
CLApr 15, 2025
Streamlining Biomedical Research with Specialized LLMsLinqing Chen, Weilei Wang, Yubin Xia et al.
In this paper, we propose a novel system that integrates state-of-the-art, domain-specific large language models with advanced information retrieval techniques to deliver comprehensive and context-aware responses. Our approach facilitates seamless interaction among diverse components, enabling cross-validation of outputs to produce accurate, high-quality responses enriched with relevant data, images, tables, and other modalities. We demonstrate the system's capability to enhance response precision by leveraging a robust question-answering model, significantly improving the quality of dialogue generation. The system provides an accessible platform for real-time, high-fidelity interactions, allowing users to benefit from efficient human-computer interaction, precise retrieval, and simultaneous access to a wide range of literature and data. This dramatically improves the research efficiency of professionals in the biomedical and pharmaceutical domains and facilitates faster, more informed decision-making throughout the R&D process. Furthermore, the system proposed in this paper is available at https://synapse-chat.patsnap.com.
CVDec 13, 2024
Super-Resolution for Remote Sensing Imagery via the Coupling of a Variational Model and Deep LearningJing Sun, Huanfeng Shen, Qiangqiang Yuan et al.
Image super-resolution (SR) is an effective way to enhance the spatial resolution and detail information of remote sensing images, to obtain a superior visual quality. As SR is severely ill-conditioned, effective image priors are necessary to regularize the solution space and generate the corresponding high-resolution (HR) image. In this paper, we propose a novel gradient-guided multi-frame super-resolution (MFSR) framework for remote sensing imagery reconstruction. The framework integrates a learned gradient prior as the regularization term into a model-based optimization method. Specifically, the local gradient regularization (LGR) prior is derived from the deep residual attention network (DRAN) through gradient profile transformation. The non-local total variation (NLTV) prior is characterized using the spatial structure similarity of the gradient patches with the maximum a posteriori (MAP) model. The modeled prior performs well in preserving edge smoothness and suppressing visual artifacts, while the learned prior is effective in enhancing sharp edges and recovering fine structures. By incorporating the two complementary priors into an adaptive norm based reconstruction framework, the mixed L1 and L2 regularization minimization problem is optimized to achieve the required HR remote sensing image. Extensive experimental results on remote sensing data demonstrate that the proposed method can produce visually pleasant images and is superior to several of the state-of-the-art SR algorithms in terms of the quantitative evaluation.
CROct 10, 2025
Goal-oriented Backdoor Attack against Vision-Language-Action Models via Physical ObjectsZirun Zhou, Zhengyang Xiao, Haochuan Xu et al.
Recent advances in vision-language-action (VLA) models have greatly improved embodied AI, enabling robots to follow natural language instructions and perform diverse tasks. However, their reliance on uncurated training datasets raises serious security concerns. Existing backdoor attacks on VLAs mostly assume white-box access and result in task failures instead of enforcing specific actions. In this work, we reveal a more practical threat: attackers can manipulate VLAs by simply injecting physical objects as triggers into the training dataset. We propose goal-oriented backdoor attacks (GoBA), where the VLA behaves normally in the absence of physical triggers but executes predefined and goal-oriented actions in the presence of physical triggers. Specifically, based on a popular VLA benchmark LIBERO, we introduce BadLIBERO that incorporates diverse physical triggers and goal-oriented backdoor actions. In addition, we propose a three-level evaluation that categorizes the victim VLA's actions under GoBA into three states: nothing to do, try to do, and success to do. Experiments show that GoBA enables the victim VLA to successfully achieve the backdoor goal in 97 percent of inputs when the physical trigger is present, while causing zero performance degradation on clean inputs. Finally, by investigating factors related to GoBA, we find that the action trajectory and trigger color significantly influence attack performance, while trigger size has surprisingly little effect. The code and BadLIBERO dataset are accessible via the project page at https://goba-attack.github.io/.
AIApr 26, 2025
Reshaping MOFs text mining with a dynamic multi-agents framework of large language modelZuhong Lin, Daoyuan Ren, Kai Ran et al.
Accurately identifying the synthesis conditions of metal-organic frameworks (MOFs) is essential for guiding experimental design, yet remains challenging because relevant information in the literature is often scattered, inconsistent, and difficult to interpret. We present MOFh6, a large language model driven system that reads raw articles or crystal codes and converts them into standardized synthesis tables. It links related descriptions across paragraphs, unifies ligand abbreviations with full names, and outputs structured parameters ready for use. MOFh6 achieved 99% extraction accuracy, resolved 94.1% of abbreviation cases across five major publishers, and maintained a precision of 0.93 +/- 0.01. Processing a full text takes 9.6 s, locating synthesis descriptions 36 s, with 100 papers processed for USD 4.24. By replacing static database lookups with real-time extraction, MOFh6 reshapes MOF synthesis research, accelerating the conversion of literature knowledge into practical synthesis protocols and enabling scalable, data-driven materials discovery.
GEO-PHJan 26, 2025
Physics-Trained Neural Network as Inverse Problem Solver for Potential Fields: An Example of Downward Continuation between Arbitrary SurfacesJing Sun, Lu Li, Liang Zhang
Downward continuation is a critical task in potential field processing, including gravity and magnetic fields, which aims to transfer data from one observation surface to another that is closer to the source of the field. Its effectiveness directly impacts the success of detecting and highlighting subsurface anomalous sources. We treat downward continuation as an inverse problem that relies on solving a forward problem defined by the formula for upward continuation, and we propose a new physics-trained deep neural network (DNN)-based solution for this task. We hard-code the upward continuation process into the DNN's learning framework, where the DNN itself learns to act as the inverse problem solver and can perform downward continuation without ever being shown any ground truth data. We test the proposed method on both synthetic magnetic data and real-world magnetic data from West Antarctica. The preliminary results demonstrate its effectiveness through comparison with selected benchmarks, opening future avenues for the combined use of DNNs and established geophysical theories to address broader potential field inverse problems, such as density and geometry modelling.
GEO-PHJan 26, 2025
Physics-Driven Self-Supervised Deep Learning for Free-Surface Multiple EliminationJing Sun, Tiexing Wang, Eric Verschuur et al.
In recent years, deep learning (DL) has emerged as a promising alternative approach for various seismic processing tasks, including primary estimation (or multiple elimination), a crucial step for accurate subsurface imaging. In geophysics, DL methods are commonly based on supervised learning from large amounts of high-quality labelled data. Instead of relying on traditional supervised learning, in the context of free-surface multiple elimination, we propose a method in which the DL model learns to effectively parameterize the free-surface multiple-free wavefield from the full wavefield by incorporating the underlying physics into the loss computation. This, in turn, yields high-quality estimates without ever being shown any ground truth data. Currently, the network reparameterization is performed independently for each dataset. We demonstrate its effectiveness through tests on both synthetic and field data. We employ industry-standard Surface-Related Multiple Elimination (SRME) using, respectively, global least-squares adaptive subtraction and local least-squares adaptive subtraction as benchmarks. The comparison shows that the proposed method outperforms the benchmarks in estimation accuracy, achieving the most complete primary estimation and the least multiple energy leakage, but at the cost of a higher computational burden.
CLJun 26, 2024
PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and ChemistryLinqing Chen, Weilei Wang, Zilong Bai et al.
Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general-purpose LLMs often fall short. In this study, we introduce PharmaGPT, a suite of domain-specialized LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus tailored to the Bio-Pharmaceutical and Chemical domains. Our evaluation shows that PharmaGPT surpasses existing general models on specific-domain benchmarks such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. Remarkably, this performance is achieved with a model that has only a fraction, sometimes just one-tenth, of the parameters of general-purpose large models. This advancement establishes a new benchmark for LLMs in the bio-pharmaceutical and chemical fields, addressing the existing gap in specialized language modeling. It also suggests a promising path for enhanced research and development, paving the way for more precise and effective NLP applications in these areas.
LGOct 1, 2021
Artificial Neural Network and its Application Research Progress in DistillationJing Sun, Qi Tang
Artificial neural networks learn various rules and algorithms to form different ways of processing information, and have been widely used in various chemical processes. As rectification technology has developed, its production scale has continued to expand and its calculation requirements have become more stringent. Because artificial neural networks offer self-learning, associative storage, and high-speed search for optimized solutions, they can make high-precision simulation predictions for rectification operations, and are therefore widely used for rectification in the chemical field. This article gives a basic overview of artificial neural networks and introduces research on their application to distillation at home and abroad.
SDAug 6, 2021
An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder ArchitecturesDengfeng Ke, Yuxing Lu, Xudong Liu et al.
With the rapid development of neural network architectures and speech processing models, singing voice synthesis with neural networks is becoming the cutting-edge technique of digital music production. In this work, in order to explore how to improve the quality and efficiency of singing voice synthesis, we use encoder-decoder neural models and a number of vocoders to achieve singing voice synthesis. We conduct experiments to demonstrate that the models can be trained using voice data with pitch information, lyrics, and beat information, and that the trained models can produce smooth, clear, and natural singing voice that is close to real human voice. As the models work in an end-to-end manner, they allow users who are not domain experts to directly produce singing voice by arranging pitches, lyrics, and beats.
LGJan 25, 2021
A Unified Joint Maximum Mean Discrepancy for Domain AdaptationWei Wang, Baopu Li, Shuhui Yang et al.
Domain adaptation has received a lot of attention in recent years, and many algorithms have been proposed with impressive progress. However, the distance between joint probability distributions P(X, Y) has not been fully explored for this problem, because its empirical estimate derived from the maximum mean discrepancy (the joint maximum mean discrepancy, JMMD) involves a complex tensor-product operator that is hard to manipulate. To solve this issue, this paper theoretically derives a unified form of JMMD that is easy to optimize, and proves that the marginal, class-conditional, and weighted class-conditional probability distribution distances are special cases of it with different label kernels. Among these, the weighted class-conditional distance not only realizes category-level feature alignment across domains, but also deals with imbalanced datasets using the class prior probabilities. From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence (discriminability) that benefits classification, and that it is sensitive to label distribution shift when the label kernel is the weighted class-conditional one. Therefore, we leverage the Hilbert-Schmidt independence criterion and propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift. Finally, we conduct extensive experiments on several cross-domain datasets to demonstrate the validity and effectiveness of the revealed theoretical results.
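For orientation, the quantity that JMMD builds on can be sketched in a few lines. The example below is a minimal illustration of the plain (marginal) squared MMD between source and target features with an RBF kernel; it is not the paper's unified JMMD, and the function names `rbf_kernel` and `mmd2` as well as the bandwidth parameter `gamma` are assumptions made for this sketch.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(xs, xt, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two samples.

    xs: (n_s, d) source features, xt: (n_t, d) target features.
    This is the marginal distance only; labels are not used.
    """
    k_ss = rbf_kernel(xs, xs, gamma).mean()
    k_tt = rbf_kernel(xt, xt, gamma).mean()
    k_st = rbf_kernel(xs, xt, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(200, 2))
    target_same = rng.normal(size=(200, 2))           # same distribution
    target_shifted = rng.normal(size=(200, 2)) + 3.0  # shifted distribution
    print(mmd2(source, target_same, gamma=0.5))
    print(mmd2(source, target_shifted, gamma=0.5))
```

The estimate is close to zero when both samples come from the same distribution and grows as the distributions move apart, which is why it is used as an alignment loss in domain adaptation.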
CVMay 8, 2020
Sparsely-Labeled Source Assisted Domain AdaptationWei Wang, Zhihui Wang, Yuankai Xiang et al.
Domain Adaptation (DA) aims to generalize the classifier learned from the source domain to the target domain. Existing DA methods usually assume that rich labels are available in the source domain. However, the source domain often contains a large amount of unlabeled data but only a few labeled samples, and how to transfer knowledge from this sparsely-labeled source domain to the target domain is still a challenge, which greatly limits the application of these methods in the wild. This paper proposes a novel Sparsely-Labeled Source Assisted Domain Adaptation (SLSA-DA) algorithm to address the challenge of limited labeled source domain samples. Specifically, because of the label scarcity problem, projected clustering is conducted on both the source and target domains, so that the discriminative structures of the data can be leveraged. Label propagation is then adopted to progressively propagate the labels from the limited labeled source samples to all of the unlabeled data, so that the cluster labels are revealed correctly. Finally, we jointly align the marginal and conditional distributions to mitigate the cross-domain mismatch problem, and optimize these three procedures iteratively. However, it is nontrivial to incorporate the three procedures into a unified optimization framework seamlessly, since some variables to be optimized are implicitly involved in their formulas, so the procedures cannot reinforce one another. Remarkably, we prove that the projected clustering and conditional distribution alignment can be reformulated as different expressions, so that the implicit variables are revealed in different optimization steps. As such, the variables related to the three procedures can be optimized in a unified framework and facilitate one another, noticeably improving recognition performance.
MLMar 20, 2020
aphBO-2GP-3B: A budgeted asynchronous parallel multi-acquisition functions for constrained Bayesian optimization on high-performing computing architectureAnh Tran, Mike Eldred, Tim Wildey et al.
High-fidelity complex engineering simulations are highly predictive, but also computationally expensive and often require substantial computational effort. The computational burden is usually mitigated through parallelism in high-performance cluster (HPC) architecture. In this paper, an asynchronous constrained batch-parallel Bayesian optimization method is proposed to efficiently solve computationally expensive simulation-based optimization problems on the HPC platform with a budgeted computational resource, where the maximum number of simulations is a constant. The advantages of this method are three-fold. First, the efficiency of the Bayesian optimization is improved: multiple input locations are evaluated in a massively parallel, asynchronous manner to accelerate the optimization convergence with respect to physical runtime. This efficiency is further improved because, as soon as the evaluation of any input finishes, another input is queried without waiting for the whole batch to complete. Second, the method can handle both known and unknown constraints. Third, the proposed method considers several acquisition functions at the same time and samples among them according to an evolving probability mass function using a modified GP-Hedge scheme, where the parameters correspond to the performance of each acquisition function. The proposed framework is termed aphBO-2GP-3B, which corresponds to asynchronous parallel hedge Bayesian optimization with two Gaussian processes and three batches. The aphBO-2GP-3B framework is demonstrated using two high-fidelity expensive industrial applications, where the first one is based on finite element analysis (FEA) and the second one is based on computational fluid dynamics (CFD) simulations.
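The idea of sampling among several acquisition functions from an evolving probability mass function can be sketched as follows. This is a minimal illustration of a standard Hedge-style softmax over cumulative gains, not the modified GP-Hedge scheme used in aphBO-2GP-3B; the names `hedge_probabilities` and `pick_acquisition`, the learning rate `eta`, and the three acquisition labels are assumptions for this sketch.

```python
import numpy as np

def hedge_probabilities(gains, eta=1.0):
    """Turn cumulative gains (one per acquisition function) into a
    probability mass function via a numerically stable softmax."""
    g = np.asarray(gains, dtype=float)
    # Subtracting the max leaves the softmax unchanged but avoids overflow.
    weights = np.exp(eta * (g - g.max()))
    return weights / weights.sum()

def pick_acquisition(gains, rng, eta=1.0):
    """Sample the index of the acquisition function to use next."""
    p = hedge_probabilities(gains, eta)
    return int(rng.choice(len(p), p=p))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    names = ["EI", "PI", "UCB"]   # hypothetical acquisition portfolio
    gains = [0.2, 0.1, 0.9]       # hypothetical cumulative rewards so far
    print(dict(zip(names, hedge_probabilities(gains))))
    print(names[pick_acquisition(gains, rng)])
```

Acquisition functions whose past suggestions performed better accumulate larger gains and are therefore sampled more often, while the others keep a nonzero probability of being tried.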
SEMar 4, 2020
Measuring the Quality of B Abstract Machines with ISO/IEC 25010Cheng-Hao Cai, Jing Sun, Gillian Dobbie
The B method has facilitated the development of software by specifying the design of software as abstract machines and formally verifying the correctness of the abstract machines. The quality of B abstract machines can significantly impact the quality of final software products. In this paper, we propose a set of criteria for measuring the quality of B abstract machines based on ISO/IEC 25010, which is one of the latest international standards for evaluating software quality in software engineering. These criteria evaluate abstract machines using a number of general-purpose and domain-independent equations and model checking techniques, so that the quality of abstract machines can be quantified as vectors. The proposed criteria are implemented as a B model quality evaluator, and they are explained and justified using a number of examples.