Yan Tan

AR
4papers
2citations
Novelty63%
AI Score51

4 Papers

91.5ARApr 20Code
VerilogCL: A Contrastive Learning Framework for Robust LLM-Based Verilog Generation

Yan Tan, Tong Liu, Xiangchen Meng et al.

Large Language Models (LLMs) have recently achieved strong performance in software code generation. However, applying them to hardware description languages (HDLs), such as Verilog, remains challenging because high-quality training data are relatively scarce. In practice, LLM-generated Verilog often contains syntactic or structural errors that either cause compilation failures or produce functionally incorrect designs, which limit its reliability in hardware design workflows. In this work, we propose VerilogCL, an integrated framework that enhances Verilog code generation by explicitly learning the boundary between correct and erroneous RTL through contrastive learning and proactive error screening. Our approach introduces minimal-error data augmentation, generating paired training samples of correct RTL and minimally perturbed erroneous RTL to teach the model to recognize fine-grained distinctions between correct and erroneous code. We then apply contrastive learning to learn a clearer validity boundary in the representation space, improving the separation between correct and erroneous RTL code. In addition, we introduce a proactive screening module that combines semantic embeddings with token-level uncertainty features to filter low-confidence candidates during generation. Experiments on public benchmarks, including VerilogEval and RTLLM, show that our 7B-parameter model outperforms the evaluated open-source, Verilog-specialized, and commercial baselines in both compilation success rate and functional correctness.

73.0CVApr 29
TRIP-Evaluate: An Open Multimodal Benchmark for Evaluating Large Models in Transportation

Han Gong, Zhen Zhou, Yunyang Shi et al.

Large language models (LLMs) and multimodal large models (MLLMs) are increasingly used for transportation tasks such as regulation question answering, traffic management support, engineering review, and autonomous-driving scene reasoning. Yet transportation workflows are rule-intensive, computation-intensive, safety-critical, and inherently multimodal. Existing general benchmarks provide limited evidence of whether a model can apply regulations correctly, perform verifiable engineering calculations, or interpret traffic scenes reliably, while the small number of public transportation benchmarks remain narrow in scope and rarely support fine-grained diagnosis across text, images, and point-cloud data. To address this gap, we present TRIP-Evaluate, an open multimodal benchmark for large models in transportation. The benchmark organizes 837 items using a role-task-knowledge taxonomy that covers vehicle, traffic-management, traveler, and planning-and-design functions. Each item is annotated with capability, modality, and difficulty labels, enabling diagnosis from overall accuracy down to specific failure modes. The current release includes 596 text items, 198 image items, and 43 point-cloud items. TRIP-Evaluate also standardizes item construction, quality control, prompting, decoding, and scoring to improve cross-model comparability. Results on a diverse panel of models show that text-based performance is improving, but substantial weaknesses remain in multi-step engineering calculation, rule-constrained reasoning, multimodal scene understanding, and point-cloud understanding. Overall, TRIP-Evaluate provides a reproducible, diagnosable, and engineering-aligned evaluation baseline for model selection, regression testing, and safer deployment in transportation applications.

91.0PLMar 12
AutoVeriFix+: High-Correctness RTL Generation via Trace-Aware Causal Fix and Semantic Redundancy Pruning

Yan Tan, Xiangchen Meng, Zijun Jiang et al.

Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as Python and C++. However, their application to hardware description languages, such as Verilog, is challenging due to the scarcity of high-quality training data. Current approaches to Verilog code generation using LLMs often focus on syntactic correctness, resulting in code with functional errors. To address these challenges, we propose AutoVeriFix+, a novel three-stage framework that integrates high-level semantic reasoning with state-space exploration to enhance functional correctness and design efficiency. In the first stage, an LLM is employed to generate high-level Python reference models that define the intended circuit behavior. In the second stage, another LLM generates initial Verilog RTL candidates and iteratively fixes syntactic errors. In the third stage, we introduce a Concolic testing engine to exercise deep sequential logic and identify corner-case vulnerabilities. With cycle-accurate execution traces and internal register snapshots, AutoVeriFix+ provides the LLM with the causal context necessary to resolve complex state-transition errors. Furthermore, it will generate a coverage report to identify functionally redundant branches, enabling the LLM to perform semantic pruning for area optimization. Experimental results demonstrate that AutoVeriFix+ achieves over 80% functional correctness on rigorous benchmarks, reaching a pass@10 score of 90.2% on the VerilogEval-machine dataset. In addition, it eliminates an average of 25% redundant logic across benchmarks through trace-aware optimization.

87.6NAApr 3
A multiphase cubic MARS method for fourth- and higher-order interface tracking of two or more materials with arbitrary topology and geometry

Yan Tan, Yixiao Qian, Zhiqi Li et al.

For interface tracking of an arbitrary number of materials in two dimensions, we propose a multiphase cubic MARS method that (a) represents the topology and geometry of the interface via graphs, cycles, and cubic splines, (b) applies to any number of materials with arbitrarily complex topology and geometry, (c) maintains an $(r,h)$-regularity of the interface so that the distance between any pair of adjacent markers is within a user-specified range, (d) distributes the markers adaptively along the interface so that arcs with high curvature are resolved by densely populated markers, and (e) achieves fourth-, sixth-, and eighth-order accuracy both in time and in space.} In particular, all possible types of junctions, which pose challenges to VOF methods and level-set methods, are handled with ease. Results of a variety of benchmark tests confirm the analysis and demonstrate the superior accuracy, efficiency, and versatility of the proposed method.