Cheng Tan

h-index17

6papers

119citations

Novelty44%

AI Score49

Ranked #24,755 of 194,257 authors (top 13%)#5,924 in LG (top 15%)

6 Papers

10.0ARMay 30Code

MACO: A Multi-Agent LLM Framework for Automated CGRA Hardware/Software Co-Design

Zesong Jiang, Yuqi Sun, Qing Zhong et al.

Designing optimal Coarse-Grained Reconfigurable Arrays (CGRAs) requires navigating a vast, interdependent hardware/software space bottlenecked by costly manual iteration. We present MACO, an open-source, multi-agent LLM framework that automates CGRA HW/SW co-design. MACO decomposes the design loop into four collaborative stages, HW/SW Co-design, Error Correction, Best-Design Selection, and Evaluation & Feedback, to iteratively optimize power, performance, and area (PPA). To accelerate convergence and efficiently traverse the design space, MACO introduces an exponentially decaying exploration strategy, EDA-guided LLM self-learning, and robust rule-based error correction. Evaluated against state-of-the-art baselines, MACO reduces power consumption by 25.9%, improves performance by 20.0%, and accelerates the search process by 5x. Finally, we validate MACO's physical design through a complete 7nm ASIC design flow.

4.6LGSep 17, 2024

AutoSpec: Automated Generation of Neural Network Specifications

Shuowei Jin, Francis Y. Yan, Cheng Tan et al.

The increasing adoption of neural networks in learning-augmented systems highlights the importance of model safety and robustness, particularly in safety-critical domains. Despite progress in the formal verification of neural networks, current practices require users to manually define model specifications -- properties that dictate expected model behavior in various scenarios. This manual process, however, is prone to human error, limited in scope, and time-consuming. In this paper, we introduce AutoSpec, the first framework to automatically generate comprehensive and accurate specifications for neural networks in learning-augmented systems. We also propose the first set of metrics for assessing the accuracy and coverage of model specifications, establishing a benchmark for future comparisons. Our evaluation across four distinct applications shows that AutoSpec outperforms human-defined specifications as well as two baseline approaches introduced in this study.

6.2CVDec 1, 2025

Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights

Juanxi Tian, Siyuan Li, Conghui He et al.

Current multimodal models aim to transcend the limitations of single-modality representations by unifying understanding and generation, often using text-to-image (T2I) tasks to calibrate semantic consistency. However, their reliance on static, single-image generation in training and evaluation leads to overfitting to static pattern matching and semantic fusion, while fundamentally hindering their ability to model dynamic processes that unfold over time. To address these constraints, we propose Envision-a causal event progression benchmark for chained text-to-multi-image generation. Grounded in world knowledge and structured by spatiotemporal causality, it reorganizes existing evaluation dimensions and includes 1,000 four-stage prompts spanning six scientific and humanities domains. To transition evaluation from single images to sequential frames and assess whether models truly internalize world knowledge while adhering to causal-temporal constraints, we introduce Envision-Score, a holistic metric integrating multi-dimensional consistency, physicality, and aesthetics. Comprehensive evaluation of 15 models (10 specialized T2I models, 5 unified models) uncovers: specialized T2I models demonstrate proficiency in aesthetic rendering yet lack intrinsic world knowledge. Unified multimodal models bridge this gap, consistently outperforming specialized counterparts in causal narrative coherence. However, even these unified architectures remain subordinate to closed-source models and struggle to overcome the core challenge of spatiotemporal consistency. This demonstrates that a focus on causally-isolated single images impedes multi-frame reasoning and generation, promoting static pattern matching over dynamic world modeling-ultimately limiting world knowledge internalization, generation.

6.4LGOct 19, 2024Code

FlexMol: A Flexible Toolkit for Benchmarking Molecular Relational Learning

Sizhe Liu, Jun Xia, Lecheng Zhang et al.

Molecular relational learning (MRL) is crucial for understanding the interaction behaviors between molecular pairs, a critical aspect of drug discovery and development. However, the large feasible model space of MRL poses significant challenges to benchmarking, and existing MRL frameworks face limitations in flexibility and scope. To address these challenges, avoid repetitive coding efforts, and ensure fair comparison of models, we introduce FlexMol, a comprehensive toolkit designed to facilitate the construction and evaluation of diverse model architectures across various datasets and performance metrics. FlexMol offers a robust suite of preset model components, including 16 drug encoders, 13 protein sequence encoders, 9 protein structure encoders, and 7 interaction layers. With its easy-to-use API and flexibility, FlexMol supports the dynamic construction of over 70, 000 distinct combinations of model architectures. Additionally, we provide detailed benchmark results and code examples to demonstrate FlexMol's effectiveness in simplifying and standardizing MRL model development and comparison.

2.3AIDec 4, 2024

Specification Generation for Neural Networks in Systems

Isha Chaudhary, Shuyi Lin, Cheng Tan et al.

Specifications - precise mathematical representations of correct domain-specific behaviors - are crucial to guarantee the trustworthiness of computer systems. With the increasing development of neural networks as computer system components, specifications gain more importance as they can be used to regulate the behaviors of these black-box models. Traditionally, specifications are designed by domain experts based on their intuition of correct behavior. However, this is labor-intensive and hence not a scalable approach as computer system applications diversify. We hypothesize that the traditional (aka reference) algorithms that neural networks replace for higher performance can act as effective proxies for correct behaviors of the models, when available. This is because they have been used and tested for long enough to encode several aspects of the trustworthy/correct behaviors in the underlying domain. Driven by our hypothesis, we develop a novel automated framework, SpecTRA to generate specifications for neural networks using references. We formulate specification generation as an optimization problem and solve it with observations of reference behaviors. SpecTRA clusters similar observations into compact specifications. We present specifications generated by SpecTRA for neural networks in adaptive bit rate and congestion control algorithms. Our specifications show evidence of being correct and matching intuition. Moreover, we use our specifications to show several unknown vulnerabilities of the SOTA models for computer systems.

5.1DCMar 28, 2018

Synergy: A HW/SW Framework for High Throughput CNNs on Embedded Heterogeneous SoC

Guanwen Zhong, Akshat Dubey, Tan Cheng et al.

Convolutional Neural Networks (CNN) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and IoT devices have necessitated real-time, energy-efficient deep neural network inference on embedded-class, resource-constrained platforms. In this context, we present {\em Synergy}, an automated, hardware-software co-designed, pipelined, high-throughput CNN inference framework on embedded heterogeneous system-on-chip (SoC) architectures (Xilinx Zynq). {\em Synergy} leverages, through multi-threading, all the available on-chip resources, which includes the dual-core ARM processor along with the FPGA and the NEON SIMD engines as accelerators. Moreover, {\em Synergy} provides a unified abstraction of the heterogeneous accelerators (FPGA and NEON) and can adapt to different network configurations at runtime without changing the underlying hardware accelerator architecture by balancing workload across accelerators through work-stealing. {\em Synergy} achieves 7.3X speedup, averaged across seven CNN models, over a well-optimized software-only solution. {\em Synergy} demonstrates substantially better throughput and energy-efficiency compared to the contemporary CNN implementations on the same SoC architecture.