LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
This work addresses the problem of generating synthetic DAGs for benchmarking computing systems while preserving intellectual property, representing an incremental advancement in DAG generative models.
The paper tackles the challenge of generating realistic directed acyclic graphs (DAGs) for domains such as hardware synthesis and computing system benchmarking. It introduces LayerDAG, an autoregressive diffusion model that outperforms existing methods in expressiveness and generalization, particularly for large-scale DAGs with up to 400 nodes. The synthetic DAGs it generates also improve the training of ML-based surrogate models, yielding more accurate performance prediction.
Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. DAG generative models facilitate the creation of synthetic DAGs, which can be used for benchmarking computing systems while preserving intellectual property. However, generating realistic DAGs is challenging due to their inherent directional and logical dependencies. This paper introduces LayerDAG, an autoregressive diffusion model, to address these challenges. LayerDAG decouples the strong node dependencies into manageable units that can be processed sequentially. By interpreting the partial order of nodes as a sequence of bipartite graphs, LayerDAG leverages autoregressive generation to model directional dependencies and employs diffusion models to capture logical dependencies within each bipartite graph. Comparative analyses demonstrate that LayerDAG outperforms existing DAG generative models in both expressiveness and generalization, particularly for generating large-scale DAGs with up to 400 nodes, a critical scenario for system benchmarking. Extensive experiments on both synthetic and real-world flow graphs from various computing platforms show that LayerDAG generates valid DAGs with superior statistical properties and benchmarking performance. The synthetic DAGs generated by LayerDAG enhance the training of ML-based surrogate models, resulting in improved accuracy in predicting performance metrics of real-world DAGs across diverse computing platforms.
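To make the idea of reading a partial order as a sequence of bipartite graphs more concrete, the following Python sketch splits a DAG into layers and groups its edges by the layer of their destination node. It is only an illustration under stated assumptions, not the paper's implementation: the layering rule (source nodes form layer 0, and every other node sits one layer past its deepest predecessor) is assumed, the function names `layerize` and `bipartite_steps` are hypothetical, and the diffusion model that would generate each step is not shown.

```python
from collections import defaultdict, deque


def layerize(num_nodes, edges):
    """Return a layer index per node (assumed rule: longest path from a source).

    Source nodes get layer 0; any other node gets 1 + the maximum layer of
    its predecessors. Raises ValueError if the edges contain a cycle.
    """
    succs = defaultdict(list)
    indeg = [0] * num_nodes
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    layer = [0] * num_nodes
    queue = deque(n for n in range(num_nodes) if indeg[n] == 0)
    visited = 0
    while queue:  # Kahn's topological traversal
        u = queue.popleft()
        visited += 1
        for v in succs[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    if visited != num_nodes:
        raise ValueError("graph contains a cycle")
    return layer


def bipartite_steps(num_nodes, edges):
    """Group nodes and incoming edges by layer.

    Step k holds the nodes introduced in layer k and the edges that connect
    already-placed nodes (layers < k) to those new nodes, i.e. one bipartite
    graph per autoregressive step.
    """
    layer = layerize(num_nodes, edges)
    depth = max(layer, default=-1) + 1
    steps = [{"new_nodes": [], "edges": []} for _ in range(depth)]
    for n in range(num_nodes):
        steps[layer[n]]["new_nodes"].append(n)
    for u, v in edges:
        steps[layer[v]]["edges"].append((u, v))
    return steps


if __name__ == "__main__":
    # Toy DAG: 0 -> 2, 1 -> 2, 0 -> 3, 2 -> 3
    for k, step in enumerate(bipartite_steps(4, [(0, 2), (1, 2), (0, 3), (2, 3)])):
        print(k, step)
```

Under this assumed layering, every edge points from a strictly earlier layer into the layer of its destination, so each step depends only on what has already been placed. That is the property that lets the steps be processed sequentially, with a separate model handling the dependencies inside each step.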