Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models
This work addresses the limited scalability and novelty of current graph generative models, two properties that are critical for applications such as molecular discovery and circuit design.
The paper proposes a lightweight autoregressive framework for graph generation that uses structure-guided topological ordering and a two-phase training strategy to improve scalability and novelty. It achieves near log-linear complexity and outperforms baselines on novelty while maintaining high validity and uniqueness.
Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cybersecurity, and beyond. However, current graph generative models remain limited by scalability and novelty. Diffusion-based methods often require costly full-adjacency operations and long denoising chains, while many autoregressive and hybrid models have at least quadratic complexity. In addition, these models often imitate training graphs rather than generalize beyond them. We propose a lightweight autoregressive framework to address these issues. It uses a structure-guided topological ordering to serialize graphs into regular edge sequences, enabling near log-linear generation, and a two-phase training strategy that combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. Experiments on molecular and non-molecular benchmarks show that our approach improves novelty while preserving high validity and uniqueness. The framework also supports both LSTM and Mamba-style causal sequence backbones, with large-memory accelerators enabling longer graph-sequence experiments beyond typical GPU limits.
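To make the idea of serializing a graph into a regular edge sequence concrete, the following is a minimal sketch and not the ordering used in the paper, which the abstract does not specify. It assumes a breadth-first traversal with degree-based tie-breaking as a stand-in for the structure-guided topological ordering. The function name `bfs_edge_sequence` and the adjacency-dict input format are hypothetical choices made for illustration.

```python
from collections import deque


def bfs_edge_sequence(adj):
    """Serialize an undirected graph into a deterministic edge sequence.

    ASSUMPTION: BFS with (degree, label) tie-breaking stands in for the
    paper's structure-guided ordering, which is not described here.

    adj: dict mapping each node to an iterable of its neighbors. The graph
         is assumed undirected (symmetric adjacency) with mutually
         comparable node labels.
    Returns a list of (earlier, later) pairs of new node indices, grouped
    by the later endpoint so every node arrives together with its edges
    to previously placed nodes.
    """
    def key(v):
        # Tie-break rule: low-degree nodes first, then by original label.
        return (len(adj[v]), v)

    order = {}  # original label -> position in the traversal
    for seed in sorted(adj, key=key):  # also covers disconnected graphs
        if seed in order:
            continue
        order[seed] = len(order)
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            for v in sorted(adj[u], key=key):
                if v not in order:
                    order[v] = len(order)
                    queue.append(v)

    # Keep each undirected edge once, oriented from earlier to later node.
    edges = [
        (order[u], order[v])
        for u in adj
        for v in adj[u]
        if order[u] < order[v]
    ]
    # Sort by the later endpoint, then the earlier one.
    edges.sort(key=lambda e: (e[1], e[0]))
    return edges


# Usage: a 4-cycle 0-1-2-3-0.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
sequence = bfs_edge_sequence(cycle)
```

A causal sequence model such as an LSTM could then be trained to predict each pair from the preceding ones. Because this sketch relies only on traversal and sorting, its cost is O((n + m) log(n + m)) for n nodes and m edges. That is in the same spirit as the near log-linear generation described above, although the complexity analysis in the paper may differ.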