Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
This work addresses the challenge of efficient, scalable graph generation for domains such as molecular design, though it builds incrementally on existing transformer and sequence-modeling techniques.
The authors tackle scalable attributed graph generation by introducing AutoGraph, an autoregressive transformer model that flattens graphs into sequences, achieving state-of-the-art performance with up to 100x faster generation and 3x faster training than leading diffusion models.
We introduce AutoGraph, a scalable autoregressive model for attributed graph generation using decoder-only transformers. By flattening graphs into random sequences of tokens through a reversible process, AutoGraph enables modeling graphs as sequences without relying on additional node features that are expensive to compute, in contrast to diffusion-based approaches. This results in sampling complexity and sequence lengths that scale linearly with the number of edges, which is optimal, making the model scalable and efficient for large, sparse graphs. A key success factor of AutoGraph is that its sequence prefixes represent induced subgraphs, creating a direct link to sub-sentences in language modeling. Empirically, AutoGraph achieves state-of-the-art performance on synthetic and molecular benchmarks, with up to 100x faster generation and 3x faster training than leading diffusion models. It also supports substructure-conditioned generation without fine-tuning and shows promising transferability, bridging language modeling and graph generation to lay the groundwork for graph foundation models. Our code is available at https://github.com/BorgwardtLab/AutoGraph.
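To make the idea of reversible flattening concrete, the following is a minimal illustrative sketch, not AutoGraph's actual tokenization. It assumes a simple undirected graph without self-loops, stored as a dict mapping each node to its set of neighbours. The function names `flatten` and `unflatten` and the separator token `SEP` are hypothetical. The sketch walks randomly along unused edges, starts a new segment whenever the walk gets stuck, and recovers the graph by reading consecutive tokens as edges. Each edge is emitted exactly once, so the sequence length grows linearly with the number of edges (plus isolated nodes). Unlike the scheme described above, this simplified version does not guarantee that every prefix corresponds to an induced subgraph, and it ignores node and edge attributes.

```python
import random

SEP = "/"  # hypothetical separator token; assumed distinct from all node labels


def flatten(adj, rng):
    """Flatten a simple undirected graph into a random token sequence.

    adj: dict mapping each node to the set of its neighbours (no self-loops).
    rng: a random.Random instance controlling the random traversal.
    """
    # Per-node sets of neighbours whose connecting edge is not yet emitted.
    remaining = {u: set(vs) for u, vs in adj.items()}
    order = sorted(adj)
    rng.shuffle(order)

    segments = []
    for start in order:
        if not adj[start]:
            segments.append([start])  # isolated node: one-token segment
            continue
        # Keep starting walks from this node until all its edges are used.
        while remaining[start]:
            segment, current = [start], start
            while remaining[current]:
                nxt = rng.choice(sorted(remaining[current]))
                remaining[current].discard(nxt)
                remaining[nxt].discard(current)
                segment.append(nxt)
                current = nxt
            segments.append(segment)

    tokens = []
    for i, segment in enumerate(segments):
        if i > 0:
            tokens.append(SEP)
        tokens.extend(segment)
    return tokens


def unflatten(tokens):
    """Rebuild the adjacency dict: consecutive tokens in a segment form an edge."""
    adj = {}
    prev = None
    for tok in tokens:
        if tok == SEP:
            prev = None  # a separator breaks the walk
            continue
        adj.setdefault(tok, set())
        if prev is not None:
            adj[prev].add(tok)
            adj[tok].add(prev)
        prev = tok
    return adj


# Example: a triangle (0, 1, 2) with a pendant node 3 and an isolated node 4.
example_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
example_tokens = flatten(example_graph, random.Random(0))
example_roundtrip = unflatten(example_tokens)
```

Different random seeds give different sequences for the same graph, and `unflatten` maps each of them back to the original adjacency structure. This many-to-one relationship is what "random sequences" and "reversible" mean in this sketch.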