DBAIDec 11, 2024

Efficient Dynamic Attributed Graph Generation

arXiv:2412.08810v19 citationsh-index: 9ICDE
Originality Incremental advance
AI Analysis

This work addresses a domain-specific problem for data management and graph analysis applications, representing an incremental improvement over existing dynamic graph generation methods.

The paper tackles the problem of generating dynamic attributed graphs that capture co-evolution of structure and attributes, addressing limitations in existing methods such as inefficiency in temporal random walk approaches. The proposed VRDAG framework reduces synthesis time significantly compared to state-of-the-art methods, as demonstrated through experiments on real-world datasets.

Data generation is a fundamental research problem in data management due to its diverse use cases, ranging from testing database engines to data-specific applications. However, real-world entities often involve complex interactions that cannot be effectively modeled by traditional tabular data. Therefore, graph data generation has attracted increasing attention recently. Although various graph generators have been proposed in the literature, there are three limitations: i) They cannot capture the co-evolution pattern of graph structure and node attributes. ii) Few of them consider edge direction, leading to substantial information loss. iii) Current state-of-the-art dynamic graph generators are based on the temporal random walk, making the simulation process time-consuming. To fill the research gap, we introduce VRDAG, a novel variational recurrent framework for efficient dynamic attributed graph generation. Specifically, we design a bidirectional message-passing mechanism to encode both directed structural knowledge and attribute information of a snapshot. Then, the temporal dependency in the graph sequence is captured by a recurrence state updater, generating embeddings that can preserve the evolution pattern of early graphs. Based on the hidden node embeddings, a conditional variational Bayesian method is developed to sample latent random variables at the neighboring timestep for new snapshot generation. The proposed generation paradigm avoids the time-consuming path sampling and merging process in existing random walk-based methods, significantly reducing the synthesis time. Finally, comprehensive experiments on real-world datasets are conducted to demonstrate the effectiveness and efficiency of the proposed model.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes