LG, Sep 14, 2021

Sum-Product-Attention Networks: Leveraging Self-Attention in Probabilistic Circuits

Zhongjie Yu, Devendra Singh Dhami, Kristian Kersting

arXiv:2109.06587v1

Originality: Incremental advance

AI Analysis

This work addresses the need for more capable probabilistic generative models, particularly for tasks like image generation, though it appears incremental as it builds on existing probabilistic circuits and Transformers.

The paper tackles the problem of improving probabilistic circuits for generative modeling by integrating them with Transformers, resulting in SPAN, which outperforms state-of-the-art models on benchmark datasets and serves as an efficient generative image model.

Probabilistic circuits (PCs) have become the de facto standard for learning and inference in probabilistic modeling. We introduce Sum-Product-Attention Networks (SPAN), a new generative model that integrates probabilistic circuits with Transformers. SPAN uses self-attention to select the most relevant parts of a probabilistic circuit, here a sum-product network, to improve the modeling capability of the underlying network. We show that, while modeling, SPAN focuses on a specific set of independence assumptions in every product layer of the sum-product network. Our empirical evaluations show that SPAN outperforms state-of-the-art probabilistic generative models on various benchmark data sets and is an efficient generative image model.
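To make the terms in the abstract concrete, the following is a minimal sketch of a toy sum-product network, not the SPAN architecture itself. Each product node encodes one independence assumption (a product of independent Bernoulli leaves), and a sum node mixes the product nodes. The softmax-normalized `scores` are a hypothetical stand-in for attention-style weighting; the abstract does not describe how SPAN computes or applies its attention, so all names and numbers here are assumptions for illustration.

```python
import math
from itertools import product


def softmax(scores):
    """Normalize raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bernoulli(p, x):
    """Probability of binary outcome x under a Bernoulli leaf with P(x=1) = p."""
    return p if x == 1 else 1.0 - p


def product_node(leaf_params, x):
    """Product of independent Bernoulli leaves: one independence assumption."""
    result = 1.0
    for p, xi in zip(leaf_params, x):
        result *= bernoulli(p, xi)
    return result


def sum_node(weights, children_params, x):
    """Weighted mixture of product nodes; weights must sum to 1."""
    return sum(w * product_node(params, x)
               for w, params in zip(weights, children_params))


# Three product nodes over two binary variables (hypothetical parameters).
children = [(0.9, 0.8), (0.2, 0.1), (0.5, 0.5)]

# Hypothetical relevance scores; softmax turns them into mixture weights,
# so higher-scoring product nodes contribute more to the sum node.
scores = [2.0, 0.5, -1.0]
weights = softmax(scores)

# The sum node remains a valid distribution: it sums to 1 over all inputs.
total_mass = sum(sum_node(weights, children, x)
                 for x in product([0, 1], repeat=2))
print(weights, total_mass)
```

Because the weights come out of a softmax, the mixture stays normalized regardless of the scores, which is why this kind of reweighting can emphasize some product nodes without breaking the probabilistic semantics of the circuit.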
