LG · CL · Jun 8, 2022

Set Interdependence Transformer: Set-to-Sequence Neural Networks for Permutation Learning and Structure Prediction

arXiv:2206.03720v1 · 4 citations · h-index: 34
Originality: Incremental advance
AI Analysis

This paper addresses set-to-sequence problems in NLP, computer vision, and structure prediction, offering incremental improvements in handling sets of varying cardinality and higher-order interactions between their elements.

The paper tackles the problem of mapping an input set to a permuted sequence of its elements, which is difficult because it requires relational reasoning and involves combinatorial complexity. It proposes the Set Interdependence Transformer to better model higher-order interactions, and reports state-of-the-art performance on tasks including combinatorial optimization, sentence ordering, and product catalog structure prediction.

The task of learning to map an input set onto a permuted sequence of its elements is challenging for neural networks. Set-to-sequence problems occur in natural language processing, computer vision and structure prediction, where interactions between elements of large sets define the optimal output. Models must exhibit relational reasoning, handle varying cardinalities and manage combinatorial complexity. Previous attention-based methods require $n$ layers of their set transformations to explicitly represent $n$-th order relations. Our aim is to enhance their ability to efficiently model higher-order interactions through an additional interdependence component. We propose a novel neural set encoding method called the Set Interdependence Transformer, capable of relating the set's permutation invariant representation to its elements within sets of any cardinality. We combine it with a permutation learning module into a complete, 3-part set-to-sequence model and demonstrate its state-of-the-art performance on a number of tasks. These range from combinatorial optimization problems, through permutation learning challenges on both synthetic and established NLP datasets for sentence ordering, to a novel domain of product catalog structure prediction. Additionally, the network's ability to generalize to unseen sequence lengths is investigated and a comparative empirical analysis of the existing methods' ability to learn higher-order interactions is provided.
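The idea of relating a set's permutation-invariant representation to its individual elements can be illustrated with a small sketch. This is not the paper's architecture: it assumes mean pooling as the invariant summary, a single attention head with identity projections (no learned weights), and the hypothetical helper names `mean_pool`, `attend`, and `encode_with_set_token`. It only shows that appending a pooled set vector as an extra token keeps the element encodings permutation equivariant and the set encoding permutation invariant, for any cardinality.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(elements):
    """Permutation-invariant summary of the set (assumption: plain mean)."""
    n = len(elements)
    return [sum(col) / n for col in zip(*elements)]


def attend(queries, keys_values):
    """Single-head scaled dot-product attention with identity projections."""
    d = len(keys_values[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys_values])
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out


def encode_with_set_token(elements):
    """Append the pooled set vector as an extra token, then self-attend.

    Each element attends to every other element and to the set summary
    within one layer. Returns (element encodings, set encoding).
    """
    set_vec = mean_pool(elements)
    augmented = elements + [set_vec]
    encoded = attend(augmented, augmented)
    return encoded[:-1], encoded[-1]


if __name__ == "__main__":
    random.seed(0)
    elements = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
    perm = [2, 0, 4, 1, 3]
    shuffled = [elements[i] for i in perm]

    enc, set_enc = encode_with_set_token(elements)
    enc_p, set_enc_p = encode_with_set_token(shuffled)

    # Reordering the input should reorder the element encodings the same way.
    equivariant = all(
        math.isclose(a, b, abs_tol=1e-9)
        for k, i in enumerate(perm)
        for a, b in zip(enc_p[k], enc[i])
    )
    # The set encoding should not depend on the input order.
    invariant = all(math.isclose(a, b, abs_tol=1e-9)
                    for a, b in zip(set_enc, set_enc_p))
    print(equivariant, invariant)
```

A learned model would replace the identity projections with trained weight matrices and follow the encoder with a permutation learning module that outputs the ordering; both are omitted here to keep the sketch short.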

