LG, CL · Dec 21, 2023

Structure-Aware Path Inference for Neural Finite State Transducers

arXiv:2312.13614v1 · h-index: 6
Originality: Synthesis-oriented
AI Analysis

This work addresses a specific inference problem in neurosymbolic models, of interest mainly to researchers in computational linguistics. Its contribution is incremental, since its novel structure-aware model underperforms the simpler approaches it is compared against.

The paper tackles the challenge of imputing latent alignment paths in neural finite-state transducers (NFSTs) for sequence transduction. It finds that simpler autoregressive models outperform a more sophisticated structure-aware approach, except on an artificial task designed to confuse the simpler models.

Neural finite-state transducers (NFSTs) form an expressive family of neurosymbolic sequence transduction models. An NFST models each string pair as having been generated by a latent path in a finite-state transducer. As they are deep generative models, both training and inference of NFSTs require inference networks that approximate posterior distributions over such latent variables. In this paper, we focus on the resulting challenge of imputing the latent alignment path that explains a given pair of input and output strings (e.g., during training). We train three autoregressive approximate models for amortized inference of the path, which can then be used as proposal distributions for importance sampling. All three models perform lookahead. Our most sophisticated (and novel) model leverages the FST structure to consider the graph of future paths; unfortunately, we find that it loses out to the simpler approaches -- except on an artificial task that we concocted to confuse the simpler approaches.
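To make the role of a proposal distribution concrete, here is a minimal sketch of importance sampling over latent alignment paths. It is not the paper's method: it assumes a toy one-state edit transducer with hand-picked arc weights (`arc_weight` is hypothetical), and a uniform autoregressive proposal stands in for the learned lookahead models. Because the toy lattice is small, the exact sum over paths can be computed by dynamic programming and compared with the sampled estimate.

```python
import random

# Hypothetical arc weights for a one-state edit transducer (illustration only,
# not taken from the paper). A path's weight is the product of its arc weights.
def arc_weight(op, a, b):
    if op == "sub":
        return 0.6 if a == b else 0.1
    return 0.15  # "ins" or "del"

def valid_ops(i, j, n, m):
    """Operations available at lattice position (i, j) for strings of length n, m."""
    ops = []
    if i < n and j < m:
        ops.append("sub")
    if i < n:
        ops.append("del")
    if j < m:
        ops.append("ins")
    return ops

def exact_marginal(x, y):
    """Sum of path weights over all alignments of x and y (forward algorithm)."""
    n, m = len(x), len(y)
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f[i][j] += f[i - 1][j - 1] * arc_weight("sub", x[i - 1], y[j - 1])
            if i > 0:
                f[i][j] += f[i - 1][j] * arc_weight("del", x[i - 1], None)
            if j > 0:
                f[i][j] += f[i][j - 1] * arc_weight("ins", None, y[j - 1])
    return f[n][m]

def sample_path(x, y, rng):
    """Draw one path from a uniform autoregressive proposal q.

    Returns (path weight under the model, probability of the path under q).
    """
    n, m = len(x), len(y)
    i = j = 0
    p, q = 1.0, 1.0
    while (i, j) != (n, m):
        ops = valid_ops(i, j, n, m)
        op = rng.choice(ops)
        q *= 1.0 / len(ops)
        if op == "sub":
            p *= arc_weight("sub", x[i], y[j])
            i, j = i + 1, j + 1
        elif op == "del":
            p *= arc_weight("del", x[i], None)
            i += 1
        else:
            p *= arc_weight("ins", None, y[j])
            j += 1
    return p, q

def importance_estimate(x, y, num_samples=20000, seed=0):
    """Average of the importance weights p(z)/q(z) over paths z sampled from q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        p, q = sample_path(x, y, rng)
        total += p / q
    return total / num_samples

if __name__ == "__main__":
    print("exact:   ", exact_marginal("ab", "ab"))
    print("estimate:", importance_estimate("ab", "ab"))
```

The estimate is unbiased as long as the proposal gives nonzero probability to every path the model can generate, which the uniform proposal does here. A proposal that concentrates on high-weight paths would lower the variance of the weights, which is the motivation for training inference networks as proposals.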
