LG · STAT-MECH · IT · CD · ML · Oct 17, 2019

Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited

arXiv:1910.07663v1 · 4 citations
Originality: Synthesis-oriented
AI Analysis

The paper reveals a surprising predictive gap when widely used neural network models are applied to simple stimuli, highlighting the need for alternative methods such as causal-state inference. The contribution is incremental but important for the machine learning community.

The study tested generalized linear models, reservoir computers, and LSTM RNNs on predicting stochastic processes generated by probabilistic deterministic finite-state automata. It found that these methods can fall short of maximal predictive accuracy by as much as 50% after training and by about 5% when optimized, even though previously available methods achieve maximal predictive accuracy with orders of magnitude less data.

Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (PDFA). PDFAs provide an excellent performance benchmark in that they can be systematically enumerated, the randomness and correlation structure of their generated processes are exactly known, and their optimal memory-limited predictors are easily computed. Unsurprisingly, LSTMs outperform RCs, which outperform generalized linear models. Surprisingly, each of these methods can fall short of the maximal predictive accuracy by as much as 50% after training and, when optimized, tend to fall short of the maximal predictive accuracy by ~5%, even though previously available methods achieve maximal predictive accuracy with orders-of-magnitude less data. Thus, despite the representational universality of RCs and RNNs, using them can engender a surprising predictive gap for simple stimuli. One concludes that there is an important and underappreciated role for methods that infer "causal states" or "predictive state representations".
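To make the benchmark idea concrete, here is a minimal sketch of how a PDFA generates a process and how an accuracy ceiling can be computed from it. The two-state machine, its probabilities, and all function names are illustrative assumptions, not taken from the paper. The sketch also assumes one natural reading of "maximal predictive accuracy": the rate at which a predictor that knows the current hidden state correctly guesses the next symbol.

```python
import random

# Hypothetical two-state PDFA, chosen only for illustration.
# PDFA[state][symbol] = (emission probability, next state).
# "Deterministic" means the (state, symbol) pair fixes the next state.
PDFA = {
    "A": {0: (0.7, "A"), 1: (0.3, "B")},
    "B": {1: (1.0, "A")},
}

def stationary_distribution(pdfa, iterations=1000):
    """Approximate the stationary state distribution by power iteration."""
    states = list(pdfa)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s in states:
            for prob, target in pdfa[s].values():
                nxt[target] += pi[s] * prob
        pi = nxt
    return pi

def max_predictive_accuracy(pdfa):
    """Accuracy of always guessing the most likely symbol in each state."""
    pi = stationary_distribution(pdfa)
    return sum(pi[s] * max(p for p, _ in pdfa[s].values()) for s in pdfa)

def simulate_accuracy(pdfa, n_steps=200_000, seed=0):
    """Run the PDFA and score a predictor that tracks the hidden state."""
    rng = random.Random(seed)
    state = next(iter(pdfa))
    correct = 0
    for _ in range(n_steps):
        symbols = list(pdfa[state])
        weights = [pdfa[state][x][0] for x in symbols]
        guess = symbols[weights.index(max(weights))]  # most likely symbol
        emitted = rng.choices(symbols, weights=weights)[0]
        correct += emitted == guess
        state = pdfa[state][emitted][1]  # deterministic transition
    return correct / n_steps

if __name__ == "__main__":
    print("analytic ceiling:", max_predictive_accuracy(PDFA))
    print("simulated accuracy:", simulate_accuracy(PDFA))
```

For this toy machine the analytic ceiling works out to 10/13, and the simulated state-aware predictor should land close to it. A trained model's next-symbol accuracy on the same sequence could then be compared against that ceiling, which is the kind of gap the abstract describes.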

