Deviation bound for non-causal machine learning
This work addresses a foundational limitation in the analysis of machine learning algorithms on non-causal data, such as text in NLP. The contribution appears incremental, however, since it extends existing concentration inequality methods to a new modeling framework.
The paper tackles the problem that existing concentration inequalities cannot be applied to popular deep neural networks in natural language processing because of non-causal data dependencies. It provides a framework for modeling non-causal random fields and derives a Hoeffding-type concentration inequality for this framework.
Concentration inequalities are widely used for analyzing machine learning algorithms. However, current concentration inequalities cannot be applied to some of the most popular deep neural networks, notably in natural language processing. This is mostly due to the non-causal nature of the data involved, in the sense that each data point depends on its neighboring data points. In this paper, a framework for modeling non-causal random fields is provided and a Hoeffding-type concentration inequality is obtained for this framework. The proof of this result relies on a local approximation of the non-causal random field by a function of a finite number of i.i.d. random variables.
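To make the idea of "a function of a finite number of i.i.d. random variables" more concrete, here is a minimal Python sketch of a toy non-causal field. It is an illustration only, not the framework or the inequality from the paper. The assumptions are all mine: each X_t is the average of i.i.d. Uniform[0, 1] innovations in a symmetric window of half-width k, so it depends on both earlier and later neighbors. The names simulate_field, mcdiarmid_bound and empirical_tail are hypothetical. For this toy model the sample mean is a function of n + 2k independent variables, each of which changes it by at most 1/n, so the classical bounded-differences (McDiarmid) inequality already gives a Hoeffding-type tail bound.

```python
import math
import random


def simulate_field(n, k, rng):
    """Toy non-causal field: X_t averages i.i.d. Uniform[0, 1] innovations
    over a symmetric window of half-width k around position t, so X_t
    depends on neighbors on both sides (assumed model, for illustration)."""
    eps = [rng.random() for _ in range(n + 2 * k)]
    width = 2 * k + 1
    return [sum(eps[t:t + width]) / width for t in range(n)]


def mcdiarmid_bound(n, k, t):
    """Two-sided bounded-differences bound for the sample mean of the toy
    field. Each of the n + 2k innovations moves the mean by at most 1/n,
    so the sum of squared sensitivities is at most (n + 2k) / n**2."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * n * t * t / (n + 2 * k)))


def empirical_tail(n, k, t, trials, seed=0):
    """Monte Carlo estimate of P(|mean(X) - 1/2| >= t) for the toy field."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = simulate_field(n, k, rng)
        if abs(sum(x) / n - 0.5) >= t:
            hits += 1
    return hits / trials
```

Comparing empirical_tail(n, k, t, trials) with mcdiarmid_bound(n, k, t) for a few values of t gives a quick sanity check that the observed deviation frequency stays below the bound. With k = 0 the field reduces to i.i.d. samples and the bound coincides with the classical Hoeffding bound 2 exp(-2 n t^2).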