A Generative Model for Multi-Dialect Representation
This work addresses the challenge of efficiently representing real-life handwritten multi-dialect language data, offering an incremental improvement over existing methods.
The paper tackles the problem of representing unlabeled handwritten multi-dialect data by proposing the Mode Synthesizing Machine (MSM), a generative model that achieves much lower error values than Restricted Boltzmann Machines (RBM) on both independent and mixed datasets.
In the era of deep learning, several unsupervised models have been developed to capture the key features of unlabeled handwritten data. Popular among them is the Restricted Boltzmann Machine (RBM). However, due to the novelty in handwritten multi-dialect data, the RBM may fail to generate an efficient representation. In this paper, we propose a generative model, the Mode Synthesizing Machine (MSM), for on-line representation of real-life handwritten multi-dialect language data. The MSM takes advantage of the hierarchical representation of the modes of a data distribution, using a two-point error update to learn a sequence of representative multi-dialects in a generative way. Experiments were performed to compare the MSM with the RBM; the MSM attained much lower error values than the RBM on both independent and mixed datasets.