On equivalence between linear-chain conditional random fields and hidden Markov chains
This paper addresses a common misconception in machine learning by showing that linear-chain CRFs and HMCs, standard examples of discriminative and generative models respectively, are equivalent. The contribution is incremental but important for theoretical understanding.
The paper demonstrates that linear-chain conditional random fields (CRFs) and hidden Markov chains (HMCs) are equivalent: for each CRF, an HMC with the same posterior distribution can be explicitly constructed. The two are therefore not fundamentally different models, only differently parametrized ones.
Practitioners have successfully used hidden Markov chains (HMCs) in a variety of problems for about sixty years. HMCs belong to the family of generative models, and they are often compared to discriminative models such as conditional random fields (CRFs). Authors usually consider CRFs to be quite different from HMCs, and CRFs are often presented as an interesting alternative to HMCs. In some areas, such as natural language processing (NLP), discriminative models have completely supplanted generative models. However, some recent results show that the two families of models are not so different and that both can achieve identical processing power. In this paper we compare simple linear-chain CRFs to basic HMCs. We show that HMCs are equivalent to linear-chain CRFs in the sense that, for each CRF, we explicitly construct an HMC having the same posterior distribution. Therefore, HMCs and linear-chain CRFs are not different models but differently parametrized ones.
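The claim that a CRF and an HMC can share the same posterior can be checked numerically on a toy example. The following is a minimal sketch and not necessarily the construction used in the paper. It assumes a finite observation alphabet and a CRF of the form p(x | y) ∝ exp(Σ_t V_t(x_t, x_{t+1}) + Σ_t U_t(x_t, y_t)). The names `V`, `U`, `crf_to_hmc`, `crf_posterior` and `hmc_posterior` are hypothetical, introduced only for illustration. The sketch normalizes the CRF potentials jointly over (x, y) with a backward recursion, which yields an HMC (time-inhomogeneous in general), and then compares both posteriors by brute-force enumeration.

```python
import itertools
import numpy as np


def crf_posterior(V, U, y):
    """Brute-force p(x | y) for a linear-chain CRF with log-potentials
    V[t][i, j] (transition) and U[t][i, k] (state/observation)."""
    T, K = len(U), U[0].shape[0]
    paths = itertools.product(range(K), repeat=T)
    w = np.array([
        np.exp(sum(U[t][x[t], y[t]] for t in range(T))
               + sum(V[t][x[t], x[t + 1]] for t in range(T - 1)))
        for x in paths
    ])
    return w / w.sum()


def crf_to_hmc(V, U):
    """Assumed construction: return (pi, A, B) of an HMC whose joint p(x, y)
    is proportional to exp(sum_t V_t + sum_t U_t)."""
    T = len(U)
    # e[t][i] = sum over observations k of exp U_t(i, k)
    e = [np.exp(u).sum(axis=1) for u in U]
    # Emission laws p(y_t = k | x_t = i)
    B = [np.exp(u) / e_t[:, None] for u, e_t in zip(U, e)]
    beta = [None] * T
    beta[T - 1] = e[T - 1]
    A = [None] * (T - 1)
    for t in range(T - 2, -1, -1):
        # M[i, j] = exp V_t(i, j) * beta_{t+1}(j)
        M = np.exp(V[t]) * beta[t + 1][None, :]
        row = M.sum(axis=1)
        A[t] = M / row[:, None]      # transition law p(x_{t+1} = j | x_t = i)
        beta[t] = e[t] * row
    pi = beta[0] / beta[0].sum()     # initial law p(x_1)
    return pi, A, B


def hmc_posterior(pi, A, B, y):
    """Brute-force p(x | y) for the HMC (pi, A, B)."""
    T, K = len(B), len(pi)
    paths = itertools.product(range(K), repeat=T)
    w = np.array([
        pi[x[0]] * B[0][x[0], y[0]]
        * np.prod([A[t][x[t], x[t + 1]] * B[t + 1][x[t + 1], y[t + 1]]
                   for t in range(T - 1)])
        for x in paths
    ])
    return w / w.sum()


# Toy check with random potentials (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
T, K, M = 4, 3, 2
V = [rng.normal(size=(K, K)) for _ in range(T - 1)]
U = [rng.normal(size=(K, M)) for _ in range(T)]
pi, A, B = crf_to_hmc(V, U)
y = (0, 1, 1, 0)
gap = np.abs(crf_posterior(V, U, y) - hmc_posterior(pi, A, B, y)).max()
print(gap)  # largest difference between the two posteriors
```

Under these assumptions, the two posteriors should agree up to floating-point rounding for every observation sequence, because the HMC joint distribution is the CRF's unnormalized score normalized over both x and y. The brute-force enumeration is only practical for very short chains and is used here purely as a check.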