LG | Jun 8, 2022

Hidden Markov Models with Momentum

Andrew Miller, Fabio Di Troia, Mark Stamp

arXiv:2206.04057v1 | 1.8 | h-index: 36

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for practitioners who use HMMs in applications such as text and malware analysis: momentum reduces the number of iterations needed for initial convergence, but it does not improve the final accuracy of the trained model.

The researchers tackled the problem of slow convergence in training Hidden Markov Models by adding momentum to the Baum-Welch algorithm, finding that it reduces the number of iterations needed for initial convergence but does not improve final model performance after many iterations.

Momentum is a popular technique for improving convergence rates during gradient descent. In this research, we experiment with adding momentum to the Baum-Welch expectation-maximization algorithm for training Hidden Markov Models. We compare discrete Hidden Markov Models trained with and without momentum on English text and malware opcode data. The effectiveness of momentum is determined by measuring the changes in model score and classification accuracy due to momentum. Our extensive experiments indicate that adding momentum to Baum-Welch can reduce the number of iterations required for initial convergence during HMM training, particularly in cases where the model is slow to converge. However, momentum does not seem to improve the final model performance at a high number of iterations.
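The abstract does not state the exact form of the momentum update, so the following is only a minimal sketch under stated assumptions, not the authors' method. It runs standard Baum-Welch on a discrete HMM and treats the difference between the re-estimated and current parameters as a "step", to which a heavy-ball style velocity is added. The names `train`, `blend`, and `mu`, the clip-and-renormalize step, and the choice to leave the initial distribution without momentum are all hypothetical. With `mu=0` the sketch reduces to plain Baum-Welch.

```python
import math
import random

def normalize(row, floor=1e-12):
    # Clip to a small positive floor, then rescale so the row sums to 1.
    row = [max(x, floor) for x in row]
    total = sum(row)
    return [x / total for x in row]

def e_step(A, B, pi, obs):
    """Scaled forward-backward pass: returns log P(obs), gamma, summed xi."""
    N, T = len(A), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[j] * B[j][obs[0]] for j in range(N)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                   for j in range(N)]
        c = sum(row)
        scale.append(c)
        alpha.append([a / c for a in row])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(N)) / scale[t + 1]
    gamma = [[alpha[t][i] * beta[t][i] for i in range(N)] for t in range(T)]
    xi_sum = [[sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                   / scale[t + 1] for t in range(T - 1))
               for j in range(N)] for i in range(N)]
    return sum(math.log(c) for c in scale), gamma, xi_sum

def m_step(gamma, xi_sum, obs, M):
    """Standard Baum-Welch re-estimates of A, B, and pi."""
    N, T = len(xi_sum), len(obs)
    A = [normalize(xi_sum[i]) for i in range(N)]
    B = [normalize([sum(gamma[t][j] for t in range(T) if obs[t] == k)
                    for k in range(M)]) for j in range(N)]
    return A, B, normalize(gamma[0])

def blend(old, new, vel, mu):
    # Assumed momentum rule: velocity = mu * previous velocity + plain EM step.
    vel = [[mu * v + (n - o) for v, n, o in zip(vr, nr, orow)]
           for vr, nr, orow in zip(vel, new, old)]
    # Clip and renormalize so every row stays a valid probability distribution.
    updated = [normalize([o + v for o, v in zip(orow, vr)])
               for orow, vr in zip(old, vel)]
    return updated, vel

def train(obs, N, M, iters=20, mu=0.0, seed=0):
    """Train an N-state, M-symbol HMM; returns A, B, pi, and log-likelihoods."""
    rng = random.Random(seed)
    def rand_rows(r, c):
        return [normalize([rng.random() + 0.5 for _ in range(c)]) for _ in range(r)]
    A, B, pi = rand_rows(N, N), rand_rows(N, M), rand_rows(1, N)[0]
    vA = [[0.0] * N for _ in range(N)]
    vB = [[0.0] * M for _ in range(N)]
    history = []
    for _ in range(iters):
        loglik, gamma, xi_sum = e_step(A, B, pi, obs)
        history.append(loglik)
        A_new, B_new, pi = m_step(gamma, xi_sum, obs, M)  # pi: no momentum here
        A, vA = blend(A, A_new, vA, mu)
        B, vB = blend(B, B_new, vB, mu)
    return A, B, pi, history

# Toy usage: compare log-likelihood curves with and without momentum.
obs = [0, 1, 0, 1, 1, 2, 2, 2, 0, 1] * 5
_, _, _, plain = train(obs, N=2, M=3, mu=0.0)
_, _, _, with_momentum = train(obs, N=2, M=3, mu=0.3)
```

Comparing `plain` and `with_momentum` iteration by iteration is the kind of check the abstract describes: how quickly the model score rises early on versus where it ends up after many iterations. Note that once `mu` is nonzero, the update is no longer a pure EM step, so the monotonic increase in likelihood that plain Baum-Welch guarantees need not hold in this sketch.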

