Learning a Hybrid Architecture for Sequence Regression and Annotation
This work addresses a domain-specific problem that arises in fields such as biology, in tasks like motif discovery and genome annotation, and presents an incremental improvement that extends existing HMM frameworks.
The paper tackles the problem of jointly modeling latent sequence features and the functional mapping between hidden states and continuous response variables in hidden Markov models. It shows that incorporating additional continuous responses improves both sequence annotation and prediction performance on synthetic and real-world datasets.
When learning a hidden Markov model (HMM), sequential observations can often be complemented by real-valued summary response variables generated from the path of hidden states. Such settings arise in numerous domains, including many applications in biology, like motif discovery and genome annotation. In this paper, we present a flexible framework for jointly modeling both latent sequence features and the functional mapping that relates the summary response variables to the hidden state sequence. The algorithm is compatible with a rich set of mapping functions. Results show that the availability of additional continuous response variables can simultaneously improve the annotation of the sequential observations and yield good prediction performance in both synthetic data and real-world datasets.
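To make the setting concrete, the following is a minimal generative sketch in Python, not the paper's algorithm. It samples a hidden state path and discrete observations from a small HMM, then draws one real-valued summary response from the whole path. The mapping used here (a linear function of state occupancy counts plus Gaussian noise) is an assumption chosen for illustration; the abstract only says that a rich set of mapping functions is supported. All names and parameter values (`sample_hmm_with_response`, `INIT`, `TRANS`, `EMIT`, `WEIGHTS`) are hypothetical.

```python
import random

# Hypothetical two-state HMM over a 4-letter alphabet (illustrative values only).
INIT = [0.5, 0.5]                      # initial state distribution
TRANS = [[0.9, 0.1], [0.2, 0.8]]       # state transition probabilities
EMIT = [[0.25, 0.25, 0.25, 0.25],      # state 0: uniform "background"
        [0.70, 0.10, 0.10, 0.10]]      # state 1: enriched for symbol 0
WEIGHTS = [0.0, 1.5]                   # assumed linear mapping weights per state


def sample_hmm_with_response(T, init, trans, emit, weights, noise_sd=0.1, seed=0):
    """Sample a hidden path, observations, and one summary response.

    The response is an ASSUMED mapping: a weighted sum of how often each
    hidden state was visited, plus Gaussian noise.
    """
    rng = random.Random(seed)
    n_states = len(init)
    states, obs = [], []
    s = rng.choices(range(n_states), weights=init)[0]
    for _ in range(T):
        states.append(s)
        # Emit one symbol from the current state's emission distribution.
        obs.append(rng.choices(range(len(emit[s])), weights=emit[s])[0])
        # Move to the next hidden state.
        s = rng.choices(range(n_states), weights=trans[s])[0]
    # Summarize the hidden path by state occupancy counts.
    counts = [states.count(k) for k in range(n_states)]
    y = sum(w * c for w, c in zip(weights, counts)) + rng.gauss(0.0, noise_sd)
    return states, obs, y


states, obs, y = sample_hmm_with_response(50, INIT, TRANS, EMIT, WEIGHTS)
print("observations:", obs)
print("summary response:", round(y, 3))
```

In this sketch the response depends only on the hidden path, never directly on the observations, which is why observing it can carry extra information about the annotation during learning.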