LGDec 18, 2014

Dynamic Structure Embedded Online Multiple-Output Regression for Stream Data

arXiv:1412.5732v29 citations
Originality Incremental advance
AI Analysis

This is an incremental improvement for applications requiring fast, memory-efficient modeling of multi-dimensional correlated data streams.

The paper tackles the problem of online multiple-output regression for stream data by proposing MORES, a method that dynamically learns coefficient and residual error structures to improve prediction accuracy, achieving an update speed of at least 2000 samples per second, which is over 15 times faster than the state-of-the-art.

Online multiple-output regression is an important machine learning technique for modeling, predicting, and compressing multi-dimensional correlated data streams. In this paper, we propose a novel online multiple-output regression method, called MORES, for stream data. MORES can \emph{dynamically} learn the structure of the coefficients change in each update step to facilitate the model's continuous refinement. We observe that limited expressive ability of the regression model, especially in the preliminary stage of online update, often leads to the variables in the residual errors being dependent. In light of this point, MORES intends to \emph{dynamically} learn and leverage the structure of the residual errors to improve the prediction accuracy. Moreover, we define three statistical variables to \emph{exactly} represent all the seen samples for \emph{incrementally} calculating prediction loss in each online update round, which can avoid loading all the training data into memory for updating model, and also effectively prevent drastic fluctuation of the model in the presence of noise. Furthermore, we introduce a forgetting factor to set different weights on samples so as to track the data streams' evolving characteristics quickly from the latest samples. Experiments on one synthetic dataset and three real-world datasets validate the effectiveness of the proposed method. In addition, the update speed of MORES is at least 2000 samples processed per second on the three real-world datasets, more than 15 times faster than the state-of-the-art online learning algorithm.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes