LG, DB, Mar 14, 2024

Iterative Forgetting: Online Data Stream Regression Using Database-Inspired Adaptive Granulation

arXiv:2403.09588v1, 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient data stream regression in real-time systems such as finance and telecommunications, though the contribution is incremental because it builds on existing R*-tree and granulation ideas.

The paper tackles low-latency prediction for time-sensitive systems that face continuous data streams and concept drift. The resulting method reports an order-of-magnitude improvement in latency and training time while remaining competitively accurate.

Many modern systems, such as financial, transportation, and telecommunications systems, are time-sensitive in the sense that they demand low-latency predictions for real-time decision-making. Such systems often have to contend with continuous unbounded data streams as well as concept drift, which are challenging requirements that traditional regression techniques are unable to meet. There is a need for novel data stream regression methods that can handle these scenarios. We present a database-inspired data stream regression model that (a) draws inspiration from R*-trees to create granules from incoming data streams such that relevant information is retained, (b) iteratively forgets granules whose information is deemed to be outdated, thus maintaining a list of only recent, relevant granules, and (c) uses the recent data and granules to provide low-latency predictions. The R*-tree-inspired approach also makes the algorithm amenable to integration with database systems. Our experiments demonstrate that the ability of this method to discard data produces a significant order-of-magnitude improvement in latency and training time when evaluated against the most accurate state-of-the-art algorithms, while the R*-tree-inspired granulation technique provides competitively accurate predictions.
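The three steps in the abstract (granulate, forget, predict) can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes a flat list of bounding-box granules rather than a hierarchical R*-tree, a purely age-based forgetting rule, and nearest-granule prediction. The class name ForgettingGranuleRegressor and the parameters radius and max_age are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class Granule:
    """Summary of nearby points: a bounding box plus running target statistics."""
    lo: list        # per-feature lower bounds of the bounding box
    hi: list        # per-feature upper bounds of the bounding box
    y_sum: float    # sum of the targets absorbed so far
    count: int      # number of points absorbed
    last_seen: int  # time step of the most recent update

    def center(self):
        return [(a + b) / 2 for a, b in zip(self.lo, self.hi)]

    def mean_y(self):
        return self.y_sum / self.count


class ForgettingGranuleRegressor:
    """Illustrative only: flat granule list, age-based forgetting."""

    def __init__(self, radius=1.0, max_age=100):
        self.radius = radius    # assumed: max distance at which a point joins a granule
        self.max_age = max_age  # assumed: steps without updates before forgetting
        self.granules = []
        self.t = 0

    def _nearest(self, x):
        # Linear scan; a real spatial index would avoid visiting every granule.
        best, best_d = None, math.inf
        for g in self.granules:
            d = math.dist(x, g.center())
            if d < best_d:
                best, best_d = g, d
        return best, best_d

    def learn_one(self, x, y):
        self.t += 1
        g, d = self._nearest(x)
        if g is not None and d <= self.radius:
            # (a) Granulate: grow the box and fold the target into the summary.
            g.lo = [min(a, b) for a, b in zip(g.lo, x)]
            g.hi = [max(a, b) for a, b in zip(g.hi, x)]
            g.y_sum += y
            g.count += 1
            g.last_seen = self.t
        else:
            self.granules.append(Granule(list(x), list(x), y, 1, self.t))
        # (b) Forget: drop granules that have not been updated recently.
        self.granules = [
            g for g in self.granules if self.t - g.last_seen <= self.max_age
        ]

    def predict_one(self, x):
        # (c) Predict: mean target of the nearest surviving granule.
        g, _ = self._nearest(x)
        return g.mean_y() if g is not None else 0.0


model = ForgettingGranuleRegressor(radius=1.0, max_age=3)
model.learn_one([0.0, 0.0], 1.0)
model.learn_one([0.5, 0.0], 3.0)
print(model.predict_one([0.1, 0.0]))  # mean of the two absorbed targets
```

Because each granule stores only a box and two running statistics, memory and prediction cost in this sketch depend on the number of live granules rather than on the length of the stream, which is the intuition behind the latency claim. How the paper decides that a granule is outdated, and how it splits or merges granules, is not specified in the abstract and is not reproduced here.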

