ML AI LG SY

Oct 6, 2018

Discretizing Logged Interaction Data Biases Learning for Decision-Making

arXiv:1810.03025v1

1.91 citations

Originality Synthesis-oriented

AI Analysis

The paper addresses a preprocessing issue relevant to researchers and practitioners in fields such as customer analytics. Its contribution is incremental, since it focuses on one specific bias in existing methods.

The paper examines the bias introduced when irregularly measured time series data are discretized before training decision-making models, and shows that continuous-time models avoid this discretization bias.

Time series data that are not measured at regular intervals are commonly discretized as a preprocessing step. For example, data about customer arrival times might be simplified by summing the number of arrivals within hourly intervals, which produces a discrete-time time series that is easier to model. In this abstract, we show that discretization introduces a bias that affects models trained for decision-making. We refer to this phenomenon as discretization bias, and show that we can avoid it by using continuous-time models instead.
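The hourly-summing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `hourly_counts` and the two arrival logs are hypothetical, and times are assumed to be given in hours since the start of the observation window. It only shows that distinct arrival patterns can collapse to the same discrete-time series; it does not reproduce the paper's analysis of how this affects decision-making models.

```python
import math
from collections import Counter


def hourly_counts(arrival_times, n_hours):
    """Sum arrivals within hourly intervals [k, k + 1) for k = 0 .. n_hours - 1.

    arrival_times: iterable of non-negative floats, in hours (assumed unit).
    Arrivals outside [0, n_hours) are ignored.
    """
    counts = Counter(math.floor(t) for t in arrival_times)
    return [counts.get(k, 0) for k in range(n_hours)]


# Two hypothetical arrival logs with different timing inside each hour.
log_a = [0.05, 0.10, 0.95, 2.50]
log_b = [0.40, 0.50, 0.60, 2.01]

# After discretization the two logs are indistinguishable.
print(hourly_counts(log_a, 3))  # [3, 0, 1]
print(hourly_counts(log_b, 3))  # [3, 0, 1]
```

Once the logs are binned, the within-hour timing is no longer available to any model trained on the counts, which is the information a continuous-time model would retain.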

