Correlated daily time series and forecasting in the M4 competition
This work addresses forecasting accuracy in time series competitions. Its contribution is incremental: it highlights issues with competition design rather than proposing a novel forecasting method.
The authors forecast daily time series in the M4 competition using an ensemble of five statistical methods and a 'correlator' method. They found the correlator responsible for most of their gains over a naive constant baseline, and they identified data leakage as one reason for its success.
We participated in the M4 competition for time series forecasting and describe here our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis, using the ground truth values published by the M4 organisers after the competition, demonstrates that the correlator was responsible for most of our gains over the naive constant forecasting method. We identify data leakage as one reason for its success, partly due to test data being selected from different time intervals, and partly due to quality issues in the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that participants could avoid some of this leakage.
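To make the leakage mechanism concrete, the following is a minimal sketch, not the authors' correlator, whose details are not given here. It assumes that leakage shows up as a second series whose values overlap in time with the tail of the target series, so that the values following the overlap reveal the target's future. The function names, the Pearson-correlation matching, the `min_corr` threshold and the last-value rescaling are all illustrative assumptions.

```python
from math import sqrt


def naive_forecast(history, horizon):
    """Naive constant forecast: repeat the last observed value."""
    return [history[-1]] * horizon


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_overlap(target, candidate, window, horizon):
    """Return (start, r) for the candidate window that best matches the target's tail.

    Only windows followed by at least `horizon` further values are considered,
    because those following values are what would be copied as the forecast.
    """
    tail = target[-window:]
    best_start, best_r = None, -1.0
    for start in range(len(candidate) - window - horizon + 1):
        r = pearson(tail, candidate[start:start + window])
        if r > best_r:
            best_start, best_r = start, r
    return best_start, best_r


def overlap_forecast(target, candidate, window, horizon, min_corr=0.99):
    """Hypothetical leakage-exploiting forecast (illustration only).

    If some window of `candidate` correlates strongly with the tail of `target`,
    copy the values that follow it, rescaled so the two series agree at the end
    of the window. Otherwise fall back to the naive constant forecast.
    """
    if len(target) < window:
        return naive_forecast(target, horizon)
    start, r = best_overlap(target, candidate, window, horizon)
    if start is None or r < min_corr:
        return naive_forecast(target, horizon)
    segment = candidate[start:start + window]
    # Assumption: the two series differ only by a multiplicative factor.
    scale = target[-1] / segment[-1] if segment[-1] != 0 else 1.0
    future = candidate[start + window:start + window + horizon]
    return [scale * v for v in future]
```

Calling `overlap_forecast(target, candidate, window=8, horizon=5)` returns the rescaled continuation of `candidate` when a strongly correlated overlap exists, and the naive forecast otherwise. If actual dates were published with each series, such overlaps would be directly visible, which is the point of the recommendation above.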