The Age of Correlated Features in Supervised Learning based Forecasting
This work studies how feature freshness affects forecasting accuracy in time-series applications such as solar power prediction, and offers incremental improvements to existing methods.
The paper analyzes how the freshness (age) of correlated features affects supervised learning-based forecasting. It proves that the minimum training loss is a function of the feature ages and that this function is not always monotonic, but is approximately non-decreasing in age when the empirical distribution of the training data is close to that of a Markov chain. Experiments on solar power prediction support the theory and show benefits from combining training data with different ages and from including the age as an input feature.
In this paper, we analyze the impact of information freshness on supervised learning-based forecasting. In such applications, a neural network is trained to predict a time-varying target (e.g., solar power) based on multiple correlated features (e.g., temperature, humidity, and cloud coverage). The features are collected from different data sources and are subject to heterogeneous and time-varying ages. Using an information-theoretic approach, we prove that the minimum training loss is a function of the ages of the features, and that this function is not always monotonic. However, if the empirical distribution of the training data is close to the distribution of a Markov chain, then the training loss is approximately a non-decreasing function of the age. Both the training loss and the testing loss exhibit similar growth patterns as the age increases. An experiment on solar power prediction is conducted to validate our theory. Our theoretical and experimental results suggest that it is beneficial to (i) combine the training data with different age values into a large training dataset and jointly train the forecasting decisions for these age values, and (ii) feed the age value to the neural network as part of the input features.
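Recommendations (i) and (ii) can be illustrated with a minimal data-preparation sketch. It is not the authors' implementation: the function name `build_pooled_dataset`, the toy feature values, and the chosen age vectors are all hypothetical, and the abstract does not specify how the training set is actually assembled. The sketch assumes discrete time steps and one integer age per feature. It pools samples across several age vectors and appends the ages to each input.

```python
from typing import List, Sequence, Tuple


def build_pooled_dataset(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    age_vectors: Sequence[Sequence[int]],
) -> Tuple[List[List[float]], List[float]]:
    """Pool training samples over several age vectors (hypothetical helper).

    features[t][i] is the value of feature i observed at time t, and
    target[t] is the quantity to forecast at time t. For each age vector
    d = (d_1, ..., d_m) and each time t >= max(d), one sample is emitted:
      input = [features[t - d_i][i] for each i] + [d_1, ..., d_m]
      label = target[t]
    """
    inputs: List[List[float]] = []
    labels: List[float] = []
    for ages in age_vectors:
        if any(d < 0 for d in ages):
            raise ValueError("ages must be non-negative")
        # Start at max(ages) so every stale index t - d_i is valid.
        for t in range(max(ages), len(target)):
            # Feature i is stale by ages[i] time steps.
            stale = [features[t - d][i] for i, d in enumerate(ages)]
            # Recommendation (ii): the ages are part of the input.
            inputs.append(stale + [float(d) for d in ages])
            labels.append(target[t])
    return inputs, labels


# Toy data with made-up values: two features per time step.
features = [[20.0, 0.1], [21.0, 0.2], [22.0, 0.4], [23.0, 0.3]]
target = [5.0, 6.0, 4.0, 7.0]

# Recommendation (i): one pooled dataset covering several age vectors.
X, y = build_pooled_dataset(features, target, age_vectors=[(0, 0), (1, 0), (2, 1)])
```

A single model trained on `X` and `y` then sees both the stale feature values and their ages, so it can in principle learn a different forecast for each age instead of requiring one model per age value.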