Generation of synthetic delay time series for air transport applications
The work addresses data scarcity and privacy issues in air transport applications, though its contribution is incremental: it compares existing methods on a specific domain.
The paper tackles the generation of realistic synthetic time series of airport delays from large collections of operations in Europe and the US, and finds that a simplified Genetic Algorithm approach produces time series almost indistinguishable from real ones while maintaining high variability.
The generation of synthetic data is receiving increasing attention from the scientific community, thanks to its ability to mitigate problems such as data scarcity and privacy, and is starting to find applications in air transport. Here we tackle the problem of generating synthetic, yet realistic, time series of delays at airports, starting from large collections of operations in Europe and the US. Specifically, we compare three models: two based on state-of-the-art Deep Learning algorithms, and one simplified Genetic Algorithm approach. We show that the latter can generate time series that are almost indistinguishable from real ones, while maintaining a high variability. We further validate the resulting time series on the task of detecting delay propagation between airports. Finally, we make the synthetic data available to the scientific community.
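The abstract does not describe how the simplified Genetic Algorithm works, so the following is only a generic sketch of the idea, not the authors' method. It assumes that a candidate series is scored by how closely its mean, standard deviation and lag-1 autocorrelation match those of a real series; the function names (`summary`, `fitness`, `evolve`), the choice of statistics, and all parameter values are hypothetical.

```python
import random
import statistics


def lag1_autocorr(series):
    """Lag-1 autocorrelation of a sequence of floats."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((a - mean) ** 2 for a in series)
    return num / den if den else 0.0


def summary(series):
    """Statistics the synthetic series should reproduce (an assumed choice)."""
    return (statistics.fmean(series),
            statistics.pstdev(series),
            lag1_autocorr(series))


def fitness(candidate, target_stats):
    """Negative squared distance to the target statistics (higher is better)."""
    return -sum((c - t) ** 2 for c, t in zip(summary(candidate), target_stats))


def evolve(real, pop_size=40, generations=200, mutation_scale=2.0, seed=0):
    """Evolve one synthetic series whose summary statistics match `real`."""
    rng = random.Random(seed)
    target = summary(real)
    n = len(real)
    # Start from random series rather than copies of the real one,
    # so that no real value is carried over directly.
    population = [[rng.gauss(target[0], target[1]) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=lambda s: fitness(s, target), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # single-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)               # mutate one random position
            child[i] += rng.gauss(0.0, mutation_scale)
            children.append(child)
        population = parents + children
    return max(population, key=lambda s: fitness(s, target))


# Toy hourly mean delays in minutes (made-up values, not from the paper's data).
toy_real = [5.0, 7.5, 12.0, 15.5, 14.0, 10.0, 8.0, 6.5, 9.0, 13.0, 11.0, 7.0]
synthetic = evolve(toy_real)
```

Because the better half of the population is carried over unchanged, the best fitness never decreases from one generation to the next. Different seeds yield different series with similar statistics, which is one simple way to obtain variability across the generated set. Note that the three statistics are not normalised here, so the mean and standard deviation dominate the distance.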