ME · LG · ML · Dec 9, 2024

Variable Selection for Comparing High-dimensional Time-Series Data

arXiv:2412.06870v1 · h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses the need for efficient validation and comparison of complex simulators and emulators in fields such as fluid dynamics and traffic modeling. The advance is incremental because it builds on existing statistical methods for time-series analysis.

The paper tackles the problem of identifying which variables and time intervals differ between two high-dimensional time-series datasets, proposing an approach that splits time into subintervals and performs statistical tests. It demonstrates usefulness in validating simulators and emulators, such as comparing a deep neural network against a particle-based fluid simulator and analyzing parameter changes in a traffic simulator.

Given a pair of multivariate time-series data of the same length and dimensions, an approach is proposed to select variables and time intervals where the two series are significantly different. In applications where one time series is an output from a computationally expensive simulator, the approach may be used for validating the simulator against real data, for comparing the outputs of two simulators, and for validating a machine learning-based emulator against the simulator. With the proposed approach, the entire time interval is split into multiple subintervals, and on each subinterval, the two sample sets are compared to select variables that distinguish their distributions and a two-sample test is performed. The validity and limitations of the proposed approach are investigated in synthetic data experiments. Its usefulness is demonstrated in an application with a particle-based fluid simulator, where a deep neural network model is compared against the simulator, and in an application with a microscopic traffic simulator, where the effects of changing the simulator's parameters on traffic flows are analysed.
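The split-then-test procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the variable selection method or the two-sample test, so the sketch swaps in a per-variable permutation test on the difference in means with a Bonferroni correction, and it treats the time points inside each subinterval as exchangeable samples (an assumption that ignores autocorrelation). The names `perm_pvalue` and `compare_series`, and all parameters, are hypothetical.

```python
import random


def perm_pvalue(a, b, n_perm, rng):
    """Two-sided permutation p-value for the difference in means of a and b."""
    n = len(a)
    observed = abs(sum(a) / n - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled samples
        diff = abs(sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def compare_series(x, y, n_intervals, alpha=0.05, n_perm=500, seed=0):
    """Select differing variables on each subinterval.

    x, y: lists of T rows, each row a list of D values (same T and D).
    Returns a list of (start, end, selected_variable_indices).
    Stand-in method: per-variable permutation tests, Bonferroni over D.
    """
    T, D = len(x), len(x[0])
    rng = random.Random(seed)
    # Boundaries of n_intervals roughly equal subintervals of [0, T).
    bounds = [round(k * T / n_intervals) for k in range(n_intervals + 1)]
    results = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        selected = []
        for d in range(D):
            a = [row[d] for row in x[start:end]]
            b = [row[d] for row in y[start:end]]
            if perm_pvalue(a, b, n_perm, rng) < alpha / D:
                selected.append(d)
        results.append((start, end, selected))
    return results


# Synthetic example: variable 1 of y is shifted only in the second half.
gen = random.Random(42)
T, D = 200, 3
x = [[gen.gauss(0, 1) for _ in range(D)] for _ in range(T)]
y = [[gen.gauss(0, 1) for _ in range(D)] for _ in range(T)]
for t in range(T // 2, T):
    y[t][1] += 3.0

for start, end, selected in compare_series(x, y, n_intervals=2):
    print(f"t in [{start}, {end}): differing variables {selected}")
```

A mean-difference test only detects location shifts. A kernel-based or other distribution-level two-sample test would be needed to detect changes in variance or shape, and the choice of the number of subintervals trades time resolution against samples per test.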
