LG AI DS ML | Sep 17, 2020

Neural Rough Differential Equations for Long Time Series

James Morrill, Cristopher Salvi, Patrick Kidger, James Foster, Terry Lyons

arXiv:2009.08295v4 | 30.1187 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses memory and efficiency issues in continuous-time modeling of long time series for machine learning applications, representing an incremental advancement over existing Neural CDE methods.

The authors tackle the challenge of modeling long time series by extending neural controlled differential equations (CDEs) with rough path theory, introducing Neural Rough Differential Equations (Neural RDEs). They report significant training speed-ups, better model performance, and reduced memory usage on time series of up to 17,000 observations.

Neural controlled differential equations (CDEs) are the continuous-time analogue of recurrent neural networks, as Neural ODEs are to residual networks, and offer a memory-efficient continuous-time way to model functions of potentially irregular time series. Existing methods for computing the forward pass of a Neural CDE involve embedding the incoming time series into path space, often via interpolation, and using evaluations of this path to drive the hidden state. Here, we use rough path theory to extend this formulation. Instead of directly embedding into path space, we represent the input signal over small time intervals through its log-signature, a collection of statistics describing how the signal drives a CDE. This is the approach for solving rough differential equations (RDEs), and correspondingly we describe our main contribution as the introduction of Neural RDEs. This extension has a purpose: by generalising the Neural CDE approach to a broader class of driving signals, we demonstrate particular advantages for tackling long time series. In this regime, we demonstrate efficacy on problems of length up to 17k observations and observe significant training speed-ups, improvements in model performance, and reduced memory requirements compared to existing approaches.
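
To make the log-signature idea concrete, here is a minimal NumPy sketch of how a long series might be summarised window by window. It is not the authors' implementation. It assumes the data are linearly interpolated, truncates the log-signature at depth 2 (the total increment plus the Lévy areas between channel pairs), and uses hypothetical names (`logsig_depth2`, `windowed_logsig`, `step`) chosen only for illustration.

```python
import numpy as np

def logsig_depth2(path):
    """Depth-2 log-signature of a piecewise-linear path.

    path: array of shape (length, channels), length >= 2.
    Returns the total increment (channels values) followed by the
    Levy areas for each channel pair i < j.
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)        # one row per linear segment
    start_offsets = path[:-1] - path[0]       # segment start, relative to path start
    level1 = path[-1] - path[0]               # total increment over the window
    outer = start_offsets.T @ increments      # sum over segments of offset (x) increment
    levy = 0.5 * (outer - outer.T)            # antisymmetric part = Levy area
    i, j = np.triu_indices(path.shape[1], k=1)
    return np.concatenate([level1, levy[i, j]])

def windowed_logsig(path, step):
    """Log-signatures over consecutive windows of `step` segments.

    Adjacent windows share an endpoint, so the level-1 terms of all
    windows sum to the total increment of the full path.
    """
    path = np.asarray(path, dtype=float)
    starts = range(0, len(path) - 1, step)
    return np.stack([logsig_depth2(path[s:s + step + 1]) for s in starts])

# A unit square traversed counter-clockwise: zero net increment, Levy area 1.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]])
square_features = logsig_depth2(square)

# 17 observations of 3 channels, summarised in windows of 4 segments.
rng = np.random.default_rng(0)
series = rng.normal(size=(17, 3)).cumsum(axis=0)
features = windowed_logsig(series, step=4)    # shape (4, 6)
```

In this sketch the model would consume one feature vector per window instead of one value per observation, so the sequence it steps over shrinks by roughly a factor of `step`. That trade of sequence length for a richer per-step summary is what the abstract's speed and memory claims rest on.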
