93.7 | NA | Jun 4
Concentrated real-pole uniform-in-time approximation of the matrix exponential
Stefan Güttel, Shuai Shao
We propose an asymptotically optimal choice of shared concentrated real poles of a family of rational approximants of time-dependent exponential functions $\exp(-tz)$ for $z \geq 0$ and $t$ in a positive time interval $T$. Our result extends a classical result by J.-E. Andersson [J. Approx. Theory, 32(2):85--95, 1981] on the asymptotic best rational approximation of $\exp(-z)$ with real poles. Numerical experiments demonstrate the near-optimality of our choice for various time ranges and for both small and large approximation degrees. An application of the uniform-in-time rational approximation using our proposed concentrated real poles to a linear constant-coefficient initial-value problem is also discussed.
62.6 | NA | May 19 | Code
Scalable parallel 3-D TEM inversion via rational approximation of the matrix exponential
Ralph-Uwe Börner, Stefan Güttel, Thomas Günther
We present a novel parallel implementation for large-scale three-dimensional electromagnetic inversion based on a Gauss-Newton framework combined with a rational near-best approximation of the matrix exponential for transient simulations. The method employs parallel direct solvers for the shifted linear systems arising from the partial fraction representation of the rational approximation and demonstrates efficient parallel execution on a shared-memory architecture using MPI. A key property of the approach is that the time dependence is entirely contained in the residues of the employed rational functions, such that the computation of forward responses and sensitivities becomes effectively independent of the number of desired observation times. Model regularization is done with smoothness constraints, formulated with Raviart-Thomas elements. The linearized inverse problems are solved using LSQR with an implicit parallel Jacobian operator. Numerical experiments demonstrate the successful recovery of a synthetic 3-D conductivity structure with approximately 700,000 degrees of freedom. The study further discusses computational bottlenecks related to memory consumption and shared-memory scalability arising from the simultaneous storage of multiple sparse matrix factorizations. Possible improvements based on preconditioned iterative solvers and distributed high-performance computing architectures are outlined. The implementation in the Julia programming language is released as open-source software to support reproducible research and further development by the geophysical inversion community.
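As a rough illustration of why the cost can become independent of the number of observation times, the partial fraction form described above may be written as follows. This is only a sketch: the symbols $A$, $b$, $\xi_j$, $\omega_j(t)$ and $m$ are notation assumed here, not taken from the paper, and a possible constant term is omitted.

```latex
\exp(-tA)\,b \;\approx\; r_t(A)\,b \;=\; \sum_{j=1}^{m} \omega_j(t)\,(A - \xi_j I)^{-1} b ,
\qquad \text{poles } \xi_j \text{ independent of } t .
```

Under these assumptions, the $m$ shifted systems $(A - \xi_j I)\,x_j = b$ are solved once, and each additional observation time only requires the scalar coefficients $\omega_j(t)$ and a linear combination of the stored solutions $x_j$.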
31.0 | NA | May 8
Stabilizing randomized GMRES through flexible GMRES
Stefan Güttel, John W. Pearson
We explore the use of flexible GMRES as an outer wrapper for sketched GMRES. Building on a new bound for the residual of FGMRES in terms of the residual of the preconditioner, we derive a practical randomized solver that requires very little parameter tuning, while still being efficient and robust in the sense of generating non-increasing residual norms.
NA | Nov 3, 2017
Conversions between barycentric, RKFUN, and Newton representations of rational interpolants
Steven Elsworth, Stefan Güttel
We derive explicit formulas for converting between rational interpolants in barycentric, rational Krylov (RKFUN), and Newton form. We show applications of these conversions when working with rational approximants produced by the AAA algorithm [Y. Nakatsukasa, O. Sète, L. N. Trefethen, arXiv preprint 1612.00337, 2016] within the Rational Krylov Toolbox and for the solution of nonlinear eigenvalue problems.
13.3 | NA | Apr 30
Flexible GMRES converges in two phases
Stefan Güttel, Lauri Nyman
We derive a sharp upper bound on the residuals produced by the flexible GMRES (FGMRES) method. The bound shows that FGMRES exhibits two phases of convergence depending on the residual tolerance of the inner preconditioner. For small tolerances, the convergence of FGMRES is practically geometric with a constant rate throughout, while for looser tolerances the two-phase behavior becomes more pronounced. We also show that the derived bound cannot be improved and construct an example for which it becomes an equality.
LG | Jan 13
Fast and explainable clustering in the Manhattan and Tanimoto distance
Stefan Güttel, Kaustubh Roy
The CLASSIX algorithm is a fast and explainable approach to data clustering. In its original form, this algorithm exploits the sorting of the data points by their first principal component to truncate the search for nearby data points, with nearness being defined in terms of the Euclidean distance. Here we extend CLASSIX to other distance metrics, including the Manhattan distance and the Tanimoto distance. Instead of principal components, we use an appropriate norm of the data vectors as the sorting criterion, combined with the triangle inequality for search termination. In the case of Tanimoto distance, a provably sharper intersection inequality is used to further boost the performance of the new algorithm. On a real-world chemical fingerprint benchmark, CLASSIX Tanimoto is about 30 times faster than the Taylor--Butina algorithm, and about 80 times faster than DBSCAN, while computing higher-quality clusters in both cases.
LG | Feb 3, 2022
Fast and explainable clustering based on sorting
Xinye Chen, Stefan Güttel
We introduce a fast and explainable clustering method called CLASSIX. It consists of two phases, namely a greedy aggregation phase of the sorted data into groups of nearby data points, followed by the merging of groups into clusters. The algorithm is controlled by two scalar parameters, namely a distance parameter for the aggregation and another parameter controlling the minimal cluster size. Extensive experiments are conducted to give a comprehensive evaluation of the clustering performance on synthetic and real-world datasets, with various cluster shapes and low to high feature dimensionality. Our experiments demonstrate that CLASSIX competes with state-of-the-art clustering algorithms. The algorithm has linear space complexity and achieves near-linear time complexity on a wide range of problems. Its inherent simplicity allows for the generation of intuitive explanations of the computed clusters.
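The aggregation phase described above can be sketched roughly as follows. This is a simplified illustration, not the CLASSIX implementation: the function name `greedy_aggregate` and the parameter `radius` are hypothetical, the data scaling and the merging phase are omitted, and the sorting key is assumed to be the first principal component score with Euclidean distance.

```python
import numpy as np

def greedy_aggregate(X, radius):
    """Greedily group points after sorting by first principal component.

    Hypothetical helper for illustration; returns one group label per point.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # First principal direction from the SVD of the centered data.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    score = Xc @ vt[0]
    order = np.argsort(score)

    labels = np.full(len(X), -1)
    n_groups = 0
    for pos, i in enumerate(order):
        if labels[i] != -1:
            continue  # already assigned to an earlier group
        labels[i] = n_groups  # point i starts a new group
        for j in order[pos + 1:]:
            # The score gap is a lower bound on the Euclidean distance,
            # so the search can stop once it exceeds the radius.
            if score[j] - score[i] > radius:
                break
            if labels[j] == -1 and np.linalg.norm(X[j] - X[i]) <= radius:
                labels[j] = n_groups
        n_groups += 1
    return labels
```

The early `break` is what the sorting buys: because a projection onto a unit vector never exceeds the full distance, no point further along the sorted order can lie within `radius` of the group's starting point.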
LG | Jan 14, 2022
An efficient aggregation method for the symbolic representation of temporal data
Xinye Chen, Stefan Güttel
Symbolic representations are a useful tool for the dimension reduction of temporal data, allowing for the efficient storage of and information retrieval from time series. They can also enhance the training of machine learning algorithms on time series data through noise reduction and reduced sensitivity to hyperparameters. The adaptive Brownian bridge-based aggregation (ABBA) method is one such effective and robust symbolic representation, demonstrated to accurately capture important trends and shapes in time series. However, in its current form the method struggles to process very large time series. Here we present a new variant of the ABBA method, called fABBA. This variant utilizes a new aggregation approach tailored to the piecewise representation of time series. By replacing the k-means clustering used in ABBA with a sorting-based aggregation technique, and thereby avoiding repeated sum-of-squares error computations, the computational complexity is significantly reduced. In contrast to the original method, the new approach does not require the number of time series symbols to be specified in advance. Through extensive tests we demonstrate that the new method significantly outperforms ABBA with a considerable reduction in runtime while also outperforming the popular SAX and 1d-SAX representations in terms of reconstruction accuracy. We further demonstrate that fABBA can compress other data types such as images.
LG | Nov 19, 2021
Machine Learning-Based Soft Sensors for Vacuum Distillation Unit
Kamil Oster, Stefan Güttel, Lu Chen et al.
Product quality assessment in the petroleum processing industry can be difficult and time-consuming, e.g. due to the manual collection of liquid samples from the plant and subsequent chemical laboratory analysis of the samples. The product quality is an important property that informs whether the products of the process are within the specifications. In particular, the delays caused by sample processing (collection, laboratory measurements, results analysis, reporting) can lead to detrimental economic effects. One of the strategies to deal with this problem is soft sensors. Soft sensors are a collection of models that can be used to predict and forecast some infrequently measured properties (such as laboratory measurements of petroleum products) based on more frequent measurements of quantities like temperature, pressure and flow rate provided by physical sensors. Soft sensors short-cut the pathway to obtain relevant information about the product quality, often providing measurements as frequently as every minute. One of the applications of soft sensors is the real-time optimization of a chemical process by a targeted adaptation of operating parameters. Models used for soft sensors can have various forms; among the most common are those based on artificial neural networks (ANNs). While soft sensors can deal with some of the issues in the refinery processes, their development and deployment can pose other challenges that are addressed in this paper. Firstly, it is important to enhance the quality of both sets of data (laboratory measurements and physical sensors) in a data pre-processing stage (as described in the Methodology section). Secondly, once the data sets are pre-processed, different models need to be tested against prediction error and the model's interpretability. In this work, we present a framework for soft sensor development from raw data to ready-to-use models.
LG | Jul 5, 2021
A comparison of LSTM and GRU networks for learning symbolic sequences
Roberto Cahuantzi, Xinye Chen, Stefan Güttel
We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of string sequences they are able to memorize. Symbolic sequences of different complexity are generated to simulate RNN training and study parameter configurations with a view to the network's capability of learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs). We find that an increase in RNN depth does not necessarily result in better memorization capability when the training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to be tuned. Generally, GRUs outperform LSTM networks on low-complexity sequences while on high-complexity sequences LSTMs perform better.
AP | Jun 17, 2021
Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit
Kamil Oster, Stefan Güttel, Jonathan L. Shapiro et al.
Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One commonly used data pre-treatment technique is outlier detection. The so-called 3$\sigma$ method is a common practice for identifying outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall statistics of the data. This problem can have a significant impact on further data analysis and can lead to a reduction in the accuracy of predictive models. There is a plethora of techniques for outlier detection; however, aside from theoretical work, they all require case study work. Two types of outliers were considered: short-term outliers (erroneous data, noise) and long-term outliers (e.g. malfunctioning for longer periods). The data used were taken from the vacuum distillation unit (VDU) of an Asian refinery and included 40 physical sensors (temperature, pressure and flow rate). We used a modified method for 3$\sigma$ thresholds to identify the short-term outliers, i.e. sensor data are divided into chunks determined by change points, and 3$\sigma$ thresholds are calculated within each chunk, which represents a near-normal distribution. We have shown that the piecewise 3$\sigma$ method offers a better approach to short-term outlier detection than the 3$\sigma$ method applied to the entire time series. Nevertheless, this does not perform well for long-term outliers (which can represent another state in the data). In this case, we used principal component analysis (PCA) with Hotelling's $T^2$ statistics to identify the long-term outliers. The results obtained with PCA were then subjected to the DBSCAN clustering method. The outliers (which were visually obvious and correctly detected by the PCA method) were also correctly identified by DBSCAN, which supported the consistency and accuracy of the PCA method.
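The piecewise thresholding idea described above might look roughly like this. It is a minimal sketch under stated assumptions: the function name `piecewise_three_sigma` is hypothetical, the change points are assumed to be given as indices (how they are detected is not specified here), and the population standard deviation is used within each chunk.

```python
import numpy as np

def piecewise_three_sigma(x, change_points):
    """Flag points lying more than 3 sigma from the mean of their own chunk.

    Hypothetical helper: `change_points` are indices where a new chunk starts.
    """
    x = np.asarray(x, dtype=float)
    flags = np.zeros(len(x), dtype=bool)
    edges = [0, *sorted(change_points), len(x)]
    for start, stop in zip(edges[:-1], edges[1:]):
        chunk = x[start:stop]
        if len(chunk) < 2:
            continue  # too few points to estimate a spread
        mu, sigma = chunk.mean(), chunk.std()
        flags[start:stop] = np.abs(chunk - mu) > 3 * sigma
    return flags

# Toy signal with a level shift at index 50 and one spike in the first chunk.
x = np.concatenate([np.zeros(50), 10.0 * np.ones(50)])
x[20] = 5.0
flags = piecewise_three_sigma(x, change_points=[50])
global_flags = np.abs(x - x.mean()) > 3 * x.std()
```

In this toy signal the spike at index 20 sits close to the global mean, so a single 3$\sigma$ threshold over the whole series does not flag it, while the per-chunk threshold does.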
LG | Mar 27, 2020
ABBA: Adaptive Brownian bridge-based symbolic aggregation of time series
Steven Elsworth, Stefan Güttel
A new symbolic representation of time series, called ABBA, is introduced. It is based on an adaptive polygonal chain approximation of the time series into a sequence of tuples, followed by a mean-based clustering to obtain the symbolic representation. We show that the reconstruction error of this representation can be modelled as a random walk with pinned start and end points, a so-called Brownian bridge. This insight allows us to make ABBA essentially parameter-free, except for the approximation tolerance which must be chosen. Extensive comparisons with the SAX and 1d-SAX representations are included in the form of performance profiles, showing that ABBA is able to better preserve the essential shape information of time series compared to other approaches. Advantages and applications of ABBA are discussed, including its in-built differencing property and use for anomaly detection, and Python implementations are provided.
LG | Mar 12, 2020
Time Series Forecasting Using LSTM Networks: A Symbolic Approach
Steven Elsworth, Stefan Güttel
Machine learning methods trained on raw numerical time series data exhibit fundamental limitations such as a high sensitivity to the hyperparameters and even to the initialization of random weights. A combination of a recurrent neural network with a dimension-reducing symbolic representation is proposed and applied for the purpose of time series forecasting. It is shown that the symbolic representation can help to alleviate some of the aforementioned problems and, in addition, might allow for faster training without sacrificing the forecast performance.
NA | Jul 22, 2015
Near-optimal perfectly matched layers for indefinite Helmholtz problems
Vladimir Druskin, Stefan Güttel, Leonid Knizhnerman
A new construction of an absorbing boundary condition for indefinite Helmholtz problems on unbounded domains is presented. This construction is based on a near-best uniform rational interpolant of the inverse square root function on the union of a negative and positive real interval, designed with the help of a classical result by Zolotarev. Using Krein's interpretation of a Stieltjes continued fraction, this interpolant can be converted into a three-term finite difference discretization of a perfectly matched layer (PML) which converges exponentially fast in the number of grid points. The convergence rate is asymptotically optimal for both propagative and evanescent wave modes. Several numerical experiments and illustrations are included.