58.9 · NA · May 16
Rational approximation and intrinsic Gaussian processes
Christopher Beattie, David Higdon, Leanna House et al.
Gaussian processes (GPs) defined through intrinsic random fields provide a flexible framework for modeling spatial phenomena, and have been advocated in a variety of applications over the past several decades. Nevertheless, their adoption has lagged behind traditional, covariance-based approaches, in part because the intrinsic formulation has lacked an accompanying toolkit of computational methods and dependence specifications that facilitate fitting and prediction. We develop here a systematic framework for modeling intrinsic GPs and introduce practical algorithms and dependence/variogram models for modeling, inference and computation that parallel those of traditional, stationary GPs. We explore a close connection between intrinsic GP models and rational approximation, which clarifies the underlying problem structure. Numerical examples illustrate how the new tools can be deployed in practice, highlighting the advantages of intrinsic-field modeling in terms of robustness, interpretability, and computational efficiency.
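As background for the variogram models mentioned above: an intrinsic random field is specified through its increments, summarized by the semivariogram γ(h) = ½ Var[Z(s + h) − Z(s)], rather than through a covariance function. The sketch below is not the authors' algorithm. It computes the classical method-of-moments semivariogram estimate for a regularly spaced 1-D series, and the function name and toy data are assumptions made for illustration only.

```python
def empirical_semivariogram(values, max_lag):
    """Method-of-moments semivariogram for a regularly spaced 1-D series.

    gamma_hat(h) = sum_i (z[i + h] - z[i]) ** 2 / (2 * N_h),
    where N_h is the number of pairs separated by lag h.
    """
    gamma = {}
    for lag in range(1, max_lag + 1):
        sq_diffs = [
            (values[i + lag] - values[i]) ** 2
            for i in range(len(values) - lag)
        ]
        if sq_diffs:  # skip lags longer than the series
            gamma[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma


# Hypothetical toy data: a pure linear trend.
series = [0.0, 1.0, 2.0, 3.0, 4.0]
gamma = empirical_semivariogram(series, max_lag=2)

# Shifting every observation by a constant leaves the estimate unchanged,
# because only increments enter the computation.
shifted = empirical_semivariogram([v + 10.0 for v in series], max_lag=2)
```

Because the estimate depends only on differences, it stays well defined for fields whose mean level is unknown, which is the situation the intrinsic formulation is meant to handle.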
LG · Jun 12, 2020
Smartphone Transportation Mode Recognition Using a Hierarchical Machine Learning Classifier and Pooled Features From Time and Frequency Domains
Huthaifa I. Ashqar, Mohammed H. Almannaa, Mohammed Elhenawy et al.
This paper develops a novel two-layer hierarchical classifier that increases the accuracy of traditional transportation mode classification algorithms. It also enhances classification accuracy by extracting new frequency domain features. Many researchers have obtained these features from global positioning system (GPS) data; however, GPS data were excluded in this paper because GPS use can deplete the smartphone's battery and the signal may be lost in some areas. Our proposed two-layer framework differs from previous classification attempts in three distinct ways: 1) the outputs of the two layers are combined using Bayes' rule to choose the transportation mode with the largest posterior probability; 2) the proposed framework combines the newly extracted features with traditionally used time domain features to create a pool of features; and 3) a different subset of extracted features is used in each layer based on the classified modes. Several machine learning techniques were used, including k-nearest neighbor, classification and regression tree, support vector machine, random forest, and a heterogeneous framework of random forest and support vector machine. Results show that the proposed framework achieves higher classification accuracy than traditional approaches. Transforming the time domain features to the frequency domain also adds new features in a new space and provides more control over the loss of information. Consequently, combining the time domain and the frequency domain features in a large pool and then choosing the best subset results in higher accuracy than using either domain alone. The proposed two-layer classifier obtained a maximum classification accuracy of 97.02%.
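One way to read point 1) above is as a sketch in which the first layer outputs a posterior over coarse groups, the second layer outputs a posterior over modes within each group, and the two are multiplied and summed to give a posterior over modes. The abstract does not give the exact formula, so this is an assumption, and the group names, mode names, and probabilities below are hypothetical.

```python
def combine_layers(group_posterior, mode_given_group):
    """Combine two layers into a posterior over modes.

    P(mode | x) = sum over groups of P(group | x) * P(mode | group, x)
    """
    combined = {}
    for group, p_group in group_posterior.items():
        for mode, p_mode in mode_given_group[group].items():
            combined[mode] = combined.get(mode, 0.0) + p_group * p_mode
    return combined


def predict_mode(group_posterior, mode_given_group):
    """Return the mode with the largest combined posterior, plus the posterior."""
    combined = combine_layers(group_posterior, mode_given_group)
    best = max(combined, key=combined.get)
    return best, combined


# Hypothetical layer outputs for a single window of sensor data.
layer1 = {"motorized": 0.7, "non_motorized": 0.3}
layer2 = {
    "motorized": {"car": 0.6, "bus": 0.4},
    "non_motorized": {"walk": 0.2, "bike": 0.8},
}

best_mode, posterior = predict_mode(layer1, layer2)
```

Keeping the second layer conditional on the group lets each layer use its own feature subset, as point 3) describes, while the final decision is still made on a single posterior.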
CY · Jun 12, 2020
Modeling bike availability in a bike-sharing system using machine learning
Huthaifa I. Ashqar, Mohammed Elhenawy, Mohammed H. Almannaa et al.
This paper models the availability of bikes at San Francisco Bay Area Bike Share stations using machine learning algorithms. Random Forest (RF) and Least-Squares Boosting (LSBoost) were used as univariate regression algorithms, and Partial Least-Squares Regression (PLSR) was applied as a multivariate regression algorithm. The univariate models were used to model the number of available bikes at each station. PLSR was applied to reduce the number of required prediction models and reflect the spatial correlation between stations in the network. Results show that the univariate models have lower prediction errors than the multivariate model. However, the multivariate model results are reasonable for networks with a relatively large number of spatially correlated stations. Results also show that station neighbors and the prediction horizon are significant predictors. The prediction horizon that produced the lowest prediction error was 15 minutes.
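To make the notion of a prediction horizon concrete, the sketch below pairs each observed bike count with the count a fixed number of steps later and scores a persistence baseline (predict that the count does not change) by mean absolute error. This baseline is not one of the paper's models (RF, LSBoost, PLSR). The counts, the 5-minute sampling interval, and the function names are hypothetical.

```python
def make_horizon_pairs(counts, horizon_steps):
    """Pair each observation with the value `horizon_steps` samples later."""
    return [
        (counts[t], counts[t + horizon_steps])
        for t in range(len(counts) - horizon_steps)
    ]


def persistence_mae(counts, horizon_steps):
    """Mean absolute error of predicting 'no change' over the horizon."""
    pairs = make_horizon_pairs(counts, horizon_steps)
    return sum(abs(target - current) for current, target in pairs) / len(pairs)


# Hypothetical counts at one station, assumed to be sampled every 5 minutes.
counts = [10, 9, 9, 7, 6, 6, 8, 11]

# With 5-minute samples, a 15-minute horizon corresponds to 3 steps.
pairs_15min = make_horizon_pairs(counts, horizon_steps=3)
mae_15min = persistence_mae(counts, horizon_steps=3)
```

A learned model would replace the persistence rule with a regressor fitted on the `current` side of each pair, optionally augmented with counts from neighboring stations, and would be scored on the same targets.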