Jeremy Levesley

NA
h-index: 21
Papers: 11
Citations: 164
Novelty: 25%
AI Score: 21

11 Papers

NA | Apr 18, 2012
Multilevel Sparse Kernel-Based Interpolation

Emmanuil H. Georgoulis, Jeremy Levesley, Fazli Subhan

A multilevel kernel-based interpolation method, suitable for moderately high-dimensional function interpolation problems, is proposed. The method, termed multilevel sparse kernel-based interpolation (MLSKI, for short), uses both level-wise and direction-wise multilevel decomposition of structured (or mildly unstructured) interpolation data sites in conjunction with the application of kernel-based interpolants with different scaling in each direction. The multilevel interpolation algorithm is based on a hierarchical decomposition of the data sites, whereby at each level the detail is added to the interpolant by interpolating the resulting residual of the previous level. On each level, anisotropic radial basis functions are used for solving a number of small interpolation problems, which are subsequently linearly combined to produce the interpolant. MLSKI can be viewed as an extension of $d$-boolean interpolation (which is closely related to ideas in the sparse grid and hyperbolic crosses literature) to kernel-based functions, within the hierarchical multilevel framework to achieve accelerated convergence. Numerical experiments suggest that the new algorithm is numerically stable and efficient for the reconstruction of large data in $\mathbb{R}^{d}\times \mathbb{R}$, for $d = 2, 3, 4$, with tens or even hundreds of thousands of data points. Also, MLSKI appears to be generally superior to classical radial basis function methods in terms of complexity, run time and convergence, at least for large data sets.
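
The residual-correction idea described in this abstract can be sketched in a few lines. The example below is a one-dimensional simplification only: it uses nested equispaced nodes on $[0,1]$ and isotropic Gaussians, and it omits the sparse-grid and anisotropic components of MLSKI. The function names, the choice of kernel width equal to the node spacing, and the test function are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def gaussian_matrix(x, centres, width):
    """Kernel matrix with entries exp(-((x_i - c_j) / width)^2)."""
    return np.exp(-((x[:, None] - centres[None, :]) / width) ** 2)

def evaluate(terms, x):
    """Sum the contributions of all levels at the points x."""
    total = np.zeros_like(x, dtype=float)
    for centres, width, coeffs in terms:
        total += gaussian_matrix(x, centres, width) @ coeffs
    return total

def multilevel_interpolate(f, levels):
    """Return a list of (centres, width, coefficients), one entry per level."""
    terms = []
    for level in levels:
        nodes = np.linspace(0.0, 1.0, 2 ** level + 1)
        width = 1.0 / 2 ** level  # kernel width tied to node spacing (assumption)
        # Residual of the interpolant built so far, sampled at the new nodes.
        residual = f(nodes) - evaluate(terms, nodes)
        coeffs = np.linalg.solve(gaussian_matrix(nodes, nodes, width), residual)
        terms.append((nodes, width, coeffs))
    return terms

# Hypothetical test function; any smooth function on [0, 1] would do.
f = lambda x: np.sin(2.0 * np.pi * x)
terms = multilevel_interpolate(f, levels=range(1, 7))
finest_nodes = terms[-1][0]
nodal_error = np.max(np.abs(evaluate(terms, finest_nodes) - f(finest_nodes)))
```

Because each level interpolates the residual left by the previous levels, the accumulated sum reproduces `f` at the finest nodes up to rounding error. Scaling the kernel width with the node spacing keeps each linear system well conditioned in this sketch.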

NA | Jan 14, 2015
Fast multilevel sparse Gaussian kernels for high-dimensional approximation and integration

Zhaonan Dong, Emmanuil H. Georgoulis, Jeremy Levesley et al.

A fast multilevel algorithm based on directionally scaled tensor-product Gaussian kernels on structured sparse grids is proposed for interpolation of high-dimensional functions and for the numerical integration of high-dimensional integrals. The algorithm is based on the recent Multilevel Sparse Kernel-based Interpolation (MLSKI) method (Georgoulis, Levesley & Subhan, SIAM J. Sci. Comput., 35(2), pp. A815-A831, 2013), with particular focus on the fast implementation of Gaussian-based MLSKI for interpolation and integration problems of high-dimensional functions $f:[0,1]^d\to\mathbb{R}$, with $5\le d\le 10$. The MLSKI interpolation procedure is shown to be interpolatory and a fast implementation is proposed. More specifically, exploiting the tensor-product nature of anisotropic Gaussian kernels, one-dimensional cardinal basis functions on a sequence of hierarchical equidistant nodes are precomputed to machine precision, rendering the interpolation problem into a fully parallelisable ensemble of linear combinations of function evaluations. A numerical integration algorithm is also proposed, based on interpolating the (high-dimensional) integrand. A series of numerical experiments highlights the applicability of the proposed algorithm for interpolation and integration for up to 10-dimensional problems.

NA | Oct 19, 2017
Multilevel sparse grids collocation for linear partial differential equations, with tensor product smooth basis functions

Yangzhang Zhao, Qi Zhang, Jeremy Levesley

Radial basis functions have become a popular tool for approximation and solution of partial differential equations (PDEs). The multilevel sparse interpolation with kernels (MuSIK) algorithm recently proposed in \cite{Georgoulis} shows good convergence. In this paper we use a sparse kernel basis for the solution of PDEs by collocation. We use the form of approximation proposed and developed by Kansa \cite{Kansa1986}, and give numerical examples using a tensor product basis with the multiquadric (MQ) and Gaussian basis functions. This paper is novel in that we consider space-time PDEs in four dimensions using an easy-to-implement algorithm, with smooth approximations. The accuracy observed numerically is as good, with respect to the number of data points used, as other methods in the literature; see \cite{Langer1,Wang1}.

CL | May 31, 2022
An Informational Space Based Semantic Analysis for Scientific Texts

Neslihan Suzen, Alexander N. Gorban, Jeremy Levesley et al.

One major problem in Natural Language Processing is the automatic analysis and representation of human language. Human language is ambiguous, and a deeper understanding of semantics and the creation of human-to-machine interaction have required effort in creating schemes for the act of communication and in building common-sense knowledge bases for the 'meaning' in texts. This paper introduces computational methods for semantic analysis and for quantifying the meaning of short scientific texts. Computational methods for extracting semantic features are used to analyse the relations between texts of messages and 'representations of situations' for a newly created large collection of scientific texts, the Leicester Scientific Corpus. The representation of scientific-specific meaning is standardised by replacing the situation representations, rather than psychological properties, with vectors of attributes: a list of the scientific subject categories that the text belongs to. First, this paper introduces 'Meaning Space', in which the informational representation of the meaning is extracted from the occurrence of the word in texts across the scientific categories, i.e., the meaning of a word is represented by a vector of Relative Information Gain about the subject categories. Then, the meaning space is statistically analysed for the Leicester Scientific Dictionary-Core and we investigate 'Principal Components of the Meaning' to describe the adequate dimensions of the meaning. The research in this paper lays the foundation for the geometric representation of the meaning of texts.
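
The abstract does not spell out how Relative Information Gain is computed, so the sketch below uses the standard definition as an assumption: the information gain about a binary "text belongs to category" variable from a binary "text contains word" variable, divided by the entropy of the category variable. The function and argument names are hypothetical.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def relative_information_gain(n_word_cat, n_word, n_cat, n_total):
    """RIG about a category from word presence, computed from text counts.

    n_word_cat: texts in the category that contain the word
    n_word:     texts that contain the word
    n_cat:      texts in the category
    n_total:    all texts
    """
    h_cat = entropy([n_cat / n_total, 1 - n_cat / n_total])
    if h_cat == 0:
        return 0.0  # the category carries no uncertainty to reduce
    h_conditional = 0.0
    # Two branches: texts with the word, then texts without it.
    for n_w, n_wc in ((n_word, n_word_cat),
                      (n_total - n_word, n_cat - n_word_cat)):
        if n_w > 0:
            p_c = n_wc / n_w
            h_conditional += (n_w / n_total) * entropy([p_c, 1 - p_c])
    return (h_cat - h_conditional) / h_cat
```

Under this definition a word that appears in exactly the texts of one category has RIG 1 for that category, and a word whose presence is independent of the category has RIG 0. Stacking the values over all categories would give one vector per word, which is the kind of representation the abstract calls a point in Meaning Space.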

NA | Mar 12, 2017
Convergence of Multilevel Stationary Gaussian Quasi-Interpolation

Simon Hubbert, Jeremy Levesley

In this paper we present a new multilevel quasi-interpolation algorithm for smooth periodic functions using scaled Gaussians as basis functions. Recent research in this area has focussed upon implementations using basis functions with finite smoothness. In this paper we deliver the first error estimates for the multilevel algorithm using analytic basis functions. The estimate has two parts, one involving the convergence of a low degree polynomial truncation term and one involving the control of the remainder of the truncation as the algorithm proceeds. Thus, numerically one observes a convergent scheme. Numerical results suggest that the scheme converges much faster than the theory shows.
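
A rough sketch of multilevel Gaussian quasi-interpolation for a 1-periodic function follows. It is not the scheme analysed in the paper, whose details the abstract does not give. It assumes the common normalised form $\psi(t) = e^{-t^2/D}/\sqrt{\pi D}$ with an assumed shape parameter $D = 2$, equispaced nodes $j/n$, and a residual correction on each finer grid. All names are hypothetical.

```python
import numpy as np

D = 2.0  # Gaussian shape parameter (assumed value)

def quasi_interpolate(values, n, x):
    """Gaussian quasi-interpolant from samples at the n nodes j/n, 1-periodic."""
    h = 1.0 / n
    nodes = np.arange(n) * h
    diff = x[:, None] - nodes[None, :]
    # Include the neighbouring periodic images of each Gaussian.
    weights = sum(np.exp(-((diff + k) / h) ** 2 / D) for k in (-1, 0, 1))
    return weights @ values / np.sqrt(np.pi * D)

def multilevel_quasi_interpolate(f, node_counts, x):
    """Approximate f at x, correcting the residual on each finer grid."""
    levels = []  # (n, residual samples) for each level built so far

    def evaluate(points):
        total = np.zeros_like(points, dtype=float)
        for n, samples in levels:
            total += quasi_interpolate(samples, n, points)
        return total

    errors = []
    for n in node_counts:
        nodes = np.arange(n) / n
        # Sample what the previous levels failed to capture; no linear solve needed.
        levels.append((n, f(nodes) - evaluate(nodes)))
        errors.append(np.max(np.abs(f(x) - evaluate(x))))
    return evaluate(x), errors

x = np.linspace(0.0, 1.0, 200, endpoint=False)
approx, errors = multilevel_quasi_interpolate(
    lambda t: np.sin(2.0 * np.pi * t), [8, 16, 32, 64], x)
```

Unlike interpolation, each level here only samples the current residual and weights the samples by shifted Gaussians, so no algebraic system is solved. The list `errors` records the maximum error on the evaluation grid after each level.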

NA | Mar 5, 2017
Approximation of exponential-type functions on a uniform grid by shifts of a basis function

Alexander Kushpel, Jeremy Levesley, Xingping Sun

In this paper, we study the problem of interpolating a continuous function at $(n+1)$ equally-spaced points in the interval $[0,1]$, using shifts of a kernel on the $(1/n)$-spaced infinite grid. The archetypal example here is approximation using shifts of a Gaussian kernel. We present new results concerning interpolation of functions of exponential type, in particular, polynomials on the integer grid, as a step en route to solving the general interpolation problem. For the Gaussian kernel we introduce a new class of polynomials, closely related to the probabilistic Hermite polynomials, and show that evaluations of the polynomials at the integer points provide the coefficients of the interpolants. Taking a cue from classical Newton polynomial interpolation, we derive a closed formula for the Gaussian interpolant of a continuous function on a uniform grid in the unit interval.

LG | Feb 9, 2024
What is Hiding in Medicine's Dark Matter? Learning with Missing Data in Medical Practices

Neslihan Suzen, Evgeny M. Mirkes, Damian Roland et al.

Electronic patient records (EPRs) produce a wealth of data but contain significant missing information. Understanding and handling this missing data is an important part of clinical data analysis and, if left unaddressed, could result in bias in analysis and distortion in critical conclusions. Missing data may be linked to health care professional practice patterns, and imputation of missing data can increase the validity of clinical decisions. This study focuses on statistical approaches for understanding and interpreting the missing data and on machine learning based clinical data imputation, using a single centre's paediatric emergency data and data from the UK's largest clinical audit database for traumatic injury (TARN). In the study of 56,961 data points related to initial vital signs and observations taken on children presenting to an Emergency Department, we show that missing data are likely to be non-random and how they are linked to health care professional practice patterns. We then examine 79 TARN fields with missing values for 5,791 trauma cases. Singular Value Decomposition (SVD) and k-Nearest Neighbour (kNN) based missing data imputation methods are used, and the imputation results are compared against the original dataset and statistically tested. We conclude that the 1NN imputer is the best imputation method, which indicates a usual pattern of clinical decision making: find the most similar patients and take their attributes as imputation.
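
The 1NN idea in the final sentence ("find the most similar patients and take their attributes") can be illustrated with a small sketch. This is not the study's implementation: it assumes a purely numeric table, uses only fully observed rows as donors, and measures similarity by unscaled Euclidean distance over the fields observed in the incomplete row. The function name and the toy values are hypothetical.

```python
import numpy as np

def impute_1nn(data):
    """Fill NaNs in each row with values from its nearest complete row (1NN)."""
    data = np.asarray(data, dtype=float)
    complete = data[~np.isnan(data).any(axis=1)]  # donor pool: rows with no gaps
    if complete.size == 0:
        raise ValueError("need at least one complete row to act as donor")
    filled = data.copy()
    for i, row in enumerate(data):
        missing = np.isnan(row)
        if not missing.any():
            continue
        # Distance uses only the fields observed in this row.
        gaps = complete[:, ~missing] - row[~missing]
        distances = np.sqrt((gaps ** 2).sum(axis=1))
        donor = complete[np.argmin(distances)]
        filled[i, missing] = donor[missing]
    return filled

# Toy table with one missing entry (made-up numbers, not study data).
records = np.array([[1.0, 2.0, 3.0],
                    [1.1, 2.1, np.nan],
                    [10.0, 20.0, 30.0]])
imputed = impute_1nn(records)
```

In practice the fields would usually be standardised before computing distances so that no single field dominates; that step is left out here to keep the sketch short.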

CL | Apr 26, 2021
Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles

Neslihan Suzen, Alexander Gorban, Jeremy Levesley et al.

Can the analysis of the semantics of words used in the text of a scientific paper predict its future impact measured by citations? This study details examples of automated text classification that achieved an 80% success rate in distinguishing between highly-cited and little-cited articles. Automated intelligent systems allow the identification of promising works that could become influential in the scientific community. The problems of quantifying the meaning of texts and representation of human language have been clear since the inception of Natural Language Processing. This paper presents a novel method for vector representation of text meaning based on information theory and shows how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus. We describe the experimental framework used to evaluate the impact of scientific articles through their informational semantics. Our interest is in citation classification to discover how important the semantics of texts are in predicting the citation count. We propose the semantics of texts as an important factor for citation prediction. For each article, our system extracts the abstract of the paper, represents the words of the abstract as vectors in Meaning Space, automatically analyses the distribution of scientific categories (Web of Science categories) within the text of the abstract, and then classifies papers according to citation counts (highly-cited, little-cited). We show that an informational approach to representing the meaning of a text offers a way to effectively predict the scientific impact of research papers.

CL | Sep 18, 2020
Principal Components of the Meaning

Neslihan Suzen, Alexander Gorban, Jeremy Levesley et al.

In this paper we argue that (lexical) meaning in science can be represented in a 13-dimensional Meaning Space. This space is constructed using principal component analysis (singular value decomposition) on the matrix of word category relative information gains, where the categories are those used by the Web of Science, and the words are taken from a reduced word set from texts in the Web of Science. We show that this reduced word set plausibly represents all texts in the corpus, so that the principal component analysis has some objective meaning with respect to the corpus. We argue that 13 dimensions are adequate to describe the meaning of scientific texts, and hypothesise about the qualitative meaning of the principal components.
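
As a generic illustration of principal component analysis via the singular value decomposition, the sketch below centres the columns of a words-by-categories matrix and reports how much variance each component explains. The matrix is synthetic and the function name is hypothetical. How the paper selects 13 dimensions is not stated in the abstract and is not reproduced here.

```python
import numpy as np

def principal_components(matrix):
    """PCA by SVD of the column-centred matrix.

    Returns the component directions (as rows) and the fraction of
    variance explained by each component, largest first.
    """
    centred = matrix - matrix.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centred, full_matrices=False)
    variance = singular_values ** 2
    return components, variance / variance.sum()

# Synthetic stand-in: 200 "words" by 6 "categories" with two underlying factors.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
components, explained = principal_components(scores @ loadings)
```

Since the synthetic matrix has only two underlying factors, the first two entries of `explained` account for essentially all of the variance. On real data the same output supports choosing a dimension by inspecting where the explained variance tails off.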

CL | Jul 27, 2018
Automatic Short Answer Grading and Feedback Using Text Mining Methods

Neslihan Suzen, Alexander Gorban, Jeremy Levesley et al.

Automatic grading is not a new approach, but the need to adapt the latest technology to automatic grading has become very important. As the technology has rapidly become more powerful at scoring exams and essays, especially from the 1990s onwards, partially or wholly automated grading systems using computational methods have evolved and have become a major area of research. In particular, the demand for scoring of natural language responses has created a need for tools that can be applied to automatically grade these responses. In this paper, we focus on the concept of automatic grading of short answer questions, such as are typical in the UK GCSE system, and on providing students with useful feedback on their answers. We present experimental results on a dataset from the introductory computer science class at the University of North Texas. We first apply standard data mining techniques to the corpus of student answers for the purpose of measuring similarity between the student answers and the model answer. This is based on the number of common words. We then evaluate the relation between these similarities and the marks awarded by scorers. We then consider an approach that groups student answers into clusters. Each cluster would be awarded the same mark, and the same feedback given to each answer in a cluster. In this manner, we demonstrate that clusters indicate the groups of students who are awarded the same or similar scores. Words in each cluster are compared to show that clusters are constructed based on how many and which words of the model answer have been used. The main novelty in this paper is that we design a model to predict marks based on the similarities between the student answers and the model answer.
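
The similarity measure "based on the number of common words" can be sketched as below. The abstract does not describe the preprocessing, so the tokenisation (lower-casing, splitting on non-alphanumeric characters, no stop-word removal or stemming) is an assumption, and the function names and example sentences are hypothetical.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def common_word_similarity(student_answer, model_answer):
    """Number of distinct model-answer words that the student answer also uses."""
    return len(tokenize(student_answer) & tokenize(model_answer))

# Made-up example answers, not taken from the dataset.
model = "A stack is a last in first out data structure"
student = "A stack stores data in last in first out order"
score = common_word_similarity(student, model)
```

A count like this can be compared against the marks awarded by scorers, or used as a feature when grouping answers into clusters, which is the direction the abstract describes.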

NA | Aug 2, 2016
Quasi-interpolation on a sparse grid with Gaussian

Fuat Usta, Jeremy Levesley

Motivated by the recent multilevel sparse kernel-based interpolation (MuSIK) algorithm proposed in [Georgoulis, Levesley and Subhan, SIAM J. Sci. Comput., 35(2), pp. A815-A831, 2013], we introduce the new quasi-multilevel sparse interpolation with kernels (Q-MuSIK) via the combination technique. The Q-MuSIK scheme achieves better convergence and run time than classical quasi-interpolation. It is also generally superior to the MuSIK methods in terms of run time, particularly in high-dimensional interpolation problems, since there is no need to solve large algebraic systems. We subsequently propose a fast, low complexity, high-dimensional quadrature formula based on Q-MuSIK interpolation of the integrand. We present the results of numerical experimentation for both interpolation and quadrature in high dimension.