PR Jan 18, 2018
Doubling Algorithms for Stationary Distributions of Fluid Queues: A Probabilistic Interpretation
Nigel Bean, Giang T. Nguyen, Federico Poloni
Fluid queues are mathematical models frequently used in stochastic modelling. Their stationary distributions involve a key matrix recording the conditional probabilities of returning to an initial level from above, often known in the literature as the matrix $Ψ$. Here, we present a probabilistic interpretation of the family of algorithms known as doubling, which are currently the most effective algorithms for computing the return probability matrix $Ψ$. To this end, we first revisit the links described in \cite{ram99, soares02} between fluid queues and Quasi-Birth-Death processes; in particular, we give new probabilistic interpretations for these connections. We generalize this framework to give a probabilistic meaning for the initial step of doubling algorithms, and also include an interpretation for the iterative step of these algorithms. Our work is the first probabilistic interpretation available for doubling algorithms.
AP Apr 15, 2019
A framework for streamlined statistical prediction using topic models
Vanessa Glenny, Jonathan Tuke, Nigel Bean et al.
In the Humanities and Social Sciences, there is increasing interest in approaches to information extraction, prediction, intelligent linkage, and dimension reduction applicable to large text corpora. With approaches in these fields being grounded in traditional statistical techniques, the need arises for frameworks whereby advanced NLP techniques such as topic modelling may be incorporated within classical methodologies. This paper provides a classical, supervised, statistical learning framework for prediction from text, using topic models as a data reduction method and the topics themselves as predictors, alongside typical statistical tools for predictive modelling. We illustrate this framework with applications in a Social Sciences context (applied animal behaviour) and a Humanities context (narrative analysis). The results show that topic regression models perform comparably to their much less efficient equivalents that use individual words as predictors.
CY Sep 22, 2018
Pachinko Prediction: A Bayesian method for event prediction from social media data
Jonathan Tuke, Andrew Nguyen, Mehwish Nasim et al.
The combination of large open data sources with machine learning approaches presents a potentially powerful way to predict events such as protest or social unrest. However, accounting for uncertainty in such models, particularly when using diverse, unstructured datasets such as social media, is essential to guarantee the appropriate use of such methods. Here we develop a Bayesian method for predicting social unrest events in Australia using social media data. This method uses machine learning to classify individual social media posts as relevant or not, and an empirical Bayesian approach to calculate posterior event probabilities. We use the method to predict events in Australian cities over a period in 2017/18.
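The abstract above does not specify the model, so the following is only a minimal illustrative sketch of what an empirical Bayesian posterior event probability can look like, not the authors' method. It assumes a hypothetical daily binary signal (for example, "the number of posts classified as relevant exceeded some threshold") and a hypothetical history of (signal, event) pairs; the function name, its parameters, and the add-one smoothing are all assumptions made for illustration.

```python
from typing import List, Tuple


def posterior_event_probability(
    history: List[Tuple[bool, bool]],
    signal_today: bool,
    smoothing: float = 1.0,
) -> float:
    """Return P(event | today's signal) using rates estimated from history.

    `history` holds hypothetical (signal, event) pairs for past days.
    """
    n_days = len(history)
    n_event = sum(1 for _, event in history if event)
    n_quiet = n_days - n_event

    # Prior estimated from the data itself (the "empirical" part), smoothed
    # so that it never collapses to exactly 0 or 1.
    prior = (n_event + smoothing) / (n_days + 2 * smoothing)

    # Smoothed likelihood of a raised signal on event days and on quiet days.
    sig_event = sum(1 for signal, event in history if signal and event)
    sig_quiet = sum(1 for signal, event in history if signal and not event)
    p_sig_event = (sig_event + smoothing) / (n_event + 2 * smoothing)
    p_sig_quiet = (sig_quiet + smoothing) / (n_quiet + 2 * smoothing)

    # If today's signal is absent, use the complementary likelihoods.
    if not signal_today:
        p_sig_event = 1.0 - p_sig_event
        p_sig_quiet = 1.0 - p_sig_quiet

    # Bayes' rule.
    numerator = p_sig_event * prior
    return numerator / (numerator + p_sig_quiet * (1.0 - prior))


# Made-up history: 100 days, 10 of them with an event.
history = (
    [(True, True)] * 8
    + [(False, True)] * 2
    + [(True, False)] * 10
    + [(False, False)] * 80
)
p_raised = posterior_event_probability(history, signal_today=True)
p_absent = posterior_event_probability(history, signal_today=False)
```

With this made-up history, a raised signal moves the posterior above the estimated base rate and an absent signal moves it below. The output is a probability rather than a hard yes/no prediction, which is one simple way of carrying the uncertainty the abstract emphasises.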