LG May 20, 2022
A Hybrid Model for Forecasting Short-Term Electricity Demand
Maria Eleni Athanasopoulou, Justina Deveikyte, Alan Mosca et al.
The UK electricity market is currently guided by load (demand) forecasts published every thirty minutes by the regulator. A key factor in predicting demand is weather conditions, for which forecasts are published every hour. We present HYENA: a hybrid predictive model that combines feature engineering (selection of the candidate predictor features), mobile-window predictors and finally LSTM encoder-decoders to achieve higher accuracy than mainstream models from the literature. HYENA decreased MAPE loss by 16% and RMSE loss by 10% over the best available benchmark model, thus establishing a new state of the art for UK electric load (and price) forecasting.
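The two error measures quoted in this abstract, MAPE and RMSE, can be illustrated with a minimal sketch. It uses the standard textbook definitions, not code from the paper; the load values, the `benchmark` and `candidate` forecasts, and the helper `relative_reduction` are made up for illustration and do not reproduce the reported 16% and 10% figures.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error (hypothetical helper)."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Made-up half-hourly load values and two made-up forecasts (illustrative only).
actual = [100.0, 200.0, 400.0]
benchmark = [110.0, 180.0, 440.0]
candidate = [105.0, 190.0, 420.0]

mape_gain = relative_reduction(mape(actual, benchmark), mape(actual, candidate))
rmse_gain = relative_reduction(rmse(actual, benchmark), rmse(actual, candidate))
print(f"MAPE reduced by {mape_gain:.1f}%, RMSE reduced by {rmse_gain:.1f}%")
```

A statement such as "decreased MAPE loss by 16%" is read here as a relative reduction against the benchmark's error, which is what `relative_reduction` computes; that reading is an assumption.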
LG May 20, 2022
Predicting Seriousness of Injury in a Traffic Accident: A New Imbalanced Dataset and Benchmark
Paschalis Lagias, George D. Magoulas, Ylli Prifti et al.
The paper introduces a new dataset to assess the performance of machine learning algorithms in the prediction of the seriousness of injury in a traffic accident. The dataset is created by aggregating publicly available datasets from the UK Department for Transport, which are drastically imbalanced, with missing attributes sometimes approaching 50% of the overall data dimensionality. The paper presents the data analysis pipeline starting from the publicly available data of road traffic accidents and ending with predictors of possible injuries and their degree of severity. It addresses the huge incompleteness of public data with a MissForest model. The paper also introduces two baseline approaches to create injury predictors: a supervised artificial neural network and a reinforcement learning model. The dataset can potentially stimulate diverse aspects of machine learning research on imbalanced datasets and the two approaches can be used as baseline references when researchers test more advanced learning algorithms in this area.
DB Feb 18, 2025
Dr Web: a modern, query-based web data retrieval engine
Ylli Prifti, Alessandro Provetti, Pasquale de Meo
This article introduces the Data Retrieval Web Engine (also referred to as doctor web), a flexible and modular tool for extracting structured data from web pages using a simple query language. We discuss the engineering challenges addressed during its development, such as dynamic content handling and messy data extraction. Furthermore, we cover the steps for making the DR Web Engine public, highlighting its open source potential.
ST Dec 10, 2020
A Sentiment Analysis Approach to the Prediction of Market Volatility
Justina Deveikyte, Helyette Geman, Carlo Piccari et al.
Prediction and quantification of future volatility and returns play an important role in financial modelling, both in portfolio optimization and risk management. Natural language processing now makes it possible to process news and social media comments to detect signals of investors' confidence. We have explored the relationship between sentiment extracted from financial news and tweets and FTSE100 movements. We investigated the strength of the correlation between sentiment measures on a given day and market volatility and returns observed the next day. The findings suggest that there is evidence of correlation between sentiment and stock market movements: the sentiment captured from news headlines could be used as a signal to predict market returns; the same does not apply to volatility. Also, in a surprising finding, for the sentiment found in Twitter comments we obtained a correlation coefficient of -0.7 and a p-value below 0.05, which indicates a strong negative correlation between positive sentiment captured from the tweets on a given day and the volatility observed the next day. We developed an accurate classifier for the prediction of market volatility in response to the arrival of new information by deploying topic modelling, based on Latent Dirichlet Allocation, to extract feature vectors from a collection of tweets and financial news. The obtained features were used as additional input to the classifier. Thanks to the combination of sentiment and topic modelling, our classifier achieved a directional prediction accuracy for volatility of 63%.
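The day-ahead relationship described in this abstract (sentiment on one day against volatility on the next) can be sketched as a lagged Pearson correlation. This is a minimal illustration assuming plain Pearson correlation on two aligned daily series; the function names and the numbers are hypothetical, no p-value is computed, and nothing here reproduces the paper's data or its -0.7 result.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def lagged_correlation(sentiment, volatility, lag=1):
    """Correlate sentiment on day t with volatility on day t + lag."""
    usable = len(sentiment) - lag
    return pearson(sentiment[:usable], volatility[lag:])

# Made-up daily series (illustrative only): higher sentiment today,
# lower volatility tomorrow.
sentiment = [1.0, 2.0, 3.0, 4.0, 5.0]
volatility = [0.0, 4.0, 3.0, 2.0, 1.0]

r = lagged_correlation(sentiment, volatility, lag=1)
print(f"lag-1 correlation: {r:.2f}")
```

Shifting one series by `lag` before correlating is what makes the measure predictive rather than contemporaneous: each sentiment value is paired with the volatility observed one day later.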
CY May 22, 2020
COVID-19 Contact Tracing: Eight Privacy Questions Explored
Hugh Lawson-Tancred, Henry C. W. Price, Alessandro Provetti
We respond to a recent short paper by de Montjoye et al. on privacy issues with COVID-19 tracking. Their paper, which we discuss here, is structured around three "toy protocols" for the design of an app which can maximise the utility of contact tracing information while minimising the more general risk to privacy. On this basis, the paper proceeds to introduce eight questions against which the protocols should be assessed. The questions raised and the protocols proposed effectively amount to the creation of a game with different categories of players able to make different moves. It is therefore possible to analyse the model in terms of optimal game design.
AI May 27, 2015
Qsmodels: ASP Planning in Interactive Gaming Environment
Luca Padovani, Alessandro Provetti
Qsmodels is a novel application of Answer Set Programming to interactive gaming environments. We describe a software architecture by which the behavior of a bot acting inside Quake 3 Arena can be controlled by a planner. The planner is written as an Answer Set Program and is interpreted by the Smodels solver.
AI Apr 9, 2015
RDF annotation of Second Life objects: Knowledge Representation meets Social Virtual reality
Carlo Bernava, Giacomo Fiumara, Dario Maggiorini et al.
We have designed and implemented an application running inside Second Life that supports user annotation of graphical objects and graphical visualization of concept ontologies, thus providing a formal, machine-accessible description of objects. As a result, we offer a platform that combines the graphical knowledge representation that is expected from a MUVE artifact with the semantic structure given by the Resource Description Framework (RDF) representation of information.
AI Feb 21, 2014
Characterizing and computing stable models of logic programs: The non-stratified case
Gianpaolo Brignoli, Stefania Costantini, Ottavio D'Antona et al.
Stable Logic Programming (SLP) is an emergent, alternative style of logic programming: each solution to a problem is represented by a stable model of a deductive database/function-free logic program encoding the problem itself. Several implementations now exist for stable logic programming, and their performance is rapidly improving. To make SLP generally applicable, it should be possible to check for consistency (i.e., existence of stable models) of the input program before attempting to answer queries. In the literature, only rather strong sufficient conditions have been proposed for consistency, e.g., stratification. This paper extends these results in several directions. First, the syntactic features of programs, viz. cyclic negative dependencies, affecting the existence of stable models are characterized, and their relevance is discussed. Next, a new graph representation of logic programs, the Extended Dependency Graph (EDG), is introduced, which conveys enough information for reasoning about stable models (while the traditional Dependency Graph does not). Finally, we show that the problem of the existence of stable models can be reformulated in terms of coloring of the EDG.
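The consistency question raised in this abstract (does a program have any stable model?) can be made concrete with a brute-force check based on the standard Gelfond-Lifschitz reduct. This is a minimal sketch for small propositional programs, not the paper's EDG colouring method; the rule encoding `(head, positive_body, negative_body)` and all function names are assumptions chosen for illustration.

```python
from itertools import combinations

def least_model(definite_rules):
    """Least model of a negation-free program, computed by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, positive_body in definite_rules:
            if head not in model and all(atom in model for atom in positive_body):
                model.add(head)
                changed = True
    return model

def is_stable(rules, candidate):
    """True if candidate equals the least model of the reduct of rules w.r.t. candidate."""
    # Reduct: drop every rule with a negated atom in candidate,
    # then erase the negative literals from the remaining rules.
    reduct = [(head, pos) for head, pos, neg in rules if not (set(neg) & candidate)]
    return least_model(reduct) == candidate

def stable_models(rules):
    """Enumerate all stable models by testing every subset of atoms (exponential)."""
    atoms = set()
    for head, pos, neg in rules:
        atoms.add(head)
        atoms.update(pos)
        atoms.update(neg)
    ordered = sorted(atoms)
    found = []
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            if is_stable(rules, set(subset)):
                found.append(set(subset))
    return found

# Even negative cycle: a :- not b.  b :- not a.
even_cycle = [("a", [], ["b"]), ("b", [], ["a"])]
# Odd negative cycle: p :- not p.
odd_cycle = [("p", [], ["p"])]

print(stable_models(even_cycle))
print(stable_models(odd_cycle))
```

The two example programs differ only in the shape of their cyclic negative dependency: the even cycle has two stable models, while the one-rule odd cycle has none and is therefore inconsistent. Enumerating all subsets is exponential in the number of atoms, which is why syntactic or graph-based consistency checks of the kind the abstract describes are of interest.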