CL Oct 24, 2022
A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard
J. Ignacio Deza, Hisham Ihshaish, Lamine Mahdjoubi
We introduce the first automated models for classifying the natural language descriptions provided in cost documents called "Bills of Quantities" (BoQs), popular in the infrastructure construction industry, into the International Construction Measurement Standard (ICMS). The models we deployed and systematically evaluated for multi-class text classification are learnt from a dataset of more than 50 thousand item descriptions retrieved from 24 large infrastructure construction projects across the United Kingdom. We describe our approach to language representation and subsequent modelling to examine the strength of contextual semantics and temporal dependency of the language used in construction project documentation. To do so, we evaluate two experimental pipelines for inferring ICMS codes from text, based on two different language representation models and a range of state-of-the-art sequence-based classification methods, including recurrent and convolutional neural network architectures. The findings indicate that a highly effective and accurate ICMS automation model is within reach, with reported F1 scores above 90% on average across 32 ICMS categories. Furthermore, because the language used in BoQ text is short, largely descriptive and technical, we find that simpler models compare favourably, achieving higher accuracy. Our analysis suggests that information is more likely embedded in local key features of the descriptive text, which explains why a simpler generic temporal convolutional network (TCN) exhibits comparable memory to recurrent architectures with the same capacity, and subsequently outperforms these at this task.
LG Nov 18, 2022
Estimating defection in subscription-type markets: empirical analysis from the scholarly publishing industry
Michael Roberts, J. Ignacio Deza, Hisham Ihshaish et al.
We present the first empirical study on customer churn prediction in the scholarly publishing industry. The study examines our proposed prediction method on customer subscription data covering a period of 6.5 years, provided by a major academic publisher. We explore the subscription-type market within the context of customer defection and modelling, and analyse the business model of such markets and how it characterises the academic publishing business. The proposed method infers a customer's likelihood of defection from their re-sampled use of provider resources (in this context, the volume and frequency of content downloads). We show that this approach can be both accurate and uniquely useful in the business-to-business context, with which the scholarly publishing business model shares similarities. The main findings of this work suggest that whilst all predictive models examined, especially ensemble methods of machine learning, achieve substantially accurate prediction of churn nearly a year ahead, this can furthermore be achieved even when the specific behavioural attributes that can be associated with each customer's probability of churn are overlooked, allowing highly accurate inference of churn from minimal data. We show that modelling churn on the basis of re-sampling customers' use of resources over subscription time is a better (simplified) approach than considering the high granularity that can often characterise consumption behaviour.
LG Aug 31, 2022
Integrating wind variability to modelling wind-ramp events using a non-binary ramp function and deep learning models
Russell Sharp, Hisham Ihshaish, J. Ignacio Deza
The forecasting of large ramps in wind power output, known as ramp events, is crucial for the incorporation of large volumes of wind energy into national electricity grids. Large variations in wind power supply must be compensated by ancillary energy sources, which can include the use of fossil fuels. Improved prediction of wind power will help to reduce dependency on supplemental energy sources, along with their associated costs and emissions. In this paper, we discuss limitations of current predictive practices and explore the use of Machine Learning methods to enhance wind ramp event classification and prediction. We additionally outline a design for a novel approach to wind ramp prediction, in which high-resolution wind fields are incorporated into the modelling of wind power.
CL Dec 21, 2021
Task-oriented Dialogue Systems: performance vs. quality-optima, a review
Ryan Fellows, Hisham Ihshaish, Steve Battle et al.
Task-oriented dialogue systems (TODS) are continuing to rise in popularity as various industries find ways to effectively harness their capabilities, saving both time and money. However, even state-of-the-art TODS are not yet reaching their full potential. TODS typically have a primary design focus on completing the task at hand, so the metric of task-resolution should take priority. Other conversational quality attributes that may point to the success, or otherwise, of the dialogue, may be ignored. This can cause interactions between human and dialogue system that leave the user dissatisfied or frustrated. This paper explores the literature on evaluative frameworks of dialogue systems and the role of conversational quality attributes in dialogue systems, looking at if, how, and where they are utilised, and examining their correlation with the performance of the dialogue system.