Guido Sciavicco

LG
h-index24
4papers
31citations
Novelty44%
AI Score30

4 Papers

CLMar 12, 2022
A combined approach to the analysis of speech conversations in a contact center domain

Andrea Brunello, Enrico Marzano, Angelo Montanari et al.

The ever more accurate search for deep analysis in customer data is a really strong technological trend nowadays, quite appealing to both private and public companies. This is particularly true in the contact center domain, where speech analytics is an extremely powerful methodology for gaining insights from unstructured data, coming from customer and human agent conversations. In this work, we describe an experimentation with a speech analytics process for an Italian contact center, that deals with call recordings extracted from inbound or outbound flows. First, we illustrate in detail the development of an in-house speech-to-text solution, based on Kaldi framework, and evaluate its performance (and compare it to Google Cloud Speech API). Then, we evaluate and compare different approaches to the semantic tagging of call transcripts, ranging from classic regular expressions to machine learning models based on ngrams and logistic regression, and propose a combination of them, which is shown to provide a consistent benefit. Finally, a decision tree inducer, called J48S, is applied to the problem of tagging. Such an algorithm is natively capable of exploiting sequential data, such as texts, for classification purposes. The solution is compared with the other approaches and is shown to provide competitive classification performances, while generating highly interpretable models and reducing the complexity of the data preparation phase. The potential operational impact of the whole process is thoroughly examined.

LGMay 26, 2023Code
Knowledge Extraction with Interval Temporal Logic Decision Trees

Guido Sciavicco, Stan Ionel Eduard

Multivariate temporal, or time, series classification is, in a way, the temporal generalization of (numeric) classification, as every instance is described by multiple time series instead of multiple values. Symbolic classification is the machine learning strategy to extract explicit knowledge from a data set, and the problem of symbolic classification of multivariate temporal series requires the design, implementation, and test of ad-hoc machine learning algorithms, such as, for example, algorithms for the extraction of temporal versions of decision trees. One of the most well-known algorithms for decision tree extraction from categorical data is Quinlan's ID3, which was later extended to deal with numerical attributes, resulting in an algorithm known as C4.5, and implemented in many open-sources data mining libraries, including the so-called Weka, which features an implementation of C4.5 called J48. ID3 was recently generalized to deal with temporal data in form of timelines, which can be seen as discrete (categorical) versions of multivariate time series, and such a generalization, based on the interval temporal logic HS, is known as Temporal ID3. In this paper we introduce Temporal C4.5, that allows the extraction of temporal decision trees from undiscretized multivariate time series, describe its implementation, called Temporal J48, and discuss the outcome of a set of experiments with the latter on a collection of public data sets, comparing the results with those obtained by other, classical, multivariate time series classification methods.

SDMar 21, 2025
Symbolic Audio Classification via Modal Decision Tree Learning

Enrico Marzano, Giovanni Pagliarini, Riccardo Pasini et al.

The range of potential applications of acoustic analysis is wide. Classification of sounds, in particular, is a typical machine learning task that received a lot of attention in recent years. The most common approaches to sound classification are sub-symbolic, typically based on neural networks, and result in black-box models with high performances but very low transparency. In this work, we consider several audio tasks, namely, age and gender recognition, emotion classification, and respiratory disease diagnosis, and we approach them with a symbolic technique, that is, (modal) decision tree learning. We prove that such tasks can be solved using the same symbolic pipeline, that allows to extract simple rules with very high accuracy and low complexity. In principle, all such tasks could be associated to an autonomous conversation system, which could be useful in different contexts, such as an automatic reservation agent for an hospital or a clinic.

LGSep 17, 2021
Decision Tree Learning with Spatial Modal Logics

Giovanni Pagliarini, Guido Sciavicco

Symbolic learning represents the most straightforward approach to interpretable modeling, but its applications have been hampered by a single structural design choice: the adoption of propositional logic as the underlying language. Recently, more-than-propositional symbolic learning methods have started to appear, in particular for time-dependent data. These methods exploit the expressive power of modal temporal logics in powerful learning algorithms, such as temporal decision trees, whose classification capabilities are comparable with the best non-symbolic ones, while producing models with explicit knowledge representation. With the intent of following the same approach in the case of spatial data, in this paper we: i) present a theory of spatial decision tree learning; ii) describe a prototypical implementation of a spatial decision tree learning algorithm based, and strictly extending, the classical C4.5 algorithm; and iii) perform a series of experiments in which we compare the predicting power of spatial decision trees with that of classical propositional decision trees in several versions, for a multi-class image classification problem, on publicly available datasets. Our results are encouraging, showing clear improvements in the performances from the propositional to the spatial models, which in turn show higher levels of interpretability.