CL | Dec 15, 2020 | Code
A Response Retrieval Approach for Dialogue Using a Multi-Attentive Transformer
Matteo A. Senese, Alberto Benincasa, Barbara Caputo et al.
This paper presents our work for the ninth edition of the Dialogue System Technology Challenge (DSTC9). Our solution addresses track four: Simulated Interactive MultiModal Conversations. The task consists of providing an algorithm able to simulate a shopping assistant that supports users with their requests. We address the task of response retrieval, that is, retrieving the most appropriate agent response from a pool of response candidates. Our approach uses a transformer-based neural architecture with a multi-attentive structure that conditions the agent's response on the user's request and on the product the user is referring to. Final experiments on the SIMMC Fashion Dataset show that our approach achieves the second-best scores on all the retrieval metrics defined by the organizers. The source code is available at https://github.com/D2KLab/dstc9-SIMMC.
IR | Oct 11, 2018 | Code
A Distributed and Accountable Approach to Offline Recommender Systems Evaluation
Diego Monti, Giuseppe Rizzo, Maurizio Morisio
Different software tools have been developed with the purpose of performing offline evaluations of recommender systems. However, the results obtained with these tools may not be directly comparable because of subtle differences in the experimental protocols and metrics. Furthermore, it is difficult to analyze several algorithms under the same experimental conditions without disclosing their implementation details. For these reasons, we introduce RecLab, an open source software for evaluating recommender systems in a distributed fashion. By relying on consolidated web protocols, we created RESTful APIs for training and querying recommenders remotely. In this way, algorithms implemented with different technologies can easily be integrated into the same toolkit. In detail, the experimenter can perform an evaluation by simply visiting a web interface provided by RecLab. The framework then interacts with all the selected recommenders and computes and displays a comprehensive set of measures, each representing a different metric. The results of all experiments are permanently stored and publicly available in order to support accountability and comparative analyses.
IR | Oct 11, 2018 | Code
Sequeval: A Framework to Assess and Benchmark Sequence-based Recommender Systems
Diego Monti, Enrico Palumbo, Giuseppe Rizzo et al.
In this paper, we present sequeval, a software tool capable of performing the offline evaluation of a recommender system designed to suggest a sequence of items. A sequence-based recommender is trained considering the sequences already available in the system and its purpose is to generate a personalized sequence starting from an initial seed. This tool automatically evaluates the sequence-based recommender considering a comprehensive set of eight different metrics adapted to the sequential scenario. sequeval has been developed following the best practices of software extensibility. For this reason, it is possible to easily integrate and evaluate novel recommendation techniques. sequeval is publicly available as an open source tool and it aims to become a focal point for the community to assess sequence-based recommender systems.
CL | Apr 18, 2021
Attention-based Clinical Note Summarization
Neel Kanwal, Giuseppe Rizzo
In recent years, the deployment of digital systems across numerous industries has grown rapidly. The health sector has seen extensive adoption of digital systems and services that generate a significant volume of medical records. Electronic health records contain valuable information for prospective and retrospective analysis that is often not fully exploited because the information is stored in a dense and complicated form. The basic purpose of condensing health records is to select the information that retains most of the characteristics of the original documents with respect to a reported disease. These summaries may aid diagnosis and save a doctor's time during periods of heavy workload such as the COVID-19 pandemic. In this paper, we apply a multi-head attention-based mechanism to perform extractive summarization of meaningful phrases in clinical notes. Our method finds the major sentences for a summary by correlating the token, segment, and positional embeddings of sentences in a clinical note. The model outputs attention scores that are statistically transformed to extract critical phrases for visualization in a heat-mapping tool and for human use.
IR | Sep 30, 2020
Understanding Twitter Engagement with a Click-Through Rate-based Method
Andrea Fiandro, Jeanpierre Francois, Isabeau Oliveri et al.
This paper presents the POLINKS solution to the RecSys Challenge 2020, which ranked 6th on the final leaderboard. We analyze the performance of our solution, which uses the click-through rate value to address the challenge task, compare it with a gradient boosting model, and report the quality indicators used for computing the final leaderboard.
IR | Feb 8, 2020
Predict your Click-out: Modeling User-Item Interactions and Session Actions in an Ensemble Learning Fashion
Andrea Fiandro, Giorgio Crepaldi, Diego Monti et al.
This paper describes the solution of the POLINKS team to the RecSys Challenge 2019, which focuses on the task of predicting the last click-out in a session-based interaction. We propose an ensemble approach comprising a matrix factorization model for user-item interactions and a session-aware learning model implemented with a recurrent neural network. This method appears to be effective in predicting the last click-out, achieving a Mean Reciprocal Rank of 0.60277 on the local test set.
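For readers unfamiliar with the metric, the following is a minimal sketch of how Mean Reciprocal Rank is commonly computed. It is not taken from the POLINKS code; the function names and the toy sessions are illustrative assumptions, and items missing from the ranking are assumed to contribute a score of zero.

```python
from typing import Hashable, Sequence, Tuple


def reciprocal_rank(ranked_items: Sequence[Hashable], target: Hashable) -> float:
    """Return 1 / (1-based position of target), or 0.0 if target is absent."""
    for position, item in enumerate(ranked_items, start=1):
        if item == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    sessions: Sequence[Tuple[Sequence[Hashable], Hashable]]
) -> float:
    """Average the reciprocal rank over (ranking, clicked item) pairs."""
    if not sessions:
        return 0.0
    total = sum(reciprocal_rank(ranking, clicked) for ranking, clicked in sessions)
    return total / len(sessions)


# Hypothetical sessions: each pairs a predicted ranking with the item clicked last.
toy_sessions = [
    (["a", "b", "c"], "a"),  # clicked item ranked first  -> 1.0
    (["a", "b", "c"], "b"),  # clicked item ranked second -> 0.5
    (["a", "b", "c"], "z"),  # clicked item not ranked    -> 0.0
]
print(mean_reciprocal_rank(toy_sessions))
```

A higher value means the clicked item tends to appear nearer the top of the predicted ranking.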
IR | Sep 2, 2019
All You Need is Ratings: A Clustering Approach to Synthetic Rating Datasets Generation
Diego Monti, Giuseppe Rizzo, Maurizio Morisio
The public availability of collections containing user preferences is of vital importance for performing offline evaluations in the field of recommender systems. However, the number of rating datasets is limited because of the costs required for their creation and the fear of violating the privacy of the users by sharing them. For this reason, numerous research efforts have investigated the creation of synthetic collections of ratings using generative approaches. Nevertheless, these datasets are usually not reliable enough for conducting an evaluation campaign. In this paper, we propose a method for creating synthetic datasets with a configurable number of users that mimic the characteristics of already existing ones. We empirically validated the proposed approach by using the synthetic datasets to evaluate different recommenders and by comparing the results with those obtained using real datasets.
SE | Nov 30, 2018
Completeness and Consistency Analysis for Evolving Knowledge Bases
Mohammad Rifat Ahmmad Rashid, Giuseppe Rizzo, Marco Torchiano et al.
Assessing the quality of an evolving knowledge base is a challenging task, as it often requires identifying appropriate quality assessment procedures. Since data is often derived from autonomous and increasingly large data sources, it is impractical to manually curate the data, and challenging to continuously and automatically assess its quality. In this paper, we explore two main areas of quality assessment related to evolving knowledge bases: (i) identification of completeness issues using knowledge base evolution analysis, and (ii) identification of consistency issues based on integrity constraints, such as minimum and maximum cardinality, and range constraints. For completeness analysis, we use data profiling information from consecutive knowledge base releases to estimate completeness measures that allow predicting quality issues. Then, we perform consistency checks to validate the results of the completeness analysis using integrity constraints and learning models. The approach has been tested both quantitatively and qualitatively using a subset of datasets from the DBpedia and 3cixty knowledge bases. The performance of the approach is evaluated using precision, recall, and F1 score. From the completeness analysis, we observe a 94% precision for the English DBpedia KB and a 95% precision for the 3cixty Nice KB. We also assessed the performance of our consistency analysis by using five learning models over three sub-tasks, namely minimum cardinality, maximum cardinality, and range constraints. We observed that the best performing model in our experimental setup is the Random Forest, reaching an F1 score greater than 90% for minimum and maximum cardinality and 84% for range constraints.
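As a reminder of how the reported measures relate to each other, here is a minimal sketch of precision, recall, and F1 computed over sets of flagged issues. It uses the standard definitions and is not the authors' evaluation code; the function name and the entity identifiers are hypothetical.

```python
from typing import Hashable, Set, Tuple


def precision_recall_f1(
    predicted: Set[Hashable], actual: Set[Hashable]
) -> Tuple[float, float, float]:
    """Compare predicted issues against ground-truth issues."""
    true_positives = len(predicted & actual)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four entities flagged as incomplete, five truly incomplete.
flagged = {"e1", "e2", "e3", "e4"}
truly_incomplete = {"e2", "e3", "e4", "e5", "e6"}
precision, recall, f1 = precision_recall_f1(flagged, truly_incomplete)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.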
CL | Nov 13, 2018
A Multi-layer LSTM-based Approach for Robot Command Interaction Modeling
Martino Mensio, Emanuele Bastianelli, Ilaria Tiddi et al.
As the first robotic platforms slowly approach our everyday life, we can imagine a near future where service robots will be easily accessible to non-expert users through vocal interfaces. The capability of managing natural language would indeed speed up the process of integrating such platforms into ordinary life. Semantic parsing is a fundamental task of the Natural Language Understanding process, as it allows extracting the meaning of a user utterance to be used by a machine. In this paper, we present a preliminary study to semantically parse user vocal commands for a House Service robot, using a multi-layer Long Short-Term Memory neural network with an attention mechanism. The system is trained on the Human Robot Interaction Corpus, and it is preliminarily compared with previous approaches.
CL | Oct 27, 2014
Analysis of Named Entity Recognition and Linking for Tweets
Leon Derczynski, Diana Maynard, Giuseppe Rizzo et al.
Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.