Florent Masseglia

3 papers

19 citations

Novelty 43%

AI Score 22

Ranked #185,483 of 201,326 authors (top 92%); #1,892 in IR (top 85%)

3 Papers

LG | Apr 2, 2023

SoftED: Metrics for Soft Evaluation of Time Series Event Detection

Rebecca Salles, Janio Lima, Michel Reis et al.

Time series event detection methods are evaluated mainly by standard classification metrics that focus solely on detection accuracy. However, inaccuracy in detecting an event can often result from its preceding or delayed effects reflected in neighboring detections. These detections are valuable to trigger necessary actions or help mitigate unwelcome consequences. In this context, current metrics are insufficient and inadequate for event detection. There is a demand for metrics that incorporate both the concept of time and temporal tolerance for neighboring detections. This paper introduces SoftED metrics, a new set of metrics designed for the soft evaluation of event detection methods. They enable the evaluation of both detection accuracy and the degree to which detections represent events. They improved event detection evaluation by associating events and their representative detections, incorporating temporal tolerance in over 36% of experiments compared to the usual classification metrics. SoftED metrics were validated by domain specialists who indicated their contribution to detection evaluation and method selection.

IR | Jun 28, 2020

Feedback Clustering for Online Travel Agencies Searches: a Case Study

Sara Scaramuccia, Simon Nanty, Florent Masseglia

Understanding the choices made by online customers is a growing need in the travel industry. In many practical situations, the only available information is the flight search query performed by the customer, with no additional profile knowledge. In general, customer flight bookings are driven by price, duration, number of connections, and so on. However, not all customers assign the same importance to each of those criteria. Hence the need to group together all flight searches performed by the same kind of customer, that is, customers having the same booking criteria. The effectiveness of a set of recommendations for a single cluster can be measured in terms of the number of bookings historically performed. This effectiveness measure plays the role of feedback, that is, external knowledge which can be recombined to iteratively obtain a final segmentation. In this paper, we describe our Online Travel Agencies (OTA) flight search use case and highlight its specific features. We address the flight search segmentation problem motivated above by proposing a novel algorithm called Split-or-Merge (S/M). This algorithm is a variation of the Split-Merge-Evolve (SME) method. The SME method has already been introduced in the community as an iterative process updating a clustering given by the K-means algorithm by splitting and merging clusters subject to feedback independent evaluations. To the best of our knowledge, no previous application of the SME method to real-world data is reported in the literature. Here, we provide experimental evaluations of the SME and S/M methods on real-world data. The impact on our domain-specific metrics obtained under the SME and S/M methods suggests that feedback clustering techniques can be very promising for handling OTA flight searches.

IR | Jul 16, 2018

A Distributed Collaborative Filtering Algorithm Using Multiple Data Sources

Mohamed Reda Bouadjenek, Esther Pacitti, Maximilien Servajean et al.

Collaborative Filtering (CF) is one of the most commonly used recommendation methods. CF consists of predicting whether, or how much, a user will like (or dislike) an item by leveraging knowledge of the user's preferences as well as those of other users. In practice, users interact with and express their opinion on only a small subset of items, which makes the corresponding user-item rating matrix very sparse. Such data sparsity yields two main problems for recommender systems: (1) the lack of data to effectively model users' preferences, and (2) the lack of data to effectively model item characteristics. However, many other data sources are often available to a recommender system provider, which can describe user interests and item characteristics (e.g., users' social network, tags associated with items, etc.). These valuable data sources may supply useful information to enhance a recommendation system in modeling users' preferences and item characteristics more accurately and thus, hopefully, to make recommenders more precise. For various reasons, these data sources may be managed by clusters of different data centers, thus requiring the development of distributed solutions. In this paper, we propose a new distributed collaborative filtering algorithm, which exploits and combines multiple and diverse data sources to improve recommendation quality. Our experimental evaluation using real datasets shows the effectiveness of our algorithm compared to state-of-the-art recommendation algorithms.