Tommaso Teofili

5papers

25citations

Novelty32%

AI Score38

Ranked #111,942 of 201,326 authors (top 56%)#282 in DB (top 54%)

5 Papers

DBMar 24, 2022

Effective Explanations for Entity Resolution Models

Tommaso Teofili, Donatella Firmani, Nick Koudas et al.

Entity resolution (ER) aims at matching records that refer to the same real-world entity. Although widely studied for the last 50 years, ER still represents a challenging data management problem, and several recent works have started to investigate the opportunity of applying deep learning (DL) techniques to solve this problem. In this paper, we study the fundamental problem of explainability of the DL solution for ER. Understanding the matching predictions of an ER solution is indeed crucial to assess the trustworthiness of the DL model and to discover its biases. We treat the DL model as a black box classifier and - while previous approaches to provide explanations for DL predictions are agnostic to the classification task. we propose the CERTA approach that is aware of the semantics of the ER problem. Our approach produces both saliency explanations, which associate each attribute with a saliency score, and counterfactual explanations, which provide examples of values that can flip the prediction. CERTA builds on a probabilistic framework that aims at computing the explanations evaluating the outcomes produced by using perturbed copies of the input records. We experimentally evaluate CERTA's explanations of state-of-the-art ER solutions based on DL models using publicly available datasets, and demonstrate the effectiveness of CERTA over recently proposed methods for this problem.

DBMay 31

Can we trust LLM Self-Explanations for Entity Resolution?

Tommaso Teofili, Donatella Firmani, Nick Koudas et al.

Large Language Models (LLMs) have recently shown strong performance on Entity Resolution (ER). Additionally, akin to their prowess in providing accurate predictions, these models often generate self-explanations alongside their predictions through prompting. While such self-explanations are appealing due to their negligible computational cost, their actual reliability remains largely unexplored. In this paper, we present the first large-scale systematic evaluation of LLM self-explanations for ER, focusing on feature attribution and counterfactual explanations at both the attribute and token levels. Across three LLMs, ten datasets, and multiple prompting strategies, we show that self-explanations are often unstable, weakly faithful, and poorly aligned with counterfactual evidence, revealing a substantial gap between plausibility and causal relevance. We further demonstrate that established post-hoc explanation methods provide significantly higher trustworthiness, but at a prohibitive computational cost when applied to LLMs. To bridge this gap, we introduce \uncerta{}, a hybrid explanation framework that leverages self-explanations as priors to guide post-hoc exploration. \uncerta{} achieves explanation quality comparable to post-hoc methods while reducing cost by up to an order of magnitude.

IROct 22, 2019Code

Lucene for Approximate Nearest-Neighbors Search on Arbitrary Dense Vectors

Tommaso Teofili, Jimmy Lin

We demonstrate three approaches for adapting the open-source Lucene search library to perform approximate nearest-neighbor search on arbitrary dense vectors, using similarity search on word embeddings as a case study. At its core, Lucene is built around inverted indexes of a document collection's (sparse) term-document matrix, which is incompatible with the lower-dimensional dense vectors that are common in deep learning applications. We evaluate three techniques to overcome these challenges that can all be natively integrated into Lucene: the creation of documents populated with fake words, LSH applied to lexical realizations of dense vectors, and k-d trees coupled with dimensionality reduction. Experiments show that the "fake words" approach represents the best balance between effectiveness and efficiency. These techniques are integrated into the Anserini open-source toolkit and made available to the community.

AIApr 26, 2021

TrustyAI Explainability Toolkit

Rob Geada, Tommaso Teofili, Rui Vieira et al.

Artificial intelligence (AI) is becoming increasingly more popular and can be found in workplaces and homes around the world. The decisions made by such "black box" systems are often opaque; that is, so complex as to be functionally impossible to understand. How do we ensure that these systems are behaving as desired? TrustyAI is an initiative which looks into explainable artificial intelligence (XAI) solutions to address this issue of explainability in the context of both AI models and decision services. This paper presents the TrustyAI Explainability Toolkit, a Java and Python library that provides XAI explanations of decision services and predictive models for both enterprise and data science use-cases. We describe the TrustyAI implementations and extensions to techniques such as LIME, SHAP and counterfactuals, which are benchmarked against existing implementations in a variety of experiments.

IRSep 4, 2019

Affect Enriched Word Embeddings for News Information Retrieval

Tommaso Teofili, Niyati Chhaya

Distributed representations of words have shown to be useful to improve the effectiveness of IR systems in many sub-tasks like query expansion, retrieval and ranking. Algorithms like word2vec, GloVe and others are also key factors in many improvements in different NLP tasks. One common issue with such embedding models is that words like happy and sad appear in similar contexts and hence are wrongly clustered close in the embedding space. In this paper we leverage Aff2Vec, a set of word embeddings models which include affect information, in order to better capture the affect aspect in news text to achieve better results in information retrieval tasks, also such embeddings are less hit by the synonym/antonym issue. We evaluate their effectiveness on two IR related tasks (query expansion and ranking) over the New York Times dataset (TREC-core '17) comparing them against other word embeddings based models and classic ranking models.