Dimitris Sacharidis

LG
h-index22
9papers
31citations
Novelty46%
AI Score43

9 Papers

LGJun 26, 2023
Fairness Aware Counterfactuals for Subgroups

Loukas Kavouras, Konstantinos Tsopelas, Giorgos Giannopoulos et al.

In this work, we present Fairness Aware Counterfactuals for Subgroups (FACTS), a framework for auditing subgroup fairness through counterfactual explanations. We start with revisiting (and generalizing) existing notions and introducing new, more refined notions of subgroup fairness. We aim to (a) formulate different aspects of the difficulty of individuals in certain subgroups to achieve recourse, i.e. receive the desired outcome, either at the micro level, considering members of the subgroup individually, or at the macro level, considering the subgroup as a whole, and (b) introduce notions of subgroup fairness that are robust, if not totally oblivious, to the cost of achieving recourse. We accompany these notions with an efficient, model-agnostic, highly parameterizable, and explainable framework for evaluating subgroup fairness. We demonstrate the advantages, the wide applicability, and the efficiency of our approach through a thorough experimental evaluation of different benchmark datasets.

LGFeb 23, 2023
Auditing for Spatial Fairness

Dimitris Sacharidis, Giorgos Giannopoulos, George Papastefanatos et al.

This paper studies algorithmic fairness when the protected attribute is location. To handle protected attributes that are continuous, such as age or income, the standard approach is to discretize the domain into predefined groups, and compare algorithmic outcomes across groups. However, applying this idea to location raises concerns of gerrymandering and may introduce statistical bias. Prior work addresses these concerns but only for regularly spaced locations, while raising other issues, most notably its inability to discern regions that are likely to exhibit spatial unfairness. Similar to established notions of algorithmic fairness, we define spatial fairness as the statistical independence of outcomes from location. This translates into requiring that for each region of space, the distribution of outcomes is identical inside and outside the region. To allow for localized discrepancies in the distribution of outcomes, we compare how well two competing hypotheses explain the observed outcomes. The null hypothesis assumes spatial fairness, while the alternate allows different distributions inside and outside regions. Their goodness of fit is then assessed by a likelihood ratio test. If there is no significant difference in how well the two hypotheses explain the observed outcomes, we conclude that the algorithm is spatially fair.

LGNov 17, 2025
GLANCE: Global Actions in a Nutshell for Counterfactual Explainability

Loukas Kavouras, Eleni Psaroudaki, Konstantinos Tsopelas et al.

The widespread deployment of machine learning systems in critical real-world decision-making applications has highlighted the urgent need for counterfactual explainability methods that operate effectively. Global counterfactual explanations, expressed as actions to offer recourse, aim to provide succinct explanations and insights applicable to large population subgroups. High effectiveness, measured by the fraction of the population that is provided recourse, ensures that the actions benefit as many individuals as possible. Keeping the cost of actions low ensures the proposed recourse actions remain practical and actionable. Limiting the number of actions that provide global counterfactuals is essential to maximizing interpretability. The primary challenge, therefore, is to balance these trade-offs--maximizing effectiveness, minimizing cost, while maintaining a small number of actions. We introduce $\texttt{GLANCE}$, a versatile and adaptive algorithm that employs a novel agglomerative approach, jointly considering both the feature space and the space of counterfactual actions, thereby accounting for the distribution of points in a way that aligns with the model's structure. This design enables the careful balancing of the trade-offs among the three key objectives, with the size objective functioning as a tunable parameter to keep the actions few and easy to interpret. Our extensive experimental evaluation demonstrates that $\texttt{GLANCE}$ consistently shows greater robustness and performance compared to existing methods across various datasets and models.

LGJul 30, 2024
The Susceptibility of Example-Based Explainability Methods to Class Outliers

Ikhtiyor Nematov, Dimitris Sacharidis, Tomer Sagi et al.

This study explores the impact of class outliers on the effectiveness of example-based explainability methods for black-box machine learning models. We reformulate existing explainability evaluation metrics, such as correctness and relevance, specifically for example-based methods, and introduce a new metric, distinguishability. Using these metrics, we highlight the shortcomings of current example-based explainability methods, including those who attempt to suppress class outliers. We conduct experiments on two datasets, a text classification dataset and an image classification dataset, and evaluate the performance of four state-of-the-art explainability methods. Our findings underscore the need for robust techniques to tackle the challenges posed by class outliers.

LGJul 22, 2024
AIDE: Antithetical, Intent-based, and Diverse Example-Based Explanations

Ikhtiyor Nematov, Dimitris Sacharidis, Tomer Sagi et al.

For many use-cases, it is often important to explain the prediction of a black-box model by identifying the most influential training data samples. Existing approaches lack customization for user intent and often provide a homogeneous set of explanation samples, failing to reveal the model's reasoning from different angles. In this paper, we propose AIDE, an approach for providing antithetical (i.e., contrastive), intent-based, diverse explanations for opaque and complex models. AIDE distinguishes three types of explainability intents: interpreting a correct, investigating a wrong, and clarifying an ambiguous prediction. For each intent, AIDE selects an appropriate set of influential training samples that support or oppose the prediction either directly or by contrast. To provide a succinct summary, AIDE uses diversity-aware sampling to avoid redundancy and increase coverage of the training data. We demonstrate the effectiveness of AIDE on image and text classification tasks, in three ways: quantitatively, assessing correctness and continuity; qualitatively, comparing anecdotal evidence from AIDE and other example-based approaches; and via a user study, evaluating multiple aspects of AIDE. The results show that AIDE addresses the limitations of existing methods and exhibits desirable traits for an explainability method.

36.4CLMar 28
Text Data Integration

Md Ataur Rahman, Dimitris Sacharidis, Oscar Romero et al.

Data comes in many forms. From a shallow perspective, they can be viewed as being either in structured (e.g., as a relation, as key-value pairs) or unstructured (e.g., text, image) formats. So far, machines have been fairly good at processing and reasoning over structured data that follows a precise schema. However, the heterogeneity of data poses a significant challenge on how well diverse categories of data can be meaningfully stored and processed. Data Integration, a crucial part of the data engineering pipeline, addresses this by combining disparate data sources and providing unified data access to end-users. Until now, most data integration systems have leaned on only combining structured data sources. Nevertheless, unstructured data (a.k.a. free text) also contains a plethora of knowledge waiting to be utilized. Thus, in this chapter, we firstly make the case for the integration of textual data, to later present its challenges, state of the art and open problems.

LGJul 6, 2025
Source Attribution in Retrieval-Augmented Generation

Ikhtiyor Nematov, Tarik Kalai, Elizaveta Kuzmenko et al.

While attribution methods, such as Shapley values, are widely used to explain the importance of features or training data in traditional machine learning, their application to Large Language Models (LLMs), particularly within Retrieval-Augmented Generation (RAG) systems, is nascent and challenging. The primary obstacle is the substantial computational cost, where each utility function evaluation involves an expensive LLM call, resulting in direct monetary and time expenses. This paper investigates the feasibility and effectiveness of adapting Shapley-based attribution to identify influential retrieved documents in RAG. We compare Shapley with more computationally tractable approximations and some existing attribution methods for LLM. Our work aims to: (1) systematically apply established attribution principles to the RAG document-level setting; (2) quantify how well SHAP approximations can mirror exact attributions while minimizing costly LLM interactions; and (3) evaluate their practical explainability in identifying critical documents, especially under complex inter-document relationships such as redundancy, complementarity, and synergy. This study seeks to bridge the gap between powerful attribution techniques and the practical constraints of LLM-based RAG systems, offering insights into achieving reliable and affordable RAG explainability.

LGApr 29, 2024
FALE: Fairness-Aware ALE Plots for Auditing Bias in Subgroups

Giorgos Giannopoulos, Dimitris Sacharidis, Nikolas Theologitis et al.

Fairness is steadily becoming a crucial requirement of Machine Learning (ML) systems. A particularly important notion is subgroup fairness, i.e., fairness in subgroups of individuals that are defined by more than one attributes. Identifying bias in subgroups can become both computationally challenging, as well as problematic with respect to comprehensibility and intuitiveness of the finding to end users. In this work we focus on the latter aspects; we propose an explainability method tailored to identifying potential bias in subgroups and visualizing the findings in a user friendly manner to end users. In particular, we extend the ALE plots explainability method, proposing FALE (Fairness aware Accumulated Local Effects) plots, a method for measuring the change in fairness for an affected population corresponding to different values of a feature (attribute). We envision FALE to function as an efficient, user friendly, comprehensible and reliable first-stage tool for identifying subgroups with potential bias issues.

DLJun 1, 2020
Ranking Papers by their Short-Term Scientific Impact

Ilias Kanellos, Thanasis Vergoulis, Dimitris Sacharidis et al.

The constantly increasing rate at which scientific papers are published makes it difficult for researchers to identify papers that currently impact the research field of their interest. Hence, approaches to effectively identify papers of high impact have attracted great attention in the past. In this work, we present a method that seeks to rank papers based on their estimated short-term impact, as measured by the number of citations received in the near future. Similar to previous work, our method models a researcher as she explores the paper citation network. The key aspect is that we incorporate an attention-based mechanism, akin to a time-restricted version of preferential attachment, to explicitly capture a researcher's preference to read papers which received a lot of attention recently. A detailed experimental evaluation on four real citation datasets across disciplines, shows that our approach is more effective than previous work in ranking papers based on their short-term impact.