Roberto Labadie-Tamayo

CL
h-index: 12
3 papers
5 citations
Novelty: 52%
AI Score: 34

3 Papers

HC Apr 28, 2025
Interactive Discovery and Exploration of Visual Bias in Generative Text-to-Image Models

Johannes Eschner, Roberto Labadie-Tamayo, Matthias Zeppelzauer et al.

Bias in generative Text-to-Image (T2I) models is a known issue, yet systematically analyzing such models' outputs to uncover it remains challenging. We introduce the Visual Bias Explorer (ViBEx) to interactively explore the output space of T2I models to support the discovery of visual bias. ViBEx introduces a novel flexible prompting tree interface in combination with zero-shot bias probing using CLIP for quick and approximate bias exploration. It additionally supports in-depth confirmatory bias analysis through visual inspection of forward, intersectional, and inverse bias queries. ViBEx is model-agnostic and publicly available. In four case study interviews, experts in AI and ethics were able to discover visual biases that have so far not been described in literature.
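
The zero-shot probing step mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard CLIP zero-shot recipe (cosine similarity between an image embedding and each label prompt embedding, followed by a softmax) and is not taken from the ViBEx code. The toy three-dimensional vectors, the label names, the scaling factor of 100, and the function names are all assumptions made for illustration; in practice the embeddings would come from a CLIP image and text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def zero_shot_probs(image_emb, label_embs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and each label prompt."""
    logits = {label: scale * cosine(image_emb, emb) for label, emb in label_embs.items()}
    peak = max(logits.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(x - peak) for label, x in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def attribute_distribution(image_embs, label_embs):
    """Average the per-image label probabilities over a batch of generated images."""
    totals = {label: 0.0 for label in label_embs}
    for emb in image_embs:
        for label, p in zero_shot_probs(emb, label_embs).items():
            totals[label] += p
    return {label: t / len(image_embs) for label, t in totals.items()}

# Toy vectors standing in for real CLIP embeddings (assumption).
label_embs = {"label A": [1.0, 0.0, 0.0], "label B": [0.0, 1.0, 0.0]}
image_embs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.0]]

distribution = attribute_distribution(image_embs, label_embs)
print(distribution)
```

A distribution that is strongly skewed toward one label for a neutral prompt is the kind of quick, approximate signal that would then call for closer visual inspection.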

CL Jul 28, 2025
FHSTP@EXIST 2025 Benchmark: Sexism Detection with Transparent Speech Concept Bottleneck Models

Roberto Labadie-Tamayo, Adrian Jaques Böck, Djordje Slijepčević et al.

Sexism has become widespread on social media and in online conversation. To help address this issue, the fifth Sexism Identification in Social Networks (EXIST) challenge is initiated at CLEF 2025. Among this year's international benchmarks, we concentrate on the first task, which aims to identify and classify sexism in textual social media posts. In this paper, we describe our solutions and report results for three subtasks: Subtask 1.1 - Sexism Identification in Tweets, Subtask 1.2 - Source Intention in Tweets, and Subtask 1.3 - Sexism Categorization in Tweets. We implement three models to address each subtask, which constitute three individual runs: Speech Concept Bottleneck Model (SCBM), Speech Concept Bottleneck Model with Transformer (SCBMT), and a fine-tuned XLM-RoBERTa transformer model. SCBM uses descriptive adjectives as human-interpretable bottleneck concepts. It leverages large language models (LLMs) to encode input texts into a human-interpretable representation of adjectives, which is then used to train a lightweight classifier for downstream tasks. SCBMT extends SCBM by fusing the adjective-based representation with contextual embeddings from transformers to balance interpretability and classification performance. Beyond competitive results, these two models offer fine-grained explanations at both instance (local) and class (global) levels. We also investigate how additional metadata, e.g., annotators' demographic profiles, can be leveraged. For Subtask 1.1, XLM-RoBERTa, fine-tuned on provided data augmented with prior datasets, ranks 6th for English and Spanish and 4th for English in the Soft-Soft evaluation. Our SCBMT achieves 7th for English and Spanish and 6th for Spanish.
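
The concept bottleneck idea described in the abstract (text mapped to adjective scores, then a lightweight classifier on top) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the adjective list, the keyword cues that stand in for the LLM scoring step, the example posts, and all function names are hypothetical, and plain logistic regression is assumed as the lightweight classifier.

```python
import math

# Hypothetical concept set; the adjectives used in the paper are not listed here.
ADJECTIVES = ["hostile", "demeaning", "respectful"]

# Keyword cues that stand in for the LLM so the sketch runs offline (assumption).
CUES = {
    "hostile": ["hate", "stupid"],
    "demeaning": ["belong", "stupid"],
    "respectful": ["thank", "great"],
}

def adjective_scores(text):
    """Bottleneck step: one score per adjective (here 0.0 or 1.0)."""
    lowered = text.lower()
    return [1.0 if any(cue in lowered for cue in CUES[adj]) else 0.0 for adj in ADJECTIVES]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Logistic regression trained with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(text, w, b):
    """Probability of the positive class for a new text."""
    x = adjective_scores(text)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy training data (assumption): 1 = offensive, 0 = not offensive.
posts = [
    "You are stupid and do not belong here",
    "I hate people like you",
    "Thank you for the great talk",
    "Great point, thank you",
]
labels = [1, 1, 0, 0]

X = [adjective_scores(p) for p in posts]
weights, bias = train_logreg(X, labels)

# Global explanation: one learned weight per adjective.
global_explanation = dict(zip(ADJECTIVES, weights))
print(global_explanation)
print(predict_proba("What a stupid thing to say", weights, bias))
```

Because the classifier only sees adjective scores, its weights can be read as a class-level (global) explanation, and the per-adjective products of weight and score for a single post give an instance-level (local) one.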

CL Jul 30, 2025
Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition

Roberto Labadie-Tamayo, Djordje Slijepčević, Xihui Chen et al.

The rapid increase in hate speech on social media has had an unprecedented impact on society, making automated methods for detecting such content important. Unlike prior black-box models, we propose a novel transparent method for automated hate and counter speech recognition, i.e., "Speech Concept Bottleneck Model" (SCBM), using adjectives as human-interpretable bottleneck concepts. SCBM leverages large language models (LLMs) to map input texts to an abstract adjective-based representation, which is then passed to a lightweight classifier for downstream tasks. Across five benchmark datasets spanning multiple languages and platforms (e.g., Twitter, Reddit, YouTube), SCBM achieves an average macro-F1 score of 0.69, outperforming the most recently reported results from the literature on four out of five datasets. Aside from high recognition accuracy, SCBM provides a high level of both local and global interpretability. Furthermore, fusing our adjective-based concept representation with transformer embeddings leads to a 1.8% performance increase on average across all datasets, showing that the proposed representation captures complementary information. Our results demonstrate that adjective-based concept representations can serve as compact, interpretable, and effective encodings for hate and counter speech recognition. With adapted adjectives, our method can also be applied to other NLP tasks.
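
For readers unfamiliar with the metric reported above, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below computes it from scratch; the class names and the toy predictions are hypothetical and unrelated to the paper's datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy labels (assumption), not results from the paper.
y_true = ["hate", "hate", "counter", "neutral", "neutral", "neutral"]
y_pred = ["hate", "neutral", "counter", "neutral", "neutral", "hate"]

score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case the per-class F1 scores are 0.5 for "hate", 1.0 for "counter", and about 0.667 for "neutral", giving a macro-F1 of about 0.722.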