Tiago Machado

AI
h-index12
10papers
117citations
Novelty32%
AI Score41

10 Papers

CLNov 11, 2025Code
A methodological analysis of prompt perturbations and their effect on attack success rates

Tiago Machado, Maysa Malfiza Garcia de Macedo, Rogerio Abreu de Paula et al.

This work aims to investigate how different Large Language Models (LLMs) alignment methods affect the models' responses to prompt attacks. We selected open source models based on the most common alignment methods, namely, Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning with Human Feedback (RLHF). We conducted a systematic analysis using statistical methods to verify how sensitive the Attack Success Rate (ASR) is when we apply variations to prompts designed to elicit inappropriate content from LLMs. Our results show that even small prompt modifications can significantly change the Attack Success Rate (ASR) according to the statistical tests we run, making the models more or less susceptible to types of attack. Critically, our results demonstrate that running existing 'attack benchmarks' alone may not be sufficient to elicit all possible vulnerabilities of both models and alignment methods. This paper thus contributes to ongoing efforts on model attack evaluation by means of systematic and statistically-based analyses of the different alignment methods and how sensitive their ASR is to prompt variation.

CLDec 1, 2025
Exploring Human Perceptions of AI Responses: Insights from a Mixed-Methods Study on Risk Mitigation in Generative Models

Heloisa Candello, Muneeza Azmat, Uma Sushmitha Gunturi et al.

With the rapid uptake of generative AI, investigating human perceptions of generated responses has become crucial. A major challenge is their `aptitude' for hallucinating and generating harmful contents. Despite major efforts for implementing guardrails, human perceptions of these mitigation strategies are largely unknown. We conducted a mixed-method experiment for evaluating the responses of a mitigation strategy across multiple-dimensions: faithfulness, fairness, harm-removal capacity, and relevance. In a within-subject study design, 57 participants assessed the responses under two conditions: harmful response plus its mitigation and solely mitigated response. Results revealed that participants' native language, AI work experience, and annotation familiarity significantly influenced evaluations. Participants showed high sensitivity to linguistic and contextual attributes, penalizing minor grammar errors while rewarding preserved semantic contexts. This contrasts with how language is often treated in the quantitative evaluation of LLMs. We also introduced new metrics for training and evaluating mitigation strategies and insights for human-AI evaluation studies.

IRMar 29, 2025Code
A Framework for Lightweight Responsible Prompting Recommendation

Tiago Machado, Sara E. Berger, Cassia Sanctos et al.

Computer Science and Design practitioners have been researching and proposing alternatives for a dearth of recommendations, standards, or best practices in user interfaces for decades. Now, with the advent of generative Artificial Intelligence (GenAI), we have yet again an emerging, powerful technology that lacks sufficient guidance in terms of possible interactions, inputs, and outcomes. In this context, this work proposes a lightweight framework for responsible prompting recommendation to be added before the prompt is sent to GenAI. The framework is comprised of (1) a human-curated dataset for recommendations, (2) a red team dataset for assessing recommendations, (3) a sentence transformer for semantics mapping, (4) a similarity metric to map input prompt to recommendations, (5) a set of similarity thresholds, (6) quantized sentence embeddings, (7) a recommendation engine, and (8) an evaluation step to use the red team dataset. With the proposed framework and open-source system, the contributions presented can be applied in multiple contexts where end-users can benefit from guidance for interacting with GenAI in a more responsible way, recommending positive values to be added and harmful sentences to be removed.

CLAug 13, 2025
A Comprehensive Evaluation framework of Alignment Techniques for LLMs

Muneeza Azmat, Momin Abbas, Maysa Malfiza Garcia de Macedo et al.

As Large Language Models (LLMs) become increasingly integrated into real-world applications, ensuring their outputs align with human values and safety standards has become critical. The field has developed diverse alignment approaches including traditional fine-tuning methods (RLHF, instruction tuning), post-hoc correction systems, and inference-time interventions, each with distinct advantages and limitations. However, the lack of unified evaluation frameworks makes it difficult to systematically compare these paradigms and guide deployment decisions. This paper introduces a multi-dimensional evaluation of alignment techniques for LLMs, a comprehensive evaluation framework that provides a systematic comparison across all major alignment paradigms. Our framework assesses methods along four key dimensions: alignment detection, alignment quality, computational efficiency, and robustness. Through experiments across diverse base models and alignment strategies, we demonstrate the utility of our framework in identifying strengths and limitations of current state-of-the-art models, providing valuable insights for future research directions.

AIMay 1, 2023
Procedural Content Generation via Knowledge Transformation (PCG-KT)

Anurag Sarkar, Matthew Guzdial, Sam Snodgrass et al.

We introduce the concept of Procedural Content Generation via Knowledge Transformation (PCG-KT), a new lens and framework for characterizing PCG methods and approaches in which content generation is enabled by the process of knowledge transformation -- transforming knowledge derived from one domain in order to apply it in another. Our work is motivated by a substantial number of recent PCG works that focus on generating novel content via repurposing derived knowledge. Such works have involved, for example, performing transfer learning on models trained on one game's content to adapt to another game's content, as well as recombining different generative distributions to blend the content of two or more games. Such approaches arose in part due to limitations in PCG via Machine Learning (PCGML) such as producing generative models for games lacking training data and generating content for entirely new games. In this paper, we categorize such approaches under this new lens of PCG-KT by offering a definition and framework for describing such methods and surveying existing works using this framework. Finally, we conclude by highlighting open problems and directions for future research in this area.

AISep 6, 2019
Automatic Critical Mechanic Discovery Using Playtraces in Video Games

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

We present a new method of automatic critical mechanic discovery for video games using a combination of game description parsing and playtrace information. This method is applied to several games within the General Video Game Artificial Intelligence (GVG-AI) framework. In a user study, human-identified mechanics are compared against system-identified critical mechanics to verify alignment between humans and the system. The results of the study demonstrate that the new method is able to match humans with higher consistency than baseline. Our system is further validated by comparing MCTS agents augmented with critical mechanics and vanilla MCTS agents on $4$ games from GVG-AI. Our new playtrace method shows a significant performance improvement over the baseline for all 4 tested games. The proposed method also shows either matched or improved performance over the old method, demonstrating that playtrace information is responsible for more complete critical mechanic discovery.

AIAug 13, 2019
Evaluation of a Recommender System for Assisting Novice Game Designers

Tiago Machado, Daniel Gopstein, Oded Nov et al.

Game development is a complex task involving multiple disciplines and technologies. Developers and researchers alike have suggested that AI-driven game design assistants may improve developer workflow. We present a recommender system for assisting humans in game design as well as a rigorous human subjects study to validate it. The AI-driven game design assistance system suggests game mechanics to designers based on characteristics of the game being developed. We believe this method can bring creative insights and increase users' productivity. We conducted quantitative studies that showed the recommender system increases users' levels of accuracy and computational affect, and decreases their levels of workload.

HCJul 8, 2019
Pitako -- Recommending Game Design Elements in Cicero

Tiago Machado, Dan Gopstein, Andy Nealen et al.

Recommender Systems are widely and successfully applied in e-commerce. Could they be used for design? In this paper, we introduce Pitako1, a tool that applies the Recommender System concept to assist humans in creative tasks. More specifically, Pitako provides suggestions by taking games designed by humans as inputs, and recommends mechanics and dynamics as outputs. Pitako is implemented as a new system within the mixed-initiative AI-based Game Design Assistant, Cicero. This paper discusses the motivation behind the implementation of Pitako as well as its technical details and presents usage examples. We believe that Pitako can influence the use of recommender systems to help humans in their daily tasks.

AIJul 11, 2018
AtDelfi: Automatically Designing Legible, Full Instructions For Games

Michael Cerny Green, Ahmed Khalifa, Gabriella A. B. Barros et al.

This paper introduces a fully automatic method for generating video game tutorials. The AtDELFI system (AuTomatically DEsigning Legible, Full Instructions for games) was created to investigate procedural generation of instructions that teach players how to play video games. We present a representation of game rules and mechanics using a graph system as well as a tutorial generation method that uses said graph representation. We demonstrate the concept by testing it on games within the General Video Game Artificial Intelligence (GVG-AI) framework; the paper discusses tutorials generated for eight different games. Our findings suggest that a graph representation scheme works well for simple arcade style games such as Space Invaders and Pacman, but it appears that tutorials for more complex games might require higher-level understanding of the game than just single mechanics.

AIFeb 1, 2017
Blue Sky Ideas in Artificial Intelligence Education from the EAAI 2017 New and Future AI Educator Program

Eric Eaton, Sven Koenig, Claudia Schulz et al.

The 7th Symposium on Educational Advances in Artificial Intelligence (EAAI'17, co-chaired by Sven Koenig and Eric Eaton) launched the EAAI New and Future AI Educator Program to support the training of early-career university faculty, secondary school faculty, and future educators (PhD candidates or postdocs who intend a career in academia). As part of the program, awardees were asked to address one of the following "blue sky" questions: * How could/should Artificial Intelligence (AI) courses incorporate ethics into the curriculum? * How could we teach AI topics at an early undergraduate or a secondary school level? * AI has the potential for broad impact to numerous disciplines. How could we make AI education more interdisciplinary, specifically to benefit non-engineering fields? This paper is a collection of their responses, intended to help motivate discussion around these issues in AI education.