Juho Hamari

HC
h-index: 67
6 papers
133 citations
Novelty: 32%
AI Score: 36

6 Papers

CY Oct 25, 2024
Can We Trust AI Agents? A Case Study of an LLM-Based Multi-Agent System for Ethical AI

José Antonio Siqueira de Cerqueira, Mamia Agbese, Rebekah Rousi et al.

AI-based systems, including Large Language Models (LLMs), impact millions by supporting diverse tasks but face issues like misinformation, bias, and misuse. AI ethics is crucial as new technologies and concerns emerge, but objective, practical guidance remains debated. This study examines the use of LLMs for AI ethics in practice, assessing how LLM trustworthiness-enhancing techniques affect software development in this context. Using the Design Science Research (DSR) method, we identify techniques for LLM trustworthiness: multi-agents, distinct roles, structured communication, and multiple rounds of debate. We design a multi-agent prototype, LLM-MAS, where agents engage in structured discussions on real-world AI ethics issues from the AI Incident Database. We evaluate the prototype across three case scenarios using thematic analysis, hierarchical clustering, comparative (baseline) studies, and running source code. The system generates approximately 2,000 lines of code per case, compared to only 80 lines in baseline trials. Discussions reveal terms like bias detection, transparency, accountability, user consent, GDPR compliance, fairness evaluation, and EU AI Act compliance, showing this prototype's ability to generate extensive source code and documentation addressing often overlooked AI ethics issues. However, practical challenges in source code integration and dependency management may limit its use by practitioners.

CL Feb 27, 2025
Mapping Trustworthiness in Large Language Models: A Bibliometric Analysis Bridging Theory to Practice

José Siqueira de Cerqueira, Kai-Kristian Kemell, Rebekah Rousi et al.

The rapid proliferation of Large Language Models (LLMs) has raised significant trustworthiness and ethical concerns. Despite the widespread adoption of LLMs across domains, there is still no clear consensus on how to define and operationalise trustworthiness. This study aims to bridge the gap between theoretical discussion and practical implementation by analysing research trends, definitions of trustworthiness, and practical techniques. We conducted a bibliometric mapping analysis of 2,006 publications from Web of Science (2019-2025) using Bibliometrix and manually reviewed 68 papers. We found a shift from traditional AI ethics discussion to LLM trustworthiness frameworks. We identified 18 different definitions of trust/trustworthiness, with transparency, explainability and reliability emerging as the most common dimensions. We identified 20 strategies to enhance LLM trustworthiness, with fine-tuning and retrieval-augmented generation (RAG) being the most prominent. Most of the strategies are developer-driven and applied during the post-training phase. Several authors propose fragmented terminologies rather than unified frameworks, leading to the risk of "ethics washing," where ethical discourse is adopted without a genuine regulatory commitment. Our findings highlight persistent gaps between theoretical taxonomies and practical implementation, underscore the crucial role of the developer in operationalising trust, and call for standardised frameworks and stronger regulatory measures to enable trustworthy and ethical deployment of LLMs.

HC Apr 6
Demonstrating SIMA-Play: A Serious Game for Forest Management Decision-Making through Board Game and Digital Simulation

Arka Majhi, Daniel Fernández Galeote, Timo Nummenmaa et al.

Board games have shown promise as educational tools, but their use in engaging learners with the complex, long-term trade-offs of forest management remains strikingly underdeveloped. Addressing this gap, we investigate how forest growth simulation data can inform decision-making through information visualization and gameplay mechanics. We designed a serious game, SIMA-Play, that enables players to make informed forest management decisions under dynamic environmental and market conditions, simulating forest growth over time and comparing player performance across economic and sustainability outcomes. By using visualization to give players feedback on their choices at the end of the game, it supports systems thinking and makes the trade-offs in forestry practices easier to understand and discuss. The study concludes with a research roadmap that outlines future experiments, longitudinal studies, and digital versions of SIMA-Play to assess its long-term effects on learning and engagement.

HC Jun 18, 2021
Do people's user types change over time? An exploratory study

Ana Cláudia Guimarães Santos, Wilk Oliveira, Juho Hamari et al.

In recent years, different studies have proposed and validated user models (e.g., Bartle, BrainHex, and Hexad) to represent the different user profiles in games and gamified settings. However, the results of applying these user models in practice (e.g., to personalize gamified systems) are still contradictory. One of the hypotheses for these results is that the user types can change over time (i.e., user types are dynamic). To start to understand whether user types can change over time, we conducted an exploratory study analyzing data from 74 participants to identify if their user type (Achiever, Philanthropist, Socialiser, Free Spirit, Player, and Disruptor) had changed over time (six months). The results indicate that there is a change in the dominant user type of the participants, as well as in the average scores in the Hexad sub-scales. These results imply that all the scores should be considered when defining the Hexad's user type and that the user types are dynamic. Our results have practical implications, indicating that the personalization currently made (generally static) may be insufficient to improve the users' experience, requiring user types to be analyzed continuously and personalization to be done dynamically.

HC Jun 18, 2021
Does gamification affect flow experience? A systematic literature review

Wilk Oliveira, Olena Pastushenko, Luiz Rodrigues et al.

In recent years, studies in different areas have used gamification to improve users' flow experience. However, due to the high variety of the conducted studies and the lack of secondary studies (e.g., systematic literature reviews) in this field, it is difficult to grasp the state of the art of this research domain. To address this problem, we conducted a systematic literature review to identify i) which gamification design methods have been used in the studies about gamification and Flow Theory, ii) which gamification elements have been used in these studies, iii) which methods have been used to evaluate the users' flow experience in gamified settings, and iv) how gamification affects users' flow experience. The main results show that there is growing interest in this field, as the number of publications is increasing. The most significant interest is in the area of gamification in education. However, there is no unanimity regarding the preferred method of the study or the effects of gamification on users' experience. Our results highlight the importance of conducting new experimental studies investigating how gamification affects the users' flow experience in different gamified settings, applications and domains.

HC Mar 28, 2019
A gradual approach for maximising user conversion without compromising experience with high visual intensity website elements

Jarosław Jankowski, Juho Hamari, Jarosław Wątróbski

The study develops and tests a method that can gradually find a sweet spot between user experience and visual intensity of website elements to maximise user conversion with minimal adverse effect. In the first phase of the study, we develop the method. In the second phase, we test and evaluate the method via an empirical study; we also conducted an experiment within a web interface with gradually varied intensity of visual elements. The findings reveal that negative response grows faster than conversion when the visual intensity of the web interface is increased. However, a saturation point, where there is coexistence between maximum conversion and minimum negative response, can be found. The findings imply that efforts to attract user attention should be pursued with increased caution and that the gradual approach presented in this study helps in finding a site-specific sweet spot for the level of visual intensity by incrementally adjusting the elements of the interface and tracking the changes in user behaviour. Web marketing and advertising professionals often face the dilemma of determining the optimal level of visual intensity of interface elements. Excessive use of marketing components and attention-grabbing visual elements can lead to an inferior user experience and consequent user churn due to growing intrusiveness. At the same time, too little visual intensity can fail to steer users. The present study provides a gradual approach which aids in finding a balance between user experience and visual intensity, maximising user conversion and thus providing a practical solution for the problem.