HCJul 9, 2024
Evaluating Human-AI Collaboration: A Review and Methodological FrameworkGeorge Fragiadakis, Christos Diou, George Kousiouris et al.
The use of artificial intelligence (AI) in working environments with individuals, known as Human-AI Collaboration (HAIC), has become essential in a variety of domains, boosting decision-making, efficiency, and innovation. Despite HAIC's wide potential, evaluating its effectiveness remains challenging due to the complex interaction of components involved. This paper provides a detailed analysis of existing HAIC evaluation approaches and develops a fresh paradigm for more effectively evaluating these systems. Our framework includes a structured decision tree which assists to select relevant metrics based on distinct HAIC modes (AI-Centric, Human-Centric, and Symbiotic). By including both quantitative and qualitative metrics, the framework seeks to represent HAIC's dynamic and reciprocal nature, enabling the assessment of its impact and success. This framework's practicality can be examined by its application in an array of domains, including manufacturing, healthcare, finance, and education, each of which has unique challenges and requirements. Our hope is that this study will facilitate further research on the systematic evaluation of HAIC in real-world applications.
HCSep 4, 2025
HumAIne-Chatbot: Real-Time Personalized Conversational AI via Reinforcement LearningGeorgios Makridis, George Fragiadakis, Jorge Oliveira et al.
Current conversational AI systems often provide generic, one-size-fits-all interactions that overlook individual user characteristics and lack adaptive dialogue management. To address this gap, we introduce \textbf{HumAIne-chatbot}, an AI-driven conversational agent that personalizes responses through a novel user profiling framework. The system is pre-trained on a diverse set of GPT-generated virtual personas to establish a broad prior over user types. During live interactions, an online reinforcement learning agent refines per-user models by combining implicit signals (e.g. typing speed, sentiment, engagement duration) with explicit feedback (e.g., likes and dislikes). This profile dynamically informs the chatbot dialogue policy, enabling real-time adaptation of both content and style. To evaluate the system, we performed controlled experiments with 50 synthetic personas in multiple conversation domains. The results showed consistent improvements in user satisfaction, personalization accuracy, and task achievement when personalization features were enabled. Statistical analysis confirmed significant differences between personalized and nonpersonalized conditions, with large effect sizes across key metrics. These findings highlight the effectiveness of AI-driven user profiling and provide a strong foundation for future real-world validation.