CL, Mar 6
Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
Xi Wang, Mengdie Zhuang, Jiqun Liu
Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that favour specific behavioural tendencies such as assertiveness. To investigate how diverse experiences shape machine personality and influence problem-solving, this study employs continued pre-training to expose models to domain-specific texts in an unsupervised manner, simulating the accumulation of experience. By adapting the Big Five framework via the Machine Personality Inventory (MPI), we quantify the personality traits of these model variants and analyse their relationship to linguistic style and reasoning behaviour. The findings reveal that model competence is bimodal, peaking at "Expressive Generalists" and "Suppressed Specialists," while identifying a "Suppression Advantage" where reduced social traits enhance complex reasoning performance. This study further establishes a causal link between training data linguistics, such as imperative frequency, and lexical diversity, providing a roadmap for "Personality Engineering".
AI, Apr 4, 2025
Towards deployment-centric multimodal AI beyond vision and language
Xianyuan Liu, Jiayang Zhang, Shuo Zhou et al.
Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction, and decision-making across disciplines such as healthcare, science, and engineering. However, most multimodal AI advances focus on models for vision and language data, while their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasise deeper integration across multiple levels of multimodality and multidisciplinary collaboration to significantly broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design, and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability, and finance. By fostering multidisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact.
HC, Sep 10, 2020
A Framework for Evaluating Dashboards in Healthcare
Mengdie Zhuang, Dave Concannon, Ed Manley
In the era of "information overload", effective information provision is essential for enabling rapid response and critical decision making. In making sense of diverse information sources, data dashboards have become an indispensable tool, providing fast, effective, adaptable, and personalized access to information for professionals and the general public alike. However, these objectives place a heavy requirement on dashboards as information systems, resulting in poor usability and ineffective design. Understanding these shortfalls is a challenge given the absence of a consistent and comprehensive approach to dashboard evaluation. In this paper, we systematically review literature on dashboard implementation in the healthcare domain, a field where dashboards have been employed widely and in which there is widespread interest in improving the current state of the art, and subsequently analyse approaches taken towards evaluation. We draw upon consolidated dashboard literature and our own observations to introduce a general definition of dashboards which is more relevant to current trends, together with a dashboard task-based classification, which underpin our subsequent analysis. From a total of 81 papers we derive seven evaluation scenarios: task performance, behaviour change, interaction workflow, perceived engagement, potential utility, algorithm performance, and system implementation. These scenarios distinguish different evaluation purposes, which we illustrate through measurements, example studies, and common challenges in evaluation study design. We provide a breakdown of each evaluation scenario and highlight some of the subtle and less well-posed questions. We conclude by outlining a number of active discussion points and a set of dashboard evaluation best practices for the academic, clinical, and software development communities alike.