HCAIDec 5, 2025

Future You: Designing and Evaluating Multimodal AI-generated Digital Twins for Strengthening Future Self-Continuity

arXiv:2512.06106v14 citations
Originality Synthesis-oriented
AI Analysis

This research addresses how AI-generated future selves can strengthen psychological well-being and decision-making for users, representing an incremental application of existing AI technologies to a specific psychological intervention.

The study investigated how different modalities (text, voice, avatar) of AI-generated digital twins representing future selves affect psychological outcomes like future self-continuity, emotional well-being, and motivation, finding that all personalized modalities improved these measures compared to a control, with avatar producing the largest vividness gains but no significant differences between formats, and that Claude 4 outperformed other LLMs in enhancing outcomes.

What if users could meet their future selves today? AI-generated future selves simulate meaningful encounters with a digital twin decades in the future. As AI systems advance, combining cloned voices, age-progressed facial rendering, and autobiographical narratives, a central question emerges: Does the modality of these future selves alter their psychological and affective impact? How might a text-based chatbot, a voice-only system, or a photorealistic avatar shape present-day decisions and our feeling of connection to the future? We report a randomized controlled study (N=92) evaluating three modalities of AI-generated future selves (text, voice, avatar) against a neutral control condition. We also report a systematic model evaluation between Claude 4 and three other Large Language Models (LLMs), assessing Claude 4 across psychological and interaction dimensions and establishing conversational AI quality as a critical determinant of intervention effectiveness. All personalized modalities strengthened Future Self-Continuity (FSC), emotional well-being, and motivation compared to control, with avatar producing the largest vividness gains, yet with no significant differences between formats. Interaction quality metrics, particularly persuasiveness, realism, and user engagement, emerged as robust predictors of psychological and affective outcomes, indicating that how compelling the interaction feels matters more than the form it takes. Content analysis found thematic patterns: text emphasized career planning, while voice and avatar facilitated personal reflection. Claude 4 outperformed ChatGPT 3.5, Llama 4, and Qwen 3 in enhancing psychological, affective, and FSC outcomes.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes