94.1 CY Apr 2
Simulating Couple Conflict: Designing A Multi-Agent System for Therapy Training and Practice
Canwen Wang, Angela Chen, Catherine Bao et al.
Couples therapy requires managing complex, evolving emotional dynamics between partners, but traditional training methods for therapists, like role-play, lack realism, consistency, and control. We present a multi-modal simulation that models therapy as a controlled, multi-agent dynamical system with structured interaction stages. Therapists practice with a pair of client-agents who go through six evolving stages that respond to therapist actions. This simulation enables practice with demand-withdraw conflict patterns in a closed-loop environment. The simulation uses a sense-plan-act architecture: it detects the therapist's input, updates agents' interaction states based on psychotherapy theory and transcript analysis, and generates realistic verbal and emotional responses. In an experiment with 21 licensed U.S. therapists, participants more accurately identified state transitions and rated the system as more realistic and responsive than a prompt-based baseline, demonstrating the value of stateful, interpretable simulation for therapist training.
88.1 CR Apr 29
LATTICE: Evaluating Decision Support Utility of Crypto Agents
Aaron Chan, Tengfei Li, Tianyi Xiao et al.
We introduce LATTICE, a benchmark for evaluating the decision support utility of crypto agents in realistic user-facing scenarios. Prior crypto agent benchmarks mainly focus on reasoning-based or outcome-based evaluation, but do not assess agents' ability to assist user decision-making. LATTICE addresses this gap by: (1) defining six evaluation dimensions that capture key decision support properties; (2) proposing 16 task types that span the end-to-end crypto copilot workflow; and (3) using LLM judges to automatically score agent outputs based on these dimensions and tasks. Crucially, the dimensions and tasks are designed to be evaluable at scale using LLM judges, without relying on ground truth from expert annotators or external data sources. In lieu of these dependencies, LATTICE's LLM judge rubrics can be continually audited and updated given new dimensions, tasks, criteria, and human feedback, thus promoting reliable and extensible evaluation. While other benchmarks often compare foundation models sharing a generic agent framework, we use LATTICE to assess production-level agents used in actual crypto copilot products, reflecting the importance of orchestration and UI/UX design in determining agent quality. In this paper, we evaluate six real-world crypto copilots on 1,200 diverse queries and report breakdowns across dimensions, tasks, and query categories. Our experiments show that most of the tested copilots achieve comparable aggregate scores, but differ more significantly on dimension-level and task-level performance. This pattern suggests meaningful trade-offs in decision support quality: users with different priorities may be better served by different copilots than the aggregate rankings alone would indicate. To support reproducible research, we open-source all LATTICE code and data used in this paper.
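The point about aggregate scores hiding dimension-level trade-offs can be illustrated with a minimal sketch. Everything in it is hypothetical: the copilot names, the three dimension names, the 0-10 scale, and the numbers are invented for illustration and are not LATTICE dimensions or results.

```python
# Hypothetical per-dimension judge scores (0-10) for two made-up copilots.
# Names and numbers are illustrative assumptions, not LATTICE data.
scores = {
    "copilot_a": {"accuracy": 9.0, "clarity": 5.0, "risk_awareness": 7.0},
    "copilot_b": {"accuracy": 6.0, "clarity": 8.0, "risk_awareness": 7.0},
}


def aggregate(per_dimension):
    """Unweighted mean over all dimensions."""
    return sum(per_dimension.values()) / len(per_dimension)


def weighted(per_dimension, weights):
    """Mean over dimensions, weighted by a user's stated priorities."""
    total = sum(weights.values())
    return sum(per_dimension[d] * w for d, w in weights.items()) / total


def best_for(weights):
    """Name of the copilot with the highest weighted score."""
    return max(scores, key=lambda name: weighted(scores[name], weights))


# Two hypothetical users with different priorities.
accuracy_first = {"accuracy": 3, "clarity": 1, "risk_awareness": 1}
clarity_first = {"accuracy": 1, "clarity": 3, "risk_awareness": 1}

for name, dims in scores.items():
    print(name, "aggregate:", aggregate(dims))
print("accuracy-first user:", best_for(accuracy_first))
print("clarity-first user:", best_for(clarity_first))
```

In this toy setup the two copilots tie on the unweighted aggregate, yet each one is the better choice for one of the two users, which is the kind of trade-off an aggregate ranking alone would not reveal.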
RO Oct 28, 2023
"Do it my way!": Impact of Customizations on Trust perceptions in Human-Robot Collaboration
Parv Kapoor, Simon Chu, Angela Chen
Trust has been shown to be a key factor in effective human-robot collaboration. In the context of assistive robotics, the effect of trust factors on human experience is even more pronounced. Personalization of assistive robots is an orthogonal factor positively correlated with robot adoption and user perceptions. In this work, we investigate the relationship between these factors through a within-subjects study (N=17). We provide different levels of customization over baseline autonomous robot behavior and investigate their impact on trust. Our findings indicate that increased levels of customization were associated with higher trust and comfort perceptions. The assistive robot design process can benefit significantly from our insights for designing trustworthy and customized robots.
CV Dec 6, 2022
GAS-NeXt: Few-Shot Cross-Lingual Font Generator
Haoyang He, Xin Jin, Angela Chen
Generating new fonts is a time-consuming and labor-intensive task, especially in a language with a very large number of characters, such as Chinese. Various deep learning models have demonstrated the ability to efficiently generate new fonts from a few reference characters of that style, but few models support cross-lingual font generation. This paper presents GAS-NeXt, a novel few-shot cross-lingual font generator based on AGIS-Net and Font Translator GAN, which improves performance on metrics such as Fréchet Inception Distance (FID), Structural Similarity Index Measure (SSIM), and pixel-level accuracy (pix-acc). Our approach replaces the original encoder and decoder with ones built on the layer attention and context-aware attention ideas from Font Translator GAN, while retaining the shape, texture, and local discriminators of AGIS-Net. In our experiment on English-to-Chinese font translation, the improvement over Font Translator GAN was more pronounced on fonts with distinct local features than on conventional Chinese fonts. We also validate our method on multiple languages and datasets.