Yingfan Zhou

CL
h-index12
4papers
20citations
Novelty54%
AI Score45

4 Papers

39.3HCApr 7
Learning Password Best Practices Through In-Task Instruction

Qian Ma, Yingfan Zhou, Shubhang Kaushik et al.

Users often make security- and privacy-relevant decisions without a clear understanding of the rules that govern safe behavior. We introduce pedagogical friction, a design approach that inserts brief, instructional interactions at the moment of action. We evaluate this approach in the context of password creation, a familiar task with clear quality criteria. We conducted a randomized study with 128 participants across four interface conditions that varied the depth and interactivity of guidance. We assessed three outcomes: (1) rule compliance in a subsequent password task without guidance, (2) accuracy on survey questions tied to password rules, and (3) behavior-knowledge alignment, which captures whether participants who correctly followed a rule also recognized it on the survey. Across the guided conditions, participants corrected most rule violations in the follow-up task and showed high behavior-knowledge alignment. Survey results suggested clearer advantages for some rule types, especially symbol related questions. These results position pedagogical friction as a lightweight intervention for security- and privacy-critical interfaces.

CLMay 7, 2025
A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas

Pranav Narayanan Venkit, Jiayi Li, Yingfan Zhou et al.

As LLMs (large language models) are increasingly used to generate synthetic personas particularly in data-limited domains such as health, privacy, and HCI, it becomes necessary to understand how these narratives represent identity, especially that of minority communities. In this paper, we audit synthetic personas generated by 3 LLMs (GPT4o, Gemini 1.5 Pro, Deepseek 2.5) through the lens of representational harm, focusing specifically on racial identity. Using a mixed methods approach combining close reading, lexical analysis, and a parameterized creativity framework, we compare 1512 LLM generated personas to human-authored responses. Our findings reveal that LLMs disproportionately foreground racial markers, overproduce culturally coded language, and construct personas that are syntactically elaborate yet narratively reductive. These patterns result in a range of sociotechnical harms, including stereotyping, exoticism, erasure, and benevolent bias, that are often obfuscated by superficially positive narrations. We formalize this phenomenon as algorithmic othering, where minoritized identities are rendered hypervisible but less authentic. Based on these findings, we offer design recommendations for narrative-aware evaluation metrics and community-centered validation protocols for synthetic identity generation.

ROAug 18, 2025
Diff-MSM: Differentiable MusculoSkeletal Model for Simultaneous Identification of Human Muscle and Bone Parameters

Yingfan Zhou, Philip Sanderink, Sigurd Jager Lemming et al.

High-fidelity personalized human musculoskeletal models are crucial for simulating realistic behavior of physically coupled human-robot interactive systems and verifying their safety-critical applications in simulations before actual deployment, such as human-robot co-transportation and rehabilitation through robotic exoskeletons. Identifying subject-specific Hill-type muscle model parameters and bone dynamic parameters is essential for a personalized musculoskeletal model, but very challenging due to the difficulty of measuring the internal biomechanical variables in vivo directly, especially the joint torques. In this paper, we propose using Differentiable MusculoSkeletal Model (Diff-MSM) to simultaneously identify its muscle and bone parameters with an end-to-end automatic differentiation technique differentiating from the measurable muscle activation, through the joint torque, to the resulting observable motion without the need to measure the internal joint torques. Through extensive comparative simulations, the results manifested that our proposed method significantly outperformed the state-of-the-art baseline methods, especially in terms of accurate estimation of the muscle parameters (i.e., initial guess sampled from a normal distribution with the mean being the ground truth and the standard deviation being 10% of the ground truth could end up with an average of the percentage errors of the estimated values as low as 0.05%). In addition to human musculoskeletal modeling and simulation, the new parameter identification technique with the Diff-MSM has great potential to enable new applications in muscle health monitoring, rehabilitation, and sports science.

CLApr 25, 2025
Can Third-parties Read Our Emotions?

Jiayi Li, Yingfan Zhou, Pranav Narayanan Venkit et al.

Natural Language Processing tasks that aim to infer an author's private states, e.g., emotions and opinions, from their written text, typically rely on datasets annotated by third-party annotators. However, the assumption that third-party annotators can accurately capture authors' private states remains largely unexamined. In this study, we present human subjects experiments on emotion recognition tasks that directly compare third-party annotations with first-party (author-provided) emotion labels. Our findings reveal significant limitations in third-party annotations-whether provided by human annotators or large language models (LLMs)-in faithfully representing authors' private states. However, LLMs outperform human annotators nearly across the board. We further explore methods to improve third-party annotation quality. We find that demographic similarity between first-party authors and third-party human annotators enhances annotation performance. While incorporating first-party demographic information into prompts leads to a marginal but statistically significant improvement in LLMs' performance. We introduce a framework for evaluating the limitations of third-party annotations and call for refined annotation practices to accurately represent and model authors' private states.