HC Apr 21
Revisiting Framing Codebooks with AI: Employing Large Language Models as Analytical Collaborators in Deductive Content Analysis
Diego Gomez-Zara, Hernán Valdivieso, Jorge Pérez et al.
Codebooks are central to framing research, providing theoretically grounded criteria for analyzing news content. While traditionally codebooks are built from theoretical frameworks and researchers' knowledge, applying these codebooks to large news corpora often exposes ambiguities, borderline cases, and underspecified rules that are difficult to resolve through theory alone. Moreover, news corpora evolve over time and differ across cultures, necessitating that researchers revisit the theoretical frameworks underlying these codebooks. In this article, we propose a workflow that uses Large Language Models (LLMs) to augment the creation and refinement of framing codebooks by combining theoretical frameworks with data-driven exploration. Rather than treating LLMs as automated classifiers, this approach positions them as analytic collaborators that help externalize decision rules, surface latent dimensions, and support iterative revisions of codebooks through dialogues between researchers and their data. We illustrate this workflow using a dataset of Latin American news coverage, demonstrating how the application of LLMs' capabilities has led to the surfacing of latent patterns, the generation of frame distinctions, and the adaptation of frameworks to new contexts. This method provides an LLM-assisted strategy that supports methodological creativity while preserving researchers' interpretative authority.
HC Apr 29
MultEval: Supporting Collaborative Alignment for LLM-as-a-Judge Evaluation Criteria
Charles Chiang, Simret Gebreegziabher, Annalisa Szymanski et al.
LLM-as-a-judge approaches have emerged as a scalable solution for evaluating model behaviors, yet they rely on evaluation criteria often created by a single individual, embedding that person's assumptions, priorities, and interpretive lens. In practice, defining such criteria is a collaborative and contested process involving multiple stakeholders with different values, interpretations, and priorities, an aspect largely unsupported by existing tools. To examine this problem in depth, we present a formative study examining how stakeholders collaboratively create, negotiate, and refine evaluation criteria for LLM-as-a-judge systems. Our findings reveal challenges in human oversight, including difficulties in establishing shared understanding, aligning values across stakeholders with different expertise and priorities, and translating nuanced human judgments into criteria that are interpretable and actionable for LLM judges. Based on these insights, we developed MultEval, a system that supports collaborative criteria by enabling multiple evaluators to surface and diagnose disagreements using consensus-building theory, iteratively revise criteria with attached examples and proposal history, and maintain transparency over how judgments are encoded into an automated evaluator. We further report a case study in which a team of domain experts used MultEval to collaboratively author criteria, illustrating how coordination and collaborative consensus-making shape criteria evolution.
CV Apr 2
ViT-Explainer: An Interactive Walkthrough of the Vision Transformer Pipeline
Juan Manuel Hernandez, Mariana Fernandez-Espinosa, Denis Parra et al.
Transformer-based architectures have become the shared backbone of natural language processing and computer vision. However, understanding how these models operate remains challenging, particularly in vision settings, where images are processed as sequences of patch tokens. Existing interpretability tools often focus on isolated components or expert-oriented analysis, leaving a gap in guided, end-to-end understanding of the full inference pipeline. To bridge this gap, we present ViT-Explainer, a web-based interactive system that provides an integrated visualization of Vision Transformer inference, from patch tokenization to final classification. The system combines animated walkthroughs, patch-level attention overlays, and a vision-adapted Logit Lens within both guided and free exploration modes. A user study with six participants suggests that ViT-Explainer is easy to learn and use, helping users interpret and understand Vision Transformer behavior.
CE Oct 3, 2025
Report of the 2025 Workshop on Next-Generation Ecosystems for Scientific Computing: Harnessing Community, Software, and AI for Cross-Disciplinary Team Science
Lois Curfman McInnes, Dorian Arnold, Prasanna Balaprakash et al.
This report summarizes insights from the 2025 Workshop on Next-Generation Ecosystems for Scientific Computing: Harnessing Community, Software, and AI for Cross-Disciplinary Team Science, which convened more than 40 experts from national laboratories, academia, industry, and community organizations to chart a path toward more powerful, sustainable, and collaborative scientific software ecosystems. To address urgent challenges at the intersection of high-performance computing (HPC), AI, and scientific software, participants envisioned agile, robust ecosystems built through socio-technical co-design--the intentional integration of social and technical components as interdependent parts of a unified strategy. This approach combines advances in AI, HPC, and software with new models for cross-disciplinary collaboration, training, and workforce development. Key recommendations include building modular, trustworthy AI-enabled scientific software systems; enabling scientific teams to integrate AI systems into their workflows while preserving human creativity, trust, and scientific rigor; and creating innovative training pipelines that keep pace with rapid technological change. Pilot projects were identified as near-term catalysts, with initial priorities focused on hybrid AI/HPC infrastructure, cross-disciplinary collaboration and pedagogy, responsible AI guidelines, and prototyping of public-private partnerships. This report presents a vision of next-generation ecosystems for scientific computing where AI, software, hardware, and human expertise are interwoven to drive discovery, expand access, strengthen the workforce, and accelerate scientific progress.