cs.HC | Computer Science

Human-Computer Interaction

User interfaces, accessibility, interaction design

85 | SD | Feb 24, 2025 | Code

AAD-LLM: Neural Attention-Driven Auditory Scene Understanding

Xilin Jiang, Sukru Samet Dindar, Vishal Choudhari et al.

This work addresses the poor alignment of auditory AI with human perception in applications such as hearing aids and communication systems, representing a novel paradigm rather than an incremental improvement.

83 | CL | Feb 17, 2025 | Code

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Ailin Huang, Boyong Wu, Bruce Wang et al.

This addresses high costs, weak dynamic control, and limited intelligence in speech interaction models for developers and researchers, representing a significant advancement rather than an incremental improvement.

81 | CL | Nov 16, 2024

Large Language Models (LLMs) as Traffic Control Systems at Urban Intersections: A New Paradigm

Sari Masri, Huthaifa I. Ashqar, Mohammed Elhenawy

This proposes a new paradigm for traffic management systems that could improve efficiency at intersections for drivers and autonomous vehicles.

79 | AI | Feb 15, 2025 | Code

USER-VLM 360: Personalized Vision Language Models with User-aware Tuning for Social Human-Robot Interactions

Hamed Rahimi, Adil Bahaj, Mouad Abrini et al.

This work addresses the problem of personalized human-robot interactions for diverse users, providing a significant advancement in social robotics.

78 | CL | Nov 5, 2025 | Code

Step-Audio-EditX Technical Report

Chao Yan, Boyong Wu, Peng Yang et al.

This addresses the need for advanced audio editing tools for content creators and researchers, offering a novel rather than incremental approach.

78 | CV | Aug 30, 2025

Visually Grounded Narratives: Reducing Cognitive Burden in Researcher-Participant Interaction

Runtong Wu, Jiayao Song, Fei Teng et al.

This addresses the dual burden of data analysis and member checking for researchers and participants in narrative inquiry, representing a first attempt in the field.

78 | LG | May 20, 2025

Spiking Neural Networks with Temporal Attention-Guided Adaptive Fusion for imbalanced Multi-modal Learning

Jiangrong Shen, Yulin Xie, Qi Xu et al.

This work addresses critical challenges in energy-efficient multimodal sensory processing for neuromorphic systems, establishing a new paradigm rather than an incremental improvement.

78 | HC | Feb 12, 2025 | Code

Interactive Sketchpad: A Multimodal Tutoring System for Collaborative, Visual Problem-Solving

Steven-Shine Chen, Jimin Lee, Paul Pu Liang

This work addresses the need for more effective and engaging educational technologies, particularly for students struggling with complex math concepts.

77 | CL | Dec 16, 2024 | Code

LLMs Can Simulate Standardized Patients via Agent Coevolution

Zhuoyun Du, Lujie Zheng, Renjun Hu et al.

This addresses the problem of scalable and effective medical training for healthcare professionals, representing a novel application of agent coevolution rather than an incremental improvement.

77 | AI | May 20, 2025 | Code

ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions

Bufang Yang, Lilin Xu, Liekang Zeng et al.

This work addresses the need for more effective proactive AI assistants in daily scenarios, representing a novel approach rather than an incremental improvement.

77 | LG | Jul 28, 2025 | Code

Advancing Compositional LLM Reasoning with Structured Task Relations in Interactive Multimodal Communications

Xinye Cao, Hongcan Guo, Guoshun Nan et al.

This addresses efficiency and flexibility challenges for resource-constrained mobile environments in interactive multimodal applications like route planning.

77 | CL | Mar 6 | Code

Learning Next Action Predictors from Human-Computer Interaction

Omar Shaikh, Valentin Teutschbein, Kanishk Gandhi et al.

This work addresses the problem of anticipating user needs in proactive AI systems by predicting the user's next computer interaction, which is significant for developers of AI assistants.

76 | HC | Jul 8, 2025 | Code

SSSUMO: Real-Time Semi-Supervised Submovement Decomposition

Evgenii Rudakov, Jonathan Shock, Otto Lappi et al.

This addresses challenges in human-computer interaction, rehabilitation medicine, and motor control studies by providing a fast and accurate method for analyzing human movements.

76 | CL | Feb 17, 2025 | Code

A-MEM: Agentic Memory for LLM Agents

Wujiang Xu, Zujie Liang, Kai Mei et al.

This addresses the need for more adaptive and context-aware memory management in LLM agents, representing a novel method rather than an incremental improvement.

76 | HC | Apr 21, 2025 | Code

NeuGaze: Reshaping the future BCI

Yiqian Yang

This work provides a low-cost, accessible alternative to BCIs for motor-impaired users, enabling intuitive human-computer interaction in applications such as assistive technology and entertainment.

75 | AI | Mar 11, 2025 | Code

AI-native Memory 2.0: Second Me

Jiale Wei, Xiang Ying, Tao Gao et al.

This addresses the inefficiency of repeated data input for users interacting with various digital platforms, representing a novel approach rather than an incremental improvement.

75 | CV | Oct 24, 2025 | Code

Group Inertial Poser: Multi-Person Pose and Global Translation from Sparse Inertial Sensors and Ultra-Wideband Ranging

Ying Xue, Jiaxi Jiang, Rayan Armani et al.

This addresses the challenge of multi-person motion capture in unconstrained environments for applications like virtual reality or sports analysis, representing a novel integration rather than an incremental improvement.

75 | AI | Oct 10, 2025 | Code

GTAlign: Game-Theoretic Alignment of LLM Assistants for Social Welfare

Siqi Zhu, David Zhang, Pedro Cisneros-Velarde et al.

This addresses the issue of misaligned LLM behavior for users in practical applications, offering a novel approach to enhance cooperative outcomes.

75 | AI | Jan 21, 2025

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Yujia Qin, Yining Ye, Junjie Fang et al.

This addresses the challenge of automating GUI tasks for users and developers, offering a novel approach that reduces reliance on heavily wrapped commercial models and expert-crafted workflows.

75 | LG | May 21, 2025 | Code

MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding

Yuxiang Wei, Yanteng Zhang, Xi Xiao et al.

This work addresses the need for interpretable and generalizable brain-computer interfaces for neuroscience and medical applications, representing a novel method rather than an incremental improvement.