HCApr 3
VisionClaw: Always-On AI Agents through Smart GlassesXiaoan Liu, DaeHo Lee, Eric J Gonzalez et al.
We present VisionClaw, an always-on wearable AI agent that integrates live egocentric perception with agentic task execution. Running on Meta Ray-Ban smart glasses, VisionClaw continuously perceives real-world context and enables in-situ, speech-driven action initiation and delegation via OpenClaw AI agents. Therefore, users can directly execute tasks through the smart glasses, such as adding real-world objects to an Amazon cart, generating notes from physical documents, receiving meeting briefings on the go, creating events from posters, or controlling IoT devices. We evaluate VisionClaw through a controlled laboratory study (N=12) and a longitudinal deployment study (N=5). Results show that integrating perception and execution enables faster task completion and reduces interaction overhead compared to non-always-on and non-agent baselines. Beyond performance gains, deployment findings reveal a shift in interaction: tasks are initiated opportunistically during ongoing activities, and execution is increasingly delegated rather than manually controlled. These results suggest a new paradigm for wearable AI agents, where perception and action are continuously coupled to support situated, hands-free interaction.
HCApr 6
Semantic Reality: Interactive Context-Aware Visualization of Inter-Object Relationships in Augmented RealityXiaoan Liu, Eric J Gonzalez, Nels Numan et al.
Bridging the physical and digital world through interaction remains a core challenge in augmented reality (AR). Existing systems target single objects, limiting support for planning, comparison, and assembly tasks that depend on relationships among multiple items. We present Semantic Reality, an AR system focused on surfacing inter-object connectivity and making it interactive. Leveraging multimodal reasoning, spatial anchoring, and physical action recognition, Semantic Reality maintains a persistent model of objects around the user and their relationships. Connections are visualized in-situ to highlight compatibility, reveal next steps, and reduce ambiguity during tasks. We contribute a connectivity-centered interaction paradigm and a system architecture that couples anchor tracking, action sensing, and model inference to construct a live connectivity graph. In an exploratory study comparing Semantic Reality to a single-object baseline, participants reported clearer inter-object understanding and higher engagement and satisfaction, without increased workload. A scenario study illustrates where connectivity aids planning, sequencing, and disambiguation.
HCApr 2
Conversational Successes and Breakdowns in Everyday Smart Glasses UseXiuqi Tommy Zhu, Xiaoan Liu, Casper Harteveld et al.
Non-Display Smart Glasses hold the potential to support everyday activities by combining continuous environmental sensing with voice-only interaction powered by large language models (LLMs). Understanding how conversational successes and breakdowns arise in everyday contexts can better inform the design of future voice-only interfaces. To investigate this, we conducted a month-long collaborative autoethnography (n=2) to identify patterns of successes and breakdowns when using such devices. We then compare these patterns with prior findings on voice-only interactions to highlight the unique affordances and opportunities offered by non-display smart glasses.
HCJul 23, 2025
Reality Proxy: Fluid Interactions with Real-World Objects in MR via Abstract RepresentationsXiaoan Liu, Difan Jia, Xianhao Carton Liu et al.
Interacting with real-world objects in Mixed Reality (MR) often proves difficult when they are crowded, distant, or partially occluded, hindering straightforward selection and manipulation. We observe that these difficulties stem from performing interaction directly on physical objects, where input is tightly coupled to their physical constraints. Our key insight is to decouple interaction from these constraints by introducing proxies-abstract representations of real-world objects. We embody this concept in Reality Proxy, a system that seamlessly shifts interaction targets from physical objects to their proxies during selection. Beyond facilitating basic selection, Reality Proxy uses AI to enrich proxies with semantic attributes and hierarchical spatial relationships of their corresponding physical objects, enabling novel and previously cumbersome interactions in MR - such as skimming, attribute-based filtering, navigating nested groups, and complex multi object selections - all without requiring new gestures or menu systems. We demonstrate Reality Proxy's versatility across diverse scenarios, including office information retrieval, large-scale spatial navigation, and multi-drone control. An expert evaluation suggests the system's utility and usability, suggesting that proxy-based abstractions offer a powerful and generalizable interaction paradigm for future MR systems.
HCMay 6, 2025
Augmenting Human Cognition through Everyday ARXiaoan Liu
As spatial computing and multimodal LLMs mature, AR is tending to become an intuitive "thinking tool," embedding semantic and context-aware intelligence directly into everyday environments. This paper explores how always-on AR can seamlessly bridge digital cognition and physical affordances, enabling proactive, context-sensitive interactions that enhance human task performance and understanding.