Shijing He

HC
h-index4
10papers
73citations
Novelty36%
AI Score50

10 Papers

CRJun 2
Generative AI-Enabled Refund Fraud in Chinese E-Commerce: Investigation on Merchants and Platform Workers

Shuning Zhang, Eve He, Xiao Zhan et al.

E-commerce dispute resolution typically relies on the security assumption that digital evidence truthfully reflects physical reality. Generative AI (GenAI) invalidates this threat model, enabling attackers to fabricate hyper-realistic evidence of product defects at negligible cost. Through semi-structured interviews with merchants (N=17) and platform workers (N=13) in the Chinese e-commerce market, we characterize this shift toward GenAI-enabled scalable fabrication. We outline a taxonomy of four GenAI-enabled threat vectors across the transaction, dispute, logistics and communication phases, highlighting how attackers exploit GenAI to synthesize physically plausible product defects at scale. To mitigate these threats, platforms and merchants are adapting verification strategies, relying on AI tools for automated screening and adversarial interrogation (e.g., requesting multi-angle videos) to increase attack complexity. However, we find several challenges that hinder the adoption of these defenses, including implementation hurdles like structural platform constraints and fundamental limitations regarding the technical sophistication of GenAI. We conclude by outlining design implications for privacy-preserving cross-platform fraud databases, and traceability mechanisms such as embedding verifiable material anchors into the product.

HCJun 30, 2023
An End-to-End Review of Gaze Estimation and its Interactive Applications on Handheld Mobile Devices

Yaxiong Lei, Shijing He, Mohamed Khamis et al.

In recent years we have witnessed an increasing number of interactive systems on handheld mobile devices which utilise gaze as a single or complementary interaction modality. This trend is driven by the enhanced computational power of these devices, higher resolution and capacity of their cameras, and improved gaze estimation accuracy obtained from advanced machine learning techniques, especially in deep learning. As the literature is fast progressing, there is a pressing need to review the state of the art, delineate the boundary, and identify the key research challenges and opportunities in gaze estimation and interaction. This paper aims to serve this purpose by presenting an end-to-end holistic view in this area, from gaze capturing sensors, to gaze estimation workflows, to deep learning techniques, and to gaze interactive applications.

HCJan 23
Privacy in Human-AI Romantic Relationships: Concerns, Boundaries, and Agency

Rongjun Ma, Shijing He, Jose Luis Martin-Navarro et al.

An increasing number of LLM-based applications are being developed to facilitate romantic relationships with AI partners, yet the safety and privacy risks in these partnerships remain largely underexplored. In this work, we investigate privacy in human-AI romantic relationships through an interview study (N=17), examining participants' experiences and privacy perceptions across the three stages of exploration, intimacy, and dissolution, alongside an analysis of the platforms they used. We found that these relationships took varied forms, from one-to-one to one-to-many, and were shaped by multiple actors, including creators, platforms, and moderators. AI partners were perceived as having agency, actively negotiating privacy boundaries with participants and sometimes encouraging disclosure of personal details. As intimacy deepened, these boundaries became more permeable, though some participants expressed concerns such as conversation exposure and sought to preserve anonymity. Overall, AI platform affordances and diverse relational dynamics expand the privacy landscape, underscoring the need to rethink how privacy is constructed in human-AI romantic relationships.

HCMar 24
The People's Gaze: Co-Designing and Refining Gaze Gestures with General Users and Gaze Interaction Experts

Yaxiong Lei, Xinya Gong, Shijing He et al.

As eye-tracking becomes increasingly common in modern mobile devices, the potential for hands-free, gaze-based interaction grows, but current gesture sets are largely expert-designed and often misaligned with how users naturally move their eyes. To address this gap, we introduce a two-phase methodology for developing intuitive gaze gestures. First, four co-design workshops with 20 non-expert participants generated 102 initial concepts. Next, four gaze interaction experts reviewed and refined these into a set of 32 gestures. We found that non-experts, after a brief introduction, intuitively anchor gestures in familiar metaphors and develop a compositional grammar; i.e., activation (dwell) + action (gaze gesture or blink), to ensure intentionality and mitigate the classic Midas Touch problem. Experts prioritized gestures that are ergonomically sound, aligned with natural saccades, and reliably distinguishable. The resulting user-grounded, expert-validated gesture set, along with actionable design principles, provides a foundation for developing intuitive, hands-free interfaces for gaze-enabled devices.

HCMar 30
GazeSync: A Mobile Eye-Tracking Tool for Analyzing Visual Attention on Dynamically Manipulated Content

Yaxiong Lei, Rishab Talwar, Shijing He et al.

Conventional mobile eye-tracking maps gaze to static screen coordinates, failing to capture user attention when content is dynamic. As users pinch, zoom, and rotate images, static coordinates lose their semantic meaning relative to the underlying visual content. To address this methodological gap, we present \textit{GazeSync}, a reusable mobile system that synchronizes on-device gaze estimation with real-time image transformation matrices (scale, rotation, and translation). By logging gaze coordinates alongside precise UI states, GazeSync enables the accurate reconstruction of \textit{image-relative} attention patterns, decoupling visual attention from device interaction. We validate our end-to-end toolchain through a formative study involving guided manipulation, reading, and visual search tasks. Our results demonstrate GazeSync's ability to recover ground-truth gaze locations on transforming content, explicitly showing how it outperforms static baselines, while also surfacing critical boundaries regarding calibration drift and reconstruction fragility under compound manipulations.

HCMar 30
GazeCode: Recall-Based Verification for Higher-Quality In-the-Wild Mobile Gaze Data Collection

Yaxiong Lei, Thomas Davies, Xinya Gong et al.

Large-scale mobile gaze estimation relies on in-the-wild datasets, yet unsupervised collection makes it difficult to verify whether participants truly foveate logged targets. Prior mobile protocols often use low-entropy validation (e.g., binary probes) that can be satisfied by guessing and may still allow peripheral viewing, introducing label noise. We present \textbf{GazeCode}, a recall-based verification paradigm for higher-confidence in-the-wild mobile gaze data collection that strengthens \emph{label validity} through a multi-digit recall task (reducing random success to $10^{-N}$) paired with anti-peripheral stimulus design (small, low-contrast, brief digits). The system logs synchronized front-camera video, IMU streams, and target events using high-resolution timestamps. In a formative study (N=3), we probe key parameters (opacity, duration) and directly test peripheral exploitability using an eccentricity-controlled \textit{RING} condition. Results show that low-opacity digits substantially reduce peripheral readability while remaining usable for attentive foveation, supporting the inference that correct recall corresponds to higher-confidence gaze labels. We conclude with actionable design guidelines for robust in-the-wild gaze data collection.

HCMar 30
TinyGaze: Lightweight Gaze-Gesture Recognition on Commodity Mobile Devices

Yaxiong Lei, Hyochan Cho, Fergus Buchanan et al.

Gaze gestures can provide hands free input on mobile devices, but practical use requires (i) gestures users can learn and recall and (ii) recognition models that are efficient enough for on-device deployment. We present an end-to-end pipeline using commodity ARKit head/eye transforms and a scaffolded guidance-to-recall protocol grounded in learning theory. In a pilot feasibility study (N=4 participants; 240 trials; controlled single-session setting), we benchmark a compact time-series model (TinyHAR) against deeper baselines (DeepConvLSTM, SA-HAR) on 5-way gesture recognition and 4-way user identification. TinyHAR achieves strong performance in this pilot benchmark (Macro F1 = 0.960 for gesture recognition; Macro F1 = 0.997 for user identification) while using only 46k parameters. A modality analysis further indicates that head pose dynamics are highly informative for mobile gaze gestures, highlighting embodied head--eye coordination as a key design consideration. Although the small sample size and controlled setting limit generalizability, these results indicate a potential direction for further investigation into on-device gaze gesture recognition.

HCFeb 14, 2025
Quantifying the Impact of Motion on 2D Gaze Estimation in Real-World Mobile Interactions

Yaxiong Lei, Yuheng Wang, Fergus Buchanan et al.

Mobile gaze tracking involves inferring a user's gaze point or direction on a mobile device's screen from facial images captured by the device's front camera. While this technology inspires an increasing number of gaze-interaction applications, achieving consistent accuracy remains challenging due to dynamic user-device spatial relationships and varied motion conditions inherent in mobile contexts. This paper provides empirical evidence on how user mobility and behaviour affect mobile gaze tracking accuracy. We conduct two user studies collecting behaviour and gaze data under various motion conditions - from lying to maze navigation - and during different interaction tasks. Quantitative analysis has revealed behavioural regularities among daily tasks and identified head distance, head pose, and device orientation as key factors affecting accuracy, with errors increasing by up to 48.91% in dynamic conditions compared to static ones. These findings highlight the need for more robust, adaptive eye-tracking systems that account for head movements and device deflection to maintain accuracy across diverse mobile contexts.

HCMay 28, 2025
MAC-Gaze: Motion-Aware Continual Calibration for Mobile Gaze Tracking

Yaxiong Lei, Mingyue Zhao, Yuheng Wang et al.

Mobile gaze tracking faces a fundamental challenge: maintaining accuracy as users naturally change their postures and device orientations. Traditional calibration approaches, like one-off, fail to adapt to these dynamic conditions, leading to degraded performance over time. We present MAC-Gaze, a Motion-Aware continual Calibration approach that leverages smartphone Inertial measurement unit (IMU) sensors and continual learning techniques to automatically detect changes in user motion states and update the gaze tracking model accordingly. Our system integrates a pre-trained visual gaze estimator and an IMU-based activity recognition model with a clustering-based hybrid decision-making mechanism that triggers recalibration when motion patterns deviate significantly from previously encountered states. To enable accumulative learning of new motion conditions while mitigating catastrophic forgetting, we employ replay-based continual learning, allowing the model to maintain performance across previously encountered motion conditions. We evaluate our system through extensive experiments on the publicly available RGBDGaze dataset and our own 10-hour multimodal MotionGaze dataset (481K+ images, 800K+ IMU readings), encompassing a wide range of postures under various motion conditions including sitting, standing, lying, and walking. Results demonstrate that our method reduces gaze estimation error by 19.9% on RGBDGaze (from 1.73 cm to 1.41 cm) and by 31.7% on MotionGaze (from 2.81 cm to 1.92 cm) compared to traditional calibration approaches. Our framework provides a robust solution for maintaining gaze estimation accuracy in mobile scenarios.

HCOct 24, 2021
The privacy protection effectiveness of the video conference platforms' virtual background and the privacy concerns from the end-users

Shijing He, Yaxiong Lei

Due to the abrupt arise of pandemic worldwide, the video conferencing platforms are becoming ubiquitously available and being embedded into either various digital devices or the collaborative daily work. Even though the service provider has designed many security functions to protect individual's privacy, such as virtual background (VB), it still remains to be explored that how the instability of VB leaks users' privacy or impacts their mentality and behaviours. In order to understand and locate implications for the contextual of the end-users' privacy awareness and its mental model, we will conduct survey and interviews for users as the first stage research. We will raise conceptual challenges in terms of the designing safety and stable VB, as well as provide design suggestions.