Robert Xiao

HC
h-index: 6
8 papers
5 citations
Novelty: 37%
AI Score: 47

8 Papers

CR Jun 2
Generative AI-Enabled Refund Fraud in Chinese E-Commerce: Investigation on Merchants and Platform Workers

Shuning Zhang, Eve He, Xiao Zhan et al.

E-commerce dispute resolution typically relies on the security assumption that digital evidence truthfully reflects physical reality. Generative AI (GenAI) invalidates this threat model, enabling attackers to fabricate hyper-realistic evidence of product defects at negligible cost. Through semi-structured interviews with merchants (N=17) and platform workers (N=13) in the Chinese e-commerce market, we characterize this shift toward GenAI-enabled scalable fabrication. We outline a taxonomy of four GenAI-enabled threat vectors across the transaction, dispute, logistics and communication phases, highlighting how attackers exploit GenAI to synthesize physically plausible product defects at scale. To mitigate these threats, platforms and merchants are adapting verification strategies, relying on AI tools for automated screening and adversarial interrogation (e.g., requesting multi-angle videos) to increase attack complexity. However, we find several challenges that hinder the adoption of these defenses, including implementation hurdles like structural platform constraints and fundamental limitations regarding the technical sophistication of GenAI. We conclude by outlining design implications for privacy-preserving cross-platform fraud databases, and traceability mechanisms such as embedding verifiable material anchors into the product.

HC Jun 2
Investigating Novice Researchers' Perceptions of Research Privacy Within LLM-Assisted Workflows

Shuning Zhang, Changxi Wen, Eve He et al.

Large Language Model (LLM)-assisted scholarly workflows introduce critical privacy and intellectual property risks. As a uniquely vulnerable cohort driven by publication pressure and a lack of institutional support, novice researchers rely heavily on public LLMs, compelling them to navigate high-stakes privacy-publication trade-offs. To investigate these concerns, we conducted semi-structured interviews with 44 researchers across diverse disciplines. Our findings reveal that the fear of idea leakage paradoxically accelerates, rather than deters, reliance on LLMs, as researchers utilize them to expedite publication. They also held misconceptions that their ideas lacked the unique value to attract targeted attacks, and that their inputs would be safely diluted within massive datasets, preventing reconstruction. From interviews, we identified five types of mitigations including input fragmentation and adversarial probing, though we found that participants largely perceived these measures as ineffective. We outline implications including implementing institution-level sandboxed isolation, scenario-based privacy pedagogy, and verifiable data-deletion audits for transparency.

HC May 23
"It Felt a Bit Eerie": Exploring Humanlike Interactions During Collaborative Writing with an Artificial Agent

Michael Yin, Angela Chiang, Samuel Rhys Cox et al.

While human-AI collaboration systems have increasingly been built to increase efficiency or support creativity, little work has examined how the design of interactions shapes the social connection between human and artificial agent. We examine how the temporal and visual dimensions of collaboration shape the experience of a writing task. Specifically, we built three variants of an AI-assisted text editor along a spectrum of simulated humanlike interaction (synchronous and with a cursor) to machinelike interaction (asynchronous and without a cursor), and conducted a comparative user study (n=48). Our exploratory findings suggest that synchronous suggestions increased efficiency but led to contextual misalignment, while a visual cursor increased intent understanding but evoked feelings of surveillance. Taken together, humanlike design of artificial agents can create positive social expectations but also elicit social costs, especially without the alignment present in human-human collaboration. We extend our findings into design implications and ethical considerations when building human-AI collaboration systems.

HC May 4
Exploring Instant Photography using Generative AI: A Design Probe with the UnReality Camera

Michael Yin, Angela Chiang, Robert Xiao

Generative AI has increasingly been used for artistic creation, but little work has explored how it shapes the experiential meaning of practice. We consider how generative AI might transform the embodied and tangible process of instant photography through the UnReality Camera, an AI-mediated instant camera. The UnReality Camera prints a photo of the environment augmented by a user's spoken words as generative input. In a design probe, we explored how generative AI shapes people's perceptions of both photographic output and the broader photographic process. Although users valued artistic control, they also appreciated the creativity afforded by stochastic unpredictability. The waiting period for an unpredictable output elicited anticipatory suspense, and the camera's physical form evoked ownership and connection despite artificial generation. We discuss how people make sense of instant photography's experiential qualities when generative AI is embedded, and how their opposing affordances reshape interpretations of each other's experiential meaning.

HC May 4
"I Don't Have Faith in the Developers to Use My Feedback": Understanding Player Values and Expectancy for Reporting Systems in Video Games

Michael Yin, Chenxinran Shen et al.

Reporting systems in multiplayer video games allow players to express their dissatisfaction with others and combat in-game toxicity. In this work, we examined the act of reporting through the lens of expectancy-value theory. Using a distributed survey (n = 98) and follow-up interviews (n = 19), we explored the value players place on reporting, their desired outcomes, and their expectations that these outcomes will be achieved. Our findings revealed that reporting is motivated by both altruistic and retributive factors, with players seeking short-term revenge while also looking to foster an improved long-term community. Yet, players felt that reporting may not always meet these goals, with belief in the system being mediated by factors such as developer reputation, reporting transparency, and alignment with the community. By understanding the value and expectancy of reporting systems, we discuss their implications on broader digital moderation and consider current and potential future designs of reporting systems.

CV May 1, 2024
Streamlining Image Editing with Layered Diffusion Brushes

Peyman Gholami, Robert Xiao

Denoising diffusion models have emerged as powerful tools for image manipulation, yet interactive, localized editing workflows remain underdeveloped. We introduce Layered Diffusion Brushes (LDB), a novel training-free framework that enables interactive, layer-based editing using standard diffusion models. LDB defines each "layer" as a self-contained set of parameters guiding the generative process, enabling independent, non-destructive, and fine-grained prompt-guided edits, even in overlapping regions. LDB leverages a unique intermediate latent caching approach to reduce each edit to only a few denoising steps, achieving 140 ms per edit on consumer GPUs. An editor implementing LDB, incorporating familiar layer concepts, was evaluated via user study and quantitative metrics. Results demonstrate LDB's superior speed alongside comparable or improved image quality, background preservation, and edit fidelity relative to state-of-the-art methods across various sequential image manipulation tasks. The findings highlight LDB's ability to significantly enhance creative workflows by providing an intuitive and efficient approach to diffusion-based image editing and its potential for expansion into related subdomains, such as video editing.

CV May 31, 2023
Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images

Peyman Gholami, Robert Xiao

Text-to-image generative models have made remarkable advancements in generating high-quality images. However, generated images often contain undesirable artifacts or other errors due to model limitations. Existing techniques to fine-tune generated images are time-consuming (manual editing), produce poorly-integrated results (inpainting), or result in unexpected changes across the entire image (variation selection and prompt fine-tuning). In this work, we present Diffusion Brush, a Latent Diffusion Model-based (LDM) tool to efficiently fine-tune desired regions within an AI-synthesized image. Our method introduces new random noise patterns at targeted regions during the reverse diffusion process, enabling the model to efficiently make changes to the specified regions while preserving the original context for the rest of the image. We evaluate our method's usability and effectiveness through a user study with artists, comparing our technique against other state-of-the-art image inpainting techniques and editing software for fine-tuning AI-generated imagery.

CV May 24, 2023
AutoDepthNet: High Frame Rate Depth Map Reconstruction using Commodity Depth and RGB Cameras

Peyman Gholami, Robert Xiao

Depth cameras have found applications in diverse fields, such as computer vision, artificial intelligence, and video gaming. However, the high latency and low frame rate of existing commodity depth cameras impose limitations on their applications. We propose a fast and accurate depth map reconstruction technique to reduce latency and increase the frame rate in depth cameras. Our approach uses only a commodity depth camera and color camera in a hybrid camera setup; our prototype is implemented using a Kinect Azure depth camera at 30 fps and a high-speed RGB iPhone 11 Pro camera captured at 240 fps. The proposed network, AutoDepthNet, is an encoder-decoder model that captures frames from the high-speed RGB camera and combines them with previous depth frames to reconstruct a stream of high frame rate depth maps. On GPU, with a 480 x 270 output resolution, our system achieves an inference time of 8 ms, enabling real-time use at up to 200 fps with parallel processing. AutoDepthNet can estimate depth values with an average RMS error of 0.076, a 44.5% improvement compared to an optical flow-based comparison method. Our method can also improve depth map quality by estimating depth values for missing and invalidated pixels. The proposed method can be easily applied to existing depth cameras and facilitates the use of depth cameras in applications that require high-speed depth estimation. We also showcase the effectiveness of the framework in upsampling different sparse datasets, e.g., video object segmentation. As a demonstration of our method, we integrated our framework into existing body tracking systems and demonstrated the robustness of the proposed method in such applications.