Haoze Guo

4 papers

2 citations

Novelty 45%

AI Score 45

Ranked #69,190 of 201,326 authors (top 34%); #466 in HC (top 16%)

4 Papers

18.0 HC May 6

Temporal Drift in Privacy Recall: Users Misremember From Verbatim Loss to Gist-Based Overexposure

Haoze Guo, Ziqi Wei

As social media content moves across platforms and occasionally resurfaces long after it was posted, users are increasingly prone to unintended disclosure because they misremember the privacy settings they originally chose. Prior research has considered context collapse and interface cues, yet we know less about how the passage of time alters users' recall of the audiences they originally selected. Likewise, the design space for mitigating this temporal exposure risk remains underexplored. Our work theorizes temporal drift in privacy recall as a process in which verbatim memory of prior settings decays and is replaced by gist-based heuristics, which more often than not select an audience larger than the original one. Grounded in memory research, contextual integrity, and usable privacy, we examine why such drift occurs, why it tends to bias toward broader sharing, and how it compounds with repeated exposure. We then propose provenance-forward interface schemes and a risk-based evaluation framework, with the aim of turning recall into recognition. The contribution of our work lies in establishing temporal awareness in privacy design as an essential safeguard against inadvertent overexposure.

15.5 HC May 6

From OCR to Analysis: Tracking Correction Provenance in Digital Humanities Pipelines

Haoze Guo, Ziqi Wei

Optical Character Recognition (OCR) is a critical but error-prone stage in digital humanities text pipelines. While OCR correction improves usability for downstream NLP tasks, common workflows often overwrite intermediate decisions, obscuring how textual transformations affect scholarly interpretation. We present a provenance-aware framework for OCR-corrected humanities corpora that records correction lineage at the span level, including edit type, correction source, confidence, and revision status. Using a pilot corpus of historical texts, we compare downstream named entity extraction across raw OCR, fully corrected text, and provenance-filtered corrections. Our results show that correction pathways can substantially alter extracted entities and document-level interpretations, while provenance signals help identify unstable outputs and prioritize human review. We argue that provenance should be treated as a first-class analytical layer in NLP for digital humanities, supporting reproducibility, source criticism, and uncertainty-aware interpretation.

30.6 HC Apr 18

The Privacy Placebo: Diagnosing Consent Burden through Performative Scrolling

Haoze Guo, Ziqi Wei

While consent banners and privacy policies invite users to read and choose, many choices are shaped by repeated, low-yield interaction routines rather than deliberation. This paper studies performative scrolling: slow, low-information interaction that can signal attention to consent without substantially improving understanding. We present the Performative Scrolling Index (PSI), a reproducible interface-audit metric for measuring pre-choice burden before a meaningful non-accepting alternative becomes visible and actionable. PSI decomposes burden into four observable components: distance, time, focus loops, and hidden reveals. In this paper, PSI is the primary burden metric, while companion signals such as AAI, CSI, and divergence are used as secondary interpretive audit aids rather than standalone validated scales. We also provide a least-effort audit protocol, design-side invariants, a worked example, and a medium-scale live deployment across desktop and mobile conditions under pointer and keyboard traversal policies. Together, these analyses show how structural choices such as offscreen alternatives, fragmented disclosure, and staged modal flows can increase pre-choice friction without improving meaningful control. PSI is not a measure of comprehension or legal sufficiency; rather, it is a diagnostic of interface-side burden intended to support reproducible audits and redesigns.

15.1 HC Mar 19

ConsentDiff at Scale: Longitudinal Audits of Web Privacy Policy Changes and UI Frictions

Haoze Guo

Web privacy is experienced through two public artifacts: what sites state in their policy texts, and the actions users are required to take in consent interfaces. Despite extensive cross-sectional audits, there is little longitudinal data on how these artifacts change together and whether interfaces actually do what the policies promise. ConsentDiff provides that longitudinal view. We build a reproducible pipeline that snapshots sites every month, semantically aligns policy clauses to track clause-level churn, and classifies consent-UI patterns by combining DOM signals with screenshot cues. We introduce a novel weighted claim-UI alignment score that connects common policy claims to observable predicates and enables comparisons across time, regions, and verticals. Our measurements suggest continued policy churn, systematic shifts away from higher-friction banner designs, and significantly higher alignment where the reject option is visible and lower-friction.