Xiaofu Jin

HC
h-index4
4papers
19citations
Novelty56%
AI Score45

4 Papers

HCMar 6
Hierarchical Resource Rationality Explains Human Reading Behavior

Yunpeng Bai, Xiaofu Jin, Shengdong Zhao et al.

Reading is a pervasive and cognitively demanding activity that underpins modern human culture. It is a prime instance of a class of tasks where eye movements are coordinated for the purpose of comprehension. Existing theories explain either eye movements or comprehension during reading, but the critical link between the two remains unclear. Here, we propose resource-rational optimization as a unifying principle governing adaptive reading behavior. Eye movements are selected to maximize expected comprehension while minimizing cognitive and temporal costs, organized hierarchically across nested time scales: fixation decisions support word recognition; sentence-level integration guides skipping and regression; and text-level comprehension goals shape memory construction and rereading. A computational implementation successfully replicates an unprecedented range of findings in human reading, from lexical effects to comprehension outcomes. Together, these results suggest that resource rationality provides a general mechanism for coordinating perception, memory, and action in knowledge-intensive human behaviors, offering a principled account of how complex cognitive skills adapt to limited resources.

HCFeb 4
Adaptive Prompt Elicitation for Text-to-Image Generation

Xinyi Wen, Lena Hegemann, Xiaofu Jin et al.

Aligning text-to-image generation with user intent remains challenging, for users who provide ambiguous inputs and struggle with model idiosyncrasies. We propose Adaptive Prompt Elicitation (APE), a technique that adaptively asks visual queries to help users refine prompts without extensive writing. Our technical contribution is a formulation of interactive intent inference under an information-theoretic framework. APE represents latent intent as interpretable feature requirements using language model priors, adaptively generates visual queries, and compiles elicited requirements into effective prompts. Evaluation on IDEA-Bench and DesignBench shows that APE achieves stronger alignment with improved efficiency. A user study with challenging user-defined tasks demonstrates 19.8% higher alignment without workload overhead. Our work contributes a principled approach to prompting that, for general users, offers an effective and efficient complement to the prevailing prompt-based interaction paradigm with text-to-image models.

61.0HCMar 12
Modeling Trial-and-Error Navigation With a Sequential Decision Model of Information Scent

Xiaofu Jin, Yunpeng Bai, Antti Oulasvirta

Users often struggle to locate an item within an information architecture, particularly when links are ambiguous or deeply nested in hierarchies. Information scent has been used to explain why users select incorrect links, but this concept assumes that users see all available links before deciding. In practice, users frequently select a link too quickly, overlook relevant cues, and then rely on backtracking when errors occur. We extend the concept of information scent by framing navigation as a sequential decision-making problem under memory constraints. Specifically, we assume that users do not scan entire pages but instead inspect strategically, looking "just enough" to find the target given their time budget. To choose which item to inspect next, they consider both local (this page) and global (site) scent; however, both are constrained by memory. Trying to avoid wasting time, they occasionally choose the wrong links without inspecting everything on a page. Comparisons with empirical data show that our model replicates key navigation behaviors: premature selections, wrong turns, and recovery from backtracking. We conclude that trial-and-error behavior is well explained by information scent when accounting for the sequential and bounded characteristics of the navigation problem.

CVJul 2, 2019
An End-to-End Neural Network for Image Cropping by Learning Composition from Aesthetic Photos

Peng Lu, Hao Zhang, Xujun Peng et al.

As one of the fundamental techniques for image editing, image cropping discards unrelevant contents and remains the pleasing portions of the image to enhance the overall composition and achieve better visual/aesthetic perception. In this paper, we primarily focus on improving the accuracy of automatic image cropping, and on further exploring its potential in public datasets with high efficiency. From this respect, we propose a deep learning based framework to learn the objects composition from photos with high aesthetic qualities, where an anchor region is detected through a convolutional neural network (CNN) with the Gaussian kernel to maintain the interested objects' integrity. This initial detected anchor area is then fed into a light weighted regression network to obtain the final cropping result. Unlike the conventional methods that multiple candidates are proposed and evaluated iteratively, only a single anchor region is produced in our model, which is mapped to the final output directly. Thus, low computational resources are required for the proposed approach. The experimental results on the public datasets show that both cropping accuracy and efficiency achieve the state-ofthe-art performances.