Haojia Zhu

AI
h-index2
3papers
10citations
Novelty58%
AI Score38

3 Papers

AIAug 12, 2024
Urban Region Pre-training and Prompting: A Graph-based Approach

Jiahui Jin, Yifan Song, Dong Kan et al.

Urban region representation is crucial for various urban downstream tasks. However, despite the proliferation of methods and their success, acquiring general urban region knowledge and adapting to different tasks remains challenging. Existing work pays limited attention to the fine-grained functional layout semantics in urban regions, limiting their ability to capture transferable knowledge across regions. Further, inadequate handling of the unique features and relationships required for different downstream tasks may also hinder effective task adaptation. In this paper, we propose a $\textbf{G}$raph-based $\textbf{U}$rban $\textbf{R}$egion $\textbf{P}$re-training and $\textbf{P}$rompting framework ($\textbf{GURPP}$) for region representation learning. Specifically, we first construct an urban region graph and develop a subgraph-centric urban region pre-training model to capture the heterogeneous and transferable patterns of entity interactions. This model pre-trains knowledge-rich region embeddings using contrastive learning and multi-view learning methods. To further refine these representations, we design two graph-based prompting methods: a manually-defined prompt to incorporate explicit task knowledge and a task-learnable prompt to discover hidden knowledge, which enhances the adaptability of these embeddings to different tasks. Extensive experiments on various urban region prediction tasks and different cities demonstrate the superior performance of our framework.

AIFeb 1
Aggregation Queries over Unstructured Text: Benchmark and Agentic Method

Haojia Zhu, Qinyuan Xu, Haoyu Li et al.

Aggregation query over free text is a long-standing yet underexplored problem. Unlike ordinary question answering, aggregate queries require exhaustive evidence collection and systems are required to "find all," not merely "find one." Existing paradigms such as Text-to-SQL and Retrieval-Augmented Generation fail to achieve this completeness. In this work, we formalize entity-level aggregation querying over text in a corpus-bounded setting with strict completeness requirement. To enable principled evaluation, we introduce AGGBench, a benchmark designed to evaluate completeness-oriented aggregation under realistic large-scale corpus. To accompany the benchmark, we propose DFA (Disambiguation--Filtering--Aggregation), a modular agentic baseline that decomposes aggregation querying into interpretable stages and exposes key failure modes related to ambiguity, filtering, and aggregation. Empirical results show that DFA consistently improves aggregation evidence coverage over strong RAG and agentic baselines. The data and code are available in https://anonymous.4open.science/r/DFA-A4C1.

AIMar 11, 2025
Boundary Prompting: Elastic Urban Region Representation via Graph-based Spatial Tokenization

Haojia Zhu, Jiahui Jin, Dong Kan et al.

Urban region representation is essential for various applications such as urban planning, resource allocation, and policy development. Traditional methods rely on fixed, predefined region boundaries, which fail to capture the dynamic and complex nature of real-world urban areas. In this paper, we propose the Boundary Prompting Urban Region Representation Framework (BPURF), a novel approach that allows for elastic urban region definitions. BPURF comprises two key components: (1) A spatial token dictionary, where urban entities are treated as tokens and integrated into a unified token graph, and (2) a region token set representation model which utilize token aggregation and a multi-channel model to embed token sets corresponding to region boundaries. Additionally, we propose fast token set extraction strategy to enable online token set extraction during training and prompting. This framework enables the definition of urban regions through boundary prompting, supporting varying region boundaries and adapting to different tasks. Extensive experiments demonstrate the effectiveness of BPURF in capturing the complex characteristics of urban regions.