Chenhao Zhu

CL
h-index: 15
4 papers
87 citations
Novelty: 53%
AI Score: 40

4 Papers

RO · Mar 24, 2023
Interpretable Motion Planner for Urban Driving via Hierarchical Imitation Learning

Bikun Wang, Zhipeng Wang, Chenhao Zhu et al.

Learning-based approaches have achieved remarkable performance in the domain of autonomous driving. Leveraging the representational power of neural networks and large amounts of human driving data, complex patterns and rules of driving behavior can be encoded in a model to benefit the autonomous driving system. In addition, a growing number of data-driven approaches have been studied for the decision-making and motion planning modules. However, the reliability and stability of neural networks remain uncertain. In this paper, we introduce a hierarchical planning architecture, consisting of a high-level grid-based behavior planner and a low-level trajectory planner, that is highly interpretable and controllable. While the high-level planner is responsible for finding a consistent route, the low-level planner generates a feasible trajectory. We evaluate our method both in closed-loop simulation and in real-world driving, and demonstrate that the neural network planner has outstanding performance in complex urban autonomous driving scenarios.

CL · Oct 18, 2024
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Jiahao Qiu, Yifu Lu, Yifan Zeng et al.

Inference-time alignment enhances the performance of large language models without requiring additional training or fine-tuning, but it presents challenges in balancing computational efficiency with high-quality output. Best-of-N (BoN) sampling, a simple yet powerful approach, generates multiple responses and selects the best one, achieving improved performance but at a high computational cost. We propose TreeBoN, a novel framework that integrates a speculative tree-search strategy into BoN sampling. TreeBoN maintains a set of parent nodes, iteratively branching and pruning low-quality responses, thereby reducing computational overhead while maintaining high output quality. Our approach also leverages token-level rewards from Direct Preference Optimization (DPO) to guide tree expansion and prune low-quality paths. We evaluate TreeBoN using the AlpacaFarm, HH-RLHF, UltraFeedback, GSM8K, and TutorEval datasets, demonstrating consistent improvements. Specifically, TreeBoN achieves its highest win rate of 65% on TutorEval and win rates of around 60% on the other datasets, outperforming standard BoN at the same computational cost and showcasing its scalability and alignment efficacy.
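
The branch-and-prune idea described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the real method samples from a language model and scores partial responses with DPO-derived token-level rewards, whereas everything here (`VOCAB`, `make_extend`, `toy_reward`, `tree_bon`, and the `n`, `keep`, `layers` parameters) is a hypothetical stand-in chosen only to show the control flow of keeping a small set of parent nodes, expanding them, and pruning low-scoring candidates.

```python
import random
from typing import Callable, List

# Hypothetical toy vocabulary standing in for a language model's output space.
VOCAB = ["good", "bad", "okay"]


def make_extend(seed: int) -> Callable[[str], str]:
    """Return a toy 'generator' that appends one random word to a partial response."""
    rng = random.Random(seed)

    def extend(partial: str) -> str:
        return partial + " " + rng.choice(VOCAB)

    return extend


def toy_reward(text: str) -> int:
    """Toy reward (an assumption, not a DPO reward): +1 per 'good', -1 per 'bad'."""
    words = text.split()
    return words.count("good") - words.count("bad")


def tree_bon(
    prompt: str,
    extend: Callable[[str], str],
    reward: Callable[[str], int],
    n: int = 8,
    keep: int = 2,
    layers: int = 3,
) -> str:
    """Expand roughly n candidates per layer, keep the top `keep` as parents, repeat."""
    if layers < 1 or keep < 1 or n < keep:
        raise ValueError("need layers >= 1, keep >= 1, and n >= keep")
    parents: List[str] = [prompt]
    for _ in range(layers):
        # Split the per-layer budget of n expansions evenly across current parents.
        per_parent = n // len(parents)
        candidates = [extend(p) for p in parents for _ in range(per_parent)]
        # Prune: only the highest-reward candidates survive as the next parents.
        candidates.sort(key=reward, reverse=True)
        parents = candidates[:keep]
    # After the final layer, the best surviving candidate is the response.
    return parents[0]


if __name__ == "__main__":
    print(tree_bon("Answer:", make_extend(0), toy_reward))
```

In this sketch, plain Best-of-N corresponds to `layers=1` with `keep=1`: all `n` candidates are generated in one pass and the single best one is returned.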

AI · Oct 21, 2025
StarBench: A Turn-Based RPG Benchmark for Agentic Multimodal Decision-Making and Information Seeking

Haoran Zhang, Chenhao Zhu, Sicong Guo et al.

Human players do more than press buttons: they ground what they see on screen into precise keyboard-mouse actions and, when stuck, they seek information before trying again. We ask whether current vision-language models (VLMs) can do the same. Despite encouraging results under simplified control or tool scaffolds, human-like play in a real client - mapping raw screenshots to temporally coherent low-level actions while deciding when to ask for guidance - remains an open challenge. We introduce StarBench, a turn-based RPG benchmark derived from Honkai: Star Rail that targets these two human-like competencies: multimodal decision-making from pixels to actions and agentic information seeking. StarBench standardizes evaluation across eight combat tasks and two regimes with shared tasks and metrics: (i) direct control, where agents receive only screenshots and must emit low-level primitives (click and keypress) with no semantic hints; and (ii) tool-assisted control, where higher-level intents can be mapped to primitives by detectors and OCR outputs provide optional textualized observations to ease UI grounding. To mirror human practice, StarBench also includes an ask-or-act diagnostic that measures whether and when agents choose to request brief guidance before proceeding, and how that choice affects subsequent performance. We report reference baselines for contemporary VLMs and a human reference. Results expose sizable gaps in perception-to-control fidelity in the direct regime, while showing that judicious information seeking correlates with improved success, establishing StarBench as a reproducible yardstick for agentic information seeking and multimodal decision-making in real-client play.

CL · Oct 16, 2015
A Graph Traversal Based Approach to Answer Non-Aggregation Questions Over DBpedia

Chenhao Zhu, Kan Ren, Xuan Liu et al.

We present a question answering system over DBpedia, filling the gap between user information needs expressed in natural language and a structured query interface expressed in SPARQL over the underlying knowledge base (KB). Given the KB, our goal is to comprehend a natural language query and provide accurate corresponding answers. Focusing on non-aggregation questions, we construct a subgraph of the knowledge base from the detected entities and propose a graph traversal method that jointly solves the semantic item mapping problem and the disambiguation problem. Compared with existing work, we simplify the process of understanding query intention and pay more attention to answer path ranking. We evaluate our method on a non-aggregation question dataset and then on a complete dataset. Experimental results show that our method achieves the best performance compared with several state-of-the-art systems.