Zhongpu Chen

CL
h-index8
3papers
23citations
Novelty50%
AI Score44

3 Papers

10.3DBMay 13
LiLIS: A Lightweight Distributed Learned Index Framework for Spatial Decision Analysis

Zhongpu Chen, Yikai Dong, Wanjun Hao

Spatial query and analysis results are often directly applied to decision-making processes such as facility location, proximity resource discovery, accessibility analysis, and risk assessment. Therefore, the efficiency of underlying spatial data access directly impacts the response speed of spatial decision analysis. Existing distributed spatial analysis systems (e.g., Simba, Sedona) already have relatively mature execution frameworks. However, they incur substantial overhead in local index construction and query refinement, especially in read-intensive scenarios. Recent studies have shown that learned indices exhibit considerable retrieval potential in single-machine settings, yet how to integrate them into distributed spatial analysis systems with low modification costs remains unaddressed. In this article, we present LiLIS, a Lightweight distributed Learned Index prototype for Spatial decision analysis. Without modifying existing execution engines, LiLIS integrates machine-learned search strategies with spatial-aware partitioning in a distributed framework, and efficiently supports common spatial queries such as point queries, range queries, $k$-nearest neighbor ($k$NN) queries, and spatial joins. Extensive experiments on both real-world and synthetic datasets demonstrate that LiLIS achieves lower latency across various query types and reduces index construction overhead compared with baseline approaches. These results indicate its potential for improving the responsiveness of read-intensive spatial decision-support workflows.

CLJan 25, 2025Code
MDEval: Evaluating and Enhancing Markdown Awareness in Large Language Models

Zhongpu Chen, Yinfeng Liu, Long Shi et al.

Large language models (LLMs) are expected to offer structured Markdown responses for the sake of readability in web chatbots (e.g., ChatGPT). Although there are a myriad of metrics to evaluate LLMs, they fail to evaluate the readability from the view of output content structure. To this end, we focus on an overlooked yet important metric -- Markdown Awareness, which directly impacts the readability and structure of the content generated by these language models. In this paper, we introduce MDEval, a comprehensive benchmark to assess Markdown Awareness for LLMs, by constructing a dataset with 20K instances covering 10 subjects in English and Chinese. Unlike traditional model-based evaluations, MDEval provides excellent interpretability by combining model-based generation tasks and statistical methods. Our results demonstrate that MDEval achieves a Spearman correlation of 0.791 and an accuracy of 84.1% with human, outperforming existing methods by a large margin. Extensive experimental results also show that through fine-tuning over our proposed dataset, less performant open-source models are able to achieve comparable performance to GPT-4o in terms of Markdown Awareness. To ensure reproducibility and transparency, MDEval is open sourced at https://github.com/SWUFE-DB-Group/MDEval-Benchmark.

LGFeb 3, 2024
Nonlinear subspace clustering by functional link neural networks

Long Shi, Lei Cao, Zhongpu Chen et al.

Nonlinear subspace clustering based on a feed-forward neural network has been demonstrated to provide better clustering accuracy than some advanced subspace clustering algorithms. While this approach demonstrates impressive outcomes, it involves a balance between effectiveness and computational cost. In this study, we employ a functional link neural network to transform data samples into a nonlinear domain. Subsequently, we acquire a self-representation matrix through a learning mechanism that builds upon the mapped samples. As the functional link neural network is a single-layer neural network, our proposed method achieves high computational efficiency while ensuring desirable clustering performance. By incorporating the local similarity regularization to enhance the grouping effect, our proposed method further improves the quality of the clustering results. Additionally, we introduce a convex combination subspace clustering scheme, which combining a linear subspace clustering method with the functional link neural network subspace clustering approach. This combination approach allows for a dynamic balance between linear and nonlinear representations. Extensive experiments confirm the advancement of our methods. The source code will be released on https://lshi91.github.io/ soon.