Minxin Chen

NA
h-index7
9papers
50citations
Novelty51%
AI Score50

9 Papers

56.8CLJun 4
PlanBench-V: A Spatial Planning Map Benchmark for Vision-Language Models

Minxin Chen, He Zhu, Junyou Su et al.

Spatial planning maps are central to territorial governance, translating planning objectives, regulations, and spatial strategies into visual forms for decision-making, public communication, and institutional coordination. Their interpretation, however, requires fine-grained visual perception, spatial reasoning, and policy-informed professional judgment, creating major challenges for both human learners and AI systems. With the rapid progress of Vision-Language Models (VLMs), their use in urban planning analysis is gaining attention, yet existing multimodal benchmarks mainly target general visual understanding and overlook the domain-specific cognitive processes of planning practice. To address this gap, we introduce PlanBench-V, the first comprehensive benchmark for evaluating VLMs in spatial planning map interpretation. We first build the Spatial Planning Map Database (SPMD), an expert-annotated dataset of 223 planning maps and 1629 question-answer pairs curated by professional planners, covering diverse geographic regions and cartographic styles. We then propose a theory-informed evaluation framework assessing four progressive capabilities: Perception, Reasoning, Association, and Implementation, corresponding to the cognitive pipeline of planning map interpretation. Extensive experiments across two generations of VLMs show clear progress but persistent limitations. The best 2026 agentic reasoning model, Qwen3.6-Plus, substantially outperforms the best 2025 model, GPT-4o, by 27%. Nevertheless, all models still struggle with implementation-oriented tasks requiring evaluative judgment, policy sensitivity, and constraint-aware decision-making. These findings reveal fundamental limitations of current VLMs in professional planning contexts and highlight the need for domain-adaptive multimodal reasoning frameworks. Code and data are available at https://plangpt.github.io.

NANov 2, 2016
Efficient and Qualified Mesh Generation for Gaussian Molecular Surface Using Piecewise Trilinear Polynomial Approximation

Tiantian Liu, Minxin Chen, Benzhuo Lu

Recent developments for mathematical modeling and numerical simulation of biomolecular systems raise new demands for qualified, stable, and efficient surface meshing, especially in implicit-solvent modeling. In our former work, we have developed an algorithm for manifold triangular meshing for large Gaussian molecular surfaces, TMSmesh. In this paper, we present new algorithms to greatly improve the meshing efficiency and qualities, and implement into a new program version, TMSmesh 2.0. In TMSmesh 2.0, in the first step, a new adaptive partition and estimation algorithm is proposed to locate the cubes in which the surface are approximated by piecewise trilinear surface with controllable precision. Then, the piecewise trilinear surface is divided into single valued pieces by tracing along the fold curves, which ensures that the generated surface meshes are manifolds. Numerical test results show that TMSmesh 2.0 is capable of handling arbitrary sizes of molecules and achieves ten to hundreds of times speedup over the previous algorithm. The result surface meshes are manifolds and can be directly used in boundary element method (BEM) and finite element method (FEM) simulation. The binary version of TMSmesh 2.0 is downloadable at the web page http://lsec.cc.ac.cn/~lubz/Meshing.html.

NADec 12, 2016
Behavior of different numerical schemes for population genetic drift problems

Minxin Chen, Chun Liu, Shixin Xu et al.

In this paper, we focus on numerical methods for the genetic drift problems, which is governed by a degenerated convection-dominated parabolic equation. Due to the degeneration and convection, Dirac singularities will always be developed at boundary points as time evolves. In order to find a \emph{complete solution} which should keep the conservation of total probability and expectation, three different schemes based on finite volume methods are used to solve the equation numerically: one is a upwind scheme, the other two are different central schemes. We observed that all the methods are stable and can keep the total probability, but have totally different long-time behaviors concerning with the conservation of expectation. We prove that any extra infinitesimal diffusion leads to a same artificial steady state. So upwind scheme does not work due to its intrinsic numerical viscosity. We find one of the central schemes introduces a numerical viscosity term too, which is beyond the common understanding in the convection-diffusion community. Careful analysis is presented to prove that the other central scheme does work. Our study shows that the numerical methods should be carefully chosen and any method with intrinsic numerical viscosity must be avoided.

GRMay 5, 2025Code
Approximating Signed Distance Fields of Implicit Surfaces with Sparse Ellipsoidal Radial Basis Function Networks

Bobo Lian, Dandan Wang, Chenjian Wu et al.

Accurate and compact representation of signed distance functions (SDFs) of implicit surfaces is crucial for efficient storage, computation, and downstream processing of 3D geometry. In this work, we propose a general learning method for approximating precomputed SDF fields of implicit surfaces by a relatively small number of ellipsoidal radial basis functions (ERBFs). The SDF values could be computed from various sources, including point clouds, triangle meshes, analytical expressions, pretrained neural networks, etc. Given SDF values on spatial grid points, our method approximates the SDF using as few ERBFs as possible, achieving a compact representation while preserving the geometric shape of the corresponding implicit surface. To balance sparsity and approximation precision, we introduce a dynamic multi-objective optimization strategy, which adaptively incorporates regularization to enforce sparsity and jointly optimizes the weights, centers, shapes, and orientations of the ERBFs. For computational efficiency, a nearest-neighbor-based data structure restricts computations to points near each kernel center, and CUDA-based parallelism further accelerates the optimization. Furthermore, a hierarchical refinement strategy based on SDF spatial grid points progressively incorporates coarse-to-fine samples for parameter initialization and optimization, improving convergence and training efficiency. Extensive experiments on multiple benchmark datasets demonstrate that our method can represent SDF fields with significantly fewer parameters than existing sparse implicit representation approaches, achieving better accuracy, robustness, and computational efficiency. The corresponding executable program is publicly available at https://github.com/lianbobo/SE-RBFNet.git

CLMay 20, 2025
PlanGPT-VL: Enhancing Urban Planning with Domain-Specific Vision-Language Models

He Zhu, Junyou Su, Minxin Chen et al.

In the field of urban planning, existing Vision-Language Models (VLMs) frequently fail to effectively analyze and evaluate planning maps, despite the critical importance of these visual elements for urban planners and related educational contexts. Planning maps, which visualize land use, infrastructure layouts, and functional zoning, require specialized understanding of spatial configurations, regulatory requirements, and multi-scale analysis. To address this challenge, we introduce PlanGPT-VL, the first domain-specific Vision-Language Model tailored specifically for urban planning maps. PlanGPT-VL employs three innovative approaches: (1) PlanAnno-V framework for high-quality VQA data synthesis, (2) Critical Point Thinking to reduce hallucinations through structured verification, and (3) comprehensive training methodology combining Supervised Fine-Tuning with frozen vision encoder parameters. Through systematic evaluation on our proposed PlanBench-V benchmark, we demonstrate that PlanGPT-VL significantly outperforms general-purpose state-of-the-art VLMs in specialized planning map interpretation tasks, offering urban planning professionals a reliable tool for map analysis, assessment, and educational applications while maintaining high factual accuracy. Our lightweight 7B parameter model achieves comparable performance to models exceeding 72B parameters, demonstrating efficient domain specialization without sacrificing performance.

LGNov 24, 2025
Empirical Comparison of Forgetting Mechanisms for UCB-based Algorithms on a Data-Driven Simulation Platform

Minxin Chen

Many real-world bandit problems involve non-stationary reward distributions, where the optimal decision may shift due to evolving environments. However, the performance of some typical Multi-Armed Bandit (MAB) models such as Upper Confidence Bound (UCB) algorithms degrades significantly in non-stationary environments where reward distributions change over time. To address this limitation, this paper introduces and evaluates FDSW-UCB, a novel dual-view algorithm that integrates a discount-based long-term perspective with a sliding-window-based short-term view. A data-driven semi-synthetic simulation platform, built upon the MovieLens-1M and Open Bandit datasets, is developed to test algorithm adaptability under abrupt and gradual drift scenarios. Experimental results demonstrate that a well-configured sliding-window mechanism (SW-UCB) is robust, while the widely used discounting method (D-UCB) suffers from a fundamental learning failure, leading to linear regret. Crucially, the proposed FDSW-UCB, when employing an optimistic aggregation strategy, achieves superior performance in dynamic settings, highlighting that the ensemble strategy itself is a decisive factor for success.

CVNov 24, 2025
MambaRefine-YOLO: A Dual-Modality Small Object Detector for UAV Imagery

Shuyu Cao, Minxin Chen, Yucheng Song et al.

Small object detection in Unmanned Aerial Vehicle (UAV) imagery is a persistent challenge, hindered by low resolution and background clutter. While fusing RGB and infrared (IR) data offers a promising solution, existing methods often struggle with the trade-off between effective cross-modal interaction and computational efficiency. In this letter, we introduce MambaRefine-YOLO. Its core contributions are a Dual-Gated Complementary Mamba fusion module (DGC-MFM) that adaptively balances RGB and IR modalities through illumination-aware and difference-aware gating mechanisms, and a Hierarchical Feature Aggregation Neck (HFAN) that uses a ``refine-then-fuse'' strategy to enhance multi-scale features. Our comprehensive experiments validate this dual-pronged approach. On the dual-modality DroneVehicle dataset, the full model achieves a state-of-the-art mAP of 83.2%, an improvement of 7.9% over the baseline. On the single-modality VisDrone dataset, a variant using only the HFAN also shows significant gains, demonstrating its general applicability. Our work presents a superior balance between accuracy and speed, making it highly suitable for real-world UAV applications.

NASep 1, 2023
Solving multiscale elliptic problems by sparse radial basis function neural networks

Zhiwen Wang, Minxin Chen, Jingrun Chen

Machine learning has been successfully applied to various fields of scientific computing in recent years. In this work, we propose a sparse radial basis function neural network method to solve elliptic partial differential equations (PDEs) with multiscale coefficients. Inspired by the deep mixed residual method, we rewrite the second-order problem into a first-order system and employ multiple radial basis function neural networks (RBFNNs) to approximate unknown functions in the system. To aviod the overfitting due to the simplicity of RBFNN, an additional regularization is introduced in the loss function. Thus the loss function contains two parts: the $L_2$ loss for the residual of the first-order system and boundary conditions, and the $\ell_1$ regularization term for the weights of radial basis functions (RBFs). An algorithm for optimizing the specific loss function is introduced to accelerate the training process. The accuracy and effectiveness of the proposed method are demonstrated through a collection of multiscale problems with scale separation, discontinuity and multiple scales from one to three dimensions. Notably, the $\ell_1$ regularization can achieve the goal of representing the solution by fewer RBFs. As a consequence, the total number of RBFs scales like $\mathcal{O}(\varepsilon^{-nτ})$, where $\varepsilon$ is the smallest scale, $n$ is the dimensionality, and $τ$ is typically smaller than $1$. It is worth mentioning that the proposed method not only has the numerical convergence and thus provides a reliable numerical solution in three dimensions when a classical method is typically not affordable, but also outperforms most other available machine learning methods in terms of accuracy and robustness.

IVDec 12, 2018
Image Segmentation Based on Multiscale Fast Spectral Clustering

Chongyang Zhang, Guofeng Zhu, Minxin Chen et al.

In recent years, spectral clustering has become one of the most popular clustering algorithms for image segmentation. However, it has restricted applicability to large-scale images due to its high computational complexity. In this paper, we first propose a novel algorithm called Fast Spectral Clustering based on quad-tree decomposition. The algorithm focuses on the spectral clustering at superpixel level and its computational complexity is O(nlogn) + O(m) + O(m^(3/2)); its memory cost is O(m), where n and m are the numbers of pixels and the superpixels of a image. Then we propose Multiscale Fast Spectral Clustering by improving Fast Spectral Clustering, which is based on the hierarchical structure of the quad-tree. The computational complexity of Multiscale Fast Spectral Clustering is O(nlogn) and its memory cost is O(m). Extensive experiments on real large-scale images demonstrate that Multiscale Fast Spectral Clustering outperforms Normalized cut in terms of lower computational complexity and memory cost, with comparable clustering accuracy.