Lei Dong

SOC-PH
5 papers
23 citations
Novelty 46%
AI Score 48

5 Papers

86.1 SOC-PH Jun 1
Global evidence for a consistent spatial footprint of intra-urban centers

Shuai Pang, Junlong Zhang, Yu Liu et al.

Urban space is highly heterogeneous, with economic and population activities concentrating in localized centers. However, the global organization of such intra-urban centers remains poorly understood due to the lack of consistent, comparable data. Here we develop a scalable geospatial framework using nighttime light observations to identify over 15,000 intra-urban centers worldwide. We uncover a robust regularity: despite differences in city size, geography, and development context, total urban area scales linearly with the number of centers, implying a roughly constant spatial footprint per center. This macroscopic regularity is underpinned by two independent sublinear scaling laws -- center number and urban area both scale with population at closely matched rates -- whose ratio cancels to produce the observed linear relationship. At the within-city level, this constancy manifests as a characteristic Voronoi coverage area per center that is consistent across regions, and centers are more regularly spaced than spatial null models predict. As a consequence, polycentric cities maintain stable accessibility as they expand. These findings provide a new empirical foundation for understanding the spatial organization of urban growth.
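The cancellation described in this abstract can be written out in one step. The symbols below ($P$ for population, $N$ for the number of centers, $A$ for total urban area, and the exponents $\beta_N$, $\beta_A$) are notation assumed here for illustration, not taken from the paper, and no exponent values are implied.

```latex
% Two sublinear scaling laws with population P (assumed notation):
N \propto P^{\beta_N}, \qquad A \propto P^{\beta_A}, \qquad 0 < \beta_N,\ \beta_A < 1 .

% Eliminating P between the two relations:
A \propto N^{\beta_A / \beta_N} .

% If the exponents are closely matched, \beta_A \approx \beta_N, the ratio is close to 1:
A \propto N \quad\Longrightarrow\quad \frac{A}{N} \approx \text{const},
% i.e. a roughly constant spatial footprint per center.
```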

89.4 CO Jun 3
Generating 2-Gray codes for grand Motzkin paths and grand Dyck paths with air pockets in constant amortized time

Lei Dong, Bowie Liu, Dennis Wong et al.

A grand Motzkin path with air pockets is a non-empty lattice path in the first and fourth quadrants of $\mathbb{Z}^2$, starting at the origin $(0,0)$, ending on the $x$-axis, and consisting of up-steps $(1, 1)$, horizontal steps $(1, 0)$, and down-steps $(1, -k)$ where $k \geq 1$, with no consecutive down-steps. A grand Dyck path with air pockets is a grand Motzkin path with air pockets that uses no horizontal steps. We present the first known 2-Gray codes for grand Motzkin paths with air pockets. Setting the number of horizontal steps to zero in our algorithm yields the first known 2-Gray codes for grand Dyck paths with air pockets. Our three-stage algorithm generates each path in constant amortized time per string, using $O(n^2)$ memory. We also provide enumeration formulae for grand Motzkin paths and grand Dyck paths with air pockets.
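To make the definition concrete, here is a minimal brute-force sketch that checks and enumerates such paths. It is not the paper's three-stage Gray-code algorithm and makes no attempt at constant amortized time or Gray-code ordering. The encoding of a path as a tuple of vertical changes, the function names, and the choice of `n` as the number of steps are assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple

# A path is encoded by the vertical change of each step (assumed encoding):
#   +1 = up-step (1, 1), 0 = horizontal step (1, 0), -k = down-step (1, -k), k >= 1.


def is_grand_motzkin_with_air_pockets(steps: Sequence[int]) -> bool:
    """Check the definition: non-empty, ends on the x-axis, no two consecutive down-steps."""
    if not steps:
        return False
    height = 0
    prev_down = False
    for s in steps:
        if s > 1:  # the only rising step allowed is (1, 1)
            return False
        down = s < 0
        if down and prev_down:  # consecutive down-steps are forbidden
            return False
        prev_down = down
        height += s  # "grand": the height may go below zero
    return height == 0


def enumerate_paths(n: int, allow_horizontal: bool = True) -> Iterator[Tuple[int, ...]]:
    """Yield every path with n steps; allow_horizontal=False gives the grand Dyck variant."""

    def extend(prefix: List[int], height: int, prev_down: bool) -> Iterator[Tuple[int, ...]]:
        remaining = n - len(prefix)
        if remaining == 0:
            if height == 0:
                yield tuple(prefix)
            return
        # Up-step.
        prefix.append(1)
        yield from extend(prefix, height + 1, False)
        prefix.pop()
        # Horizontal step (Motzkin variant only).
        if allow_horizontal:
            prefix.append(0)
            yield from extend(prefix, height, False)
            prefix.pop()
        # Down-step of size k, only if the previous step was not a down-step.
        # Prune: the remaining - 1 later steps can climb back by at most remaining - 1.
        if not prev_down:
            for k in range(1, height + remaining):
                prefix.append(-k)
                yield from extend(prefix, height - k, True)
                prefix.pop()

    if n >= 1:
        yield from extend([], 0, False)


if __name__ == "__main__":
    for path in enumerate_paths(3, allow_horizontal=False):
        print(path)
```

For three steps the zero-sum step multisets are {0, 0, 0}, {1, 0, -1} and {1, 1, -2}, so the enumerator yields 10 grand Motzkin paths and 3 grand Dyck paths with air pockets, which can be checked by hand.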

15.4 LG May 18
A Geometric Analysis of Sign-Magnitude Asymmetry in a ReLU + RMSNorm Block under Ternary Quantization

Lei Dong

Pre-norm Transformers with RMSNorm tolerate ternary {-1,0,+1} weight quantization with surprisingly small loss (Ma et al., 2024). We give a geometric explanation via sign-magnitude decomposition of weight perturbations. In a two-layer ReLU + RMSNorm model with i.i.d. Gaussian weights, sign-flips produce $π/(π-2) \approx 2.75$ times more transverse output energy than sign-preserving magnitude perturbations of equal Frobenius norm, as the flip rate $p \to 0$ (Theorem 3). The mechanism: ReLU creates a hidden-space directional asymmetry between the two perturbation types, which RMSNorm's transverse-projection Fréchet derivative selectively exposes. Sign-quantization error is itself a sign-preserving perturbation with angular alignment $\cos^2 \to 2/π$ (Theorem 4); its post-ReLU radial fraction ($0.365$) matches the pre-ReLU value $1-2/π$ within $0.4\%$, so ReLU is approximately transparent to ternary error. Multi-layer compounding of the $2.75\times$ factor is not experimentally supported; the gap to real-model sign sensitivity arises from outlier features violating delocalization. For an input dimension with amplitude $α$, a single sign-flip produces post-ReLU energy amplified by $R \approx nα^2$ relative to a delocalized entry. On TinyLlama-1.1B, at linear response ($p \leq 0.5\%$), count-matched NLL leverage stabilizes at $\sim 10\times \approx n\mathbb{E}[α^2]$, matching the per-entry theory; the all-column NLL ratio of $5.0\times$ falls within $R_{\mathrm{col}} \leq 19$ ($67\times$ PPL gap reflects metric nonlinearity). Measured outlier $α$ at layer 12 (median $0.024$, max $0.26$) confirms heavy-tailed concentration. The Bussgang constant $2/π$, RMSNorm geometry, and ReLU half-space structure together explain sign-magnitude asymmetry in pre-norm models, with $R \propto nα^2$ accounting for real-model deviations.
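The closed-form constants quoted in this abstract are easy to check numerically. The sketch below evaluates $π/(π-2)$ and $1-2/π$, and estimates the squared cosine between an i.i.d. Gaussian vector and its sign vector, which converges to the Bussgang constant $2/π$. Reading the abstract's $\cos^2 \to 2/π$ as this particular alignment is an assumption, and the function name, dimension, and seed are illustrative; nothing here reproduces the paper's theorems or its TinyLlama measurements.

```python
import math
import random

# Closed-form constants quoted in the abstract.
flip_to_magnitude_ratio = math.pi / (math.pi - 2)  # about 2.75
pre_relu_radial_fraction = 1 - 2 / math.pi         # about 0.363


def sign_alignment_cos2(n: int, seed: int = 0) -> float:
    """Squared cosine between a Gaussian vector w and sign(w), estimated from one sample.

    <w, sign(w)> = sum |w_i| and ||sign(w)|| = sqrt(n) (entries are nonzero with
    probability 1), so cos^2 tends to (E|w|)^2 / E[w^2] = 2/pi as n grows.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    dot = sum(abs(x) for x in w)
    norm_w = math.sqrt(sum(x * x for x in w))
    return (dot / (norm_w * math.sqrt(n))) ** 2


if __name__ == "__main__":
    print(f"pi/(pi-2)   = {flip_to_magnitude_ratio:.4f}")
    print(f"1 - 2/pi    = {pre_relu_radial_fraction:.4f}")
    print(f"2/pi        = {2 / math.pi:.4f}")
    print(f"cos^2 (MC)  = {sign_alignment_cos2(50_000):.4f}")
```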

SOC-PH Feb 23
Distilling human mobility models with symbolic regression

Hao Guo, Weiyu Zhang, Junjie Yang et al.

Human mobility is a fundamental aspect of social behavior, with broad applications in transportation, urban planning, and epidemic modeling. Represented by the gravity model and the radiation model, established analytical models for mobility phenomena are often discovered by analogy to physical processes. Such discoveries can be challenging and rely on intuition, while the potential of emerging social observation data in model discovery is largely unexploited. Here, we propose a systematic approach that leverages symbolic regression to automatically discover interpretable models from human mobility data. Our approach finds several well-known formulas, such as the distance decay effect and classical gravity models, as well as previously unknown ones, such as an exponential-power-law decay that can be explained by the maximum entropy principle. By relaxing the constraints on the complexity of model expressions, we further show how key variables of human mobility are progressively incorporated into the model, making this framework a powerful tool for revealing the underlying mathematical structures of complex social phenomena directly from observational data.
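For readers unfamiliar with the classical gravity model mentioned here, the following sketch shows its usual textbook form, $T_{ij} = K\, m_i m_j / d_{ij}^{\beta}$, and recovers $\beta$ from synthetic flows by ordinary least squares in log space. This is a plain parametric fit, not symbolic regression and not the paper's method; the function names, the constants `k` and `beta`, and the synthetic data are all assumptions for illustration.

```python
import math
from typing import List, Tuple

# One observation: (origin mass, destination mass, distance, observed flow).
Record = Tuple[float, float, float, float]


def gravity_flow(m_i: float, m_j: float, d: float, k: float = 1.0, beta: float = 2.0) -> float:
    """Classical gravity model: flow proportional to the mass product, decaying as d**-beta."""
    return k * m_i * m_j / d ** beta


def fit_gravity(records: List[Record]) -> Tuple[float, float]:
    """Fit (k, beta) by least squares on log(flow / (m_i * m_j)) = log k - beta * log d."""
    xs = [math.log(d) for _, _, d, _ in records]
    ys = [math.log(flow / (m_i * m_j)) for m_i, m_j, _, flow in records]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # Synthetic, noise-free flows generated with hypothetical parameters k=0.5, beta=1.7.
    data = [
        (m_i, m_j, d, gravity_flow(m_i, m_j, d, k=0.5, beta=1.7))
        for m_i in (1e4, 5e4)
        for m_j in (2e4, 8e4)
        for d in (1.0, 3.0, 10.0, 30.0)
    ]
    k_hat, beta_hat = fit_gravity(data)
    print(f"k = {k_hat:.3f}, beta = {beta_hat:.3f}")
```

A fit like this fixes the functional form in advance; the approach described in the abstract instead searches over candidate expressions.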

ML Nov 21, 2017
Group Sparse Bayesian Learning for Active Surveillance on Epidemic Dynamics

Hongbin Pei, Bo Yang, Jiming Liu et al.

Predicting epidemic dynamics is of great value in understanding and controlling diffusion processes, such as infectious disease spread and information propagation. This task is intractable, especially when surveillance resources are very limited. To address the challenge, we study the problem of active surveillance, i.e., how to identify a small portion of system components as sentinels to effect monitoring, such that the epidemic dynamics of an entire system can be readily predicted from the partial data collected by such sentinels. We propose a novel measure, the gamma value, to identify the sentinels by modeling a sentinel network with row sparsity structure. We design a flexible group sparse Bayesian learning algorithm to mine the sentinel network suitable for handling both linear and non-linear dynamical systems by using the expectation maximization method and variational approximation. The efficacy of the proposed algorithm is theoretically analyzed and empirically validated using both synthetic and real-world data.