NA Mar 6, 2019
Multi-grid Multi-Level Monte Carlo Method for Stokes-Darcy interface Model with Random Hydraulic Conductivity
Zhipeng Yang, Xiaoming He, Li Zhang et al.
In this article we develop a multi-grid multi-level Monte Carlo (MGMLMC) method for the stochastic Stokes-Darcy interface model with random hydraulic conductivity both in the porous media domain and on the interface. Because the randomness through the interface affects the flow in the Stokes domain, we investigate the coupled stochastic Stokes-Darcy model to improve the fidelity, as this model also considers the second and third porosity of the free flow. We then prove the existence and uniqueness of the weak solution of the variational form. For the numerical solution, we adopt the Monte Carlo (MC) method and the finite element method (FEM) for the discretization in the probability space and the physical space, respectively. In the traditional single-level Monte Carlo (SLMC) method, a more accurate numerical approximation requires both a larger number of samples in the probability space and a smaller mesh size in the physical space, so the computational cost increases significantly as the mesh size decreases. We therefore adopt the multi-level Monte Carlo (MLMC) method to dramatically reduce the computational cost in the probability space, because the number of samples decays quickly as the mesh size decreases. We also develop a strategy to calculate the number of samples needed in the MLMC method for the stochastic Stokes-Darcy model. Furthermore, MLMC naturally provides the hierarchical grids, and sufficient information on these grids, for the multi-grid (MG) method, which can in turn improve the efficiency of MLMC. To make full use of the dynamical interaction between these two methods, we propose the multi-grid multi-level Monte Carlo method for solving the stochastic model more efficiently. Numerical examples are provided to verify and illustrate the proposed method and the theoretical conclusions.
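The multi-level idea in this abstract can be illustrated with a minimal sketch on a toy problem. This is not the authors' Stokes-Darcy solver, their sample-number strategy, or their multi-grid coupling. As assumptions for illustration only, the random "conductivity" `k` is uniform on [1, 2], the quantity of interest is the integral of exp(-k x) over [0, 1], and the level-`l` "mesh" is a midpoint rule with 2^(l+1) cells. The names `q_level` and `mlmc_estimate` are hypothetical.

```python
import math
import random


def q_level(k, level):
    """Midpoint-rule approximation of the integral of exp(-k*x) on [0, 1].

    Level `level` uses 2**(level + 1) cells, so higher levels are finer.
    (Toy stand-in for a finite element solve on a mesh hierarchy.)
    """
    n = 2 ** (level + 1)
    h = 1.0 / n
    return h * sum(math.exp(-k * (i + 0.5) * h) for i in range(n))


def mlmc_estimate(samples_per_level, rng):
    """Telescoping MLMC estimate of E[Q_L].

    E[Q_L] = E[Q_0] + sum_{l=1..L} E[Q_l - Q_{l-1}], where each term is
    averaged over its own number of samples. The fine and coarse values
    in a correction term share the same random input k.
    """
    total = 0.0
    for level, n_samples in enumerate(samples_per_level):
        acc = 0.0
        for _ in range(n_samples):
            k = rng.uniform(1.0, 2.0)  # assumed toy distribution
            fine = q_level(k, level)
            coarse = q_level(k, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_samples
    return total


# Sample counts shrink as the level (and per-sample cost) grows.
rng = random.Random(0)
estimate = mlmc_estimate([4000, 1000, 250, 60, 15], rng)
print(f"MLMC estimate: {estimate:.4f}")
```

The sample counts here are picked by hand; most samples are spent on the cheap coarse level because the level corrections have small variance, which is the cost saving the abstract describes.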
CV Jan 5
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
Huichao Zhang, Liao Qu, Yiheng Liu et al.
We present NextFlow, a unified decoder-only autoregressive transformer trained on 6 trillion interleaved text-image discrete tokens. By leveraging a unified vision representation within a unified autoregressive architecture, NextFlow natively activates multimodal understanding and generation capabilities, unlocking abilities of image editing, interleaved content and video generation. Motivated by the distinct nature of modalities - where text is strictly sequential and images are inherently hierarchical - we retain next-token prediction for text but adopt next-scale prediction for visual generation. This departs from traditional raster-scan methods, enabling the generation of 1024x1024 images in just 5 seconds - orders of magnitude faster than comparable AR models. We address the instabilities of multi-scale generation through a robust training recipe. Furthermore, we introduce a prefix-tuning strategy for reinforcement learning. Experiments demonstrate that NextFlow achieves state-of-the-art performance among unified models and rivals specialized diffusion baselines in visual quality.
56.8 CL Mar 11
Word Recovery in Large Language Models Enables Character-Level Tokenization Robustness
Zhipeng Yang, Shu Yang, Lijie Hu et al.
Large language models (LLMs) trained with canonical tokenization exhibit surprising robustness to non-canonical inputs such as character-level tokenization, yet the mechanisms underlying this robustness remain unclear. We study this phenomenon through mechanistic interpretability and identify a core process we term word recovery. We first introduce a decoding-based method to detect word recovery, showing that hidden states reconstruct canonical word-level token identities from character-level inputs. We then provide causal evidence by removing the corresponding subspace from hidden states, which consistently degrades downstream task performance. Finally, we conduct a fine-grained attention analysis and show that in-group attention among characters belonging to the same canonical token is critical for word recovery: masking such attention in early layers substantially reduces both recovery scores and task performance. Together, our findings provide a mechanistic explanation for tokenization robustness and identify word recovery as a key mechanism enabling LLMs to process character-level inputs.
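The subspace-removal intervention mentioned in this abstract can be sketched as an orthogonal projection. This is a minimal illustration, not the authors' procedure: the abstract does not say how the subspace is identified, so the `directions` below are hypothetical placeholders and the hidden state is a toy 4-dimensional vector. The helper names `orthonormalize` and `remove_subspace` are assumptions as well.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(w, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip directions already in the span
            basis.append([wi / norm for wi in w])
    return basis


def remove_subspace(hidden, directions):
    """Project `hidden` onto the orthogonal complement of span(directions)."""
    out = list(hidden)
    for u in orthonormalize(directions):
        c = dot(out, u)
        out = [oi - c * ui for oi, ui in zip(out, u)]
    return out


# Toy hidden state and two (hypothetical) directions to ablate.
hidden = [1.0, 2.0, 3.0, 4.0]
directions = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
ablated = remove_subspace(hidden, directions)
print(ablated)
```

After the projection, the ablated vector has no component along any of the removed directions, while the components outside the subspace are unchanged.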
CL May 20, 2025
Internal Chain-of-Thought: Empirical Evidence for Layer-wise Subtask Scheduling in LLMs
Zhipeng Yang, Junzhuo Li, Siyu Xia et al.
We show that large language models (LLMs) exhibit an $\textit{internal chain-of-thought}$: they sequentially decompose and execute composite tasks layer-by-layer. Two claims ground our study: (i) distinct subtasks are learned at different network depths, and (ii) these subtasks are executed sequentially across layers. On a benchmark of 15 two-step composite tasks, we employ layer-from context-masking and propose a novel cross-task patching method, confirming (i). To examine claim (ii), we apply LogitLens to decode hidden states, revealing a consistent layerwise execution pattern. We further replicate our analysis on the real-world $\text{TRACE}$ benchmark, observing the same stepwise dynamics. Together, our results enhance LLM transparency by showing the models' capacity to internally plan and execute subtasks (or instructions), opening avenues for fine-grained, instruction-level activation steering.
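The logit-lens style of decoding referred to in this abstract can be sketched in a few lines: each layer's hidden state is mapped through the unembedding matrix and the highest-scoring token is read off. This is a toy illustration under stated assumptions, not the authors' setup. The vocabulary, the identity unembedding matrix, and the per-layer hidden states are all made up, and the final layer normalization that logit-lens implementations usually apply first is omitted.

```python
def logit_lens(hidden_by_layer, unembed, vocab):
    """Decode each layer's hidden state to its top-scoring vocabulary token.

    `unembed` has one row per vocabulary entry; the logit for a token is
    the inner product of its row with the hidden state.
    """
    decoded = []
    for h in hidden_by_layer:
        logits = [sum(w * x for w, x in zip(row, h)) for row in unembed]
        best = max(range(len(logits)), key=logits.__getitem__)
        decoded.append(vocab[best])
    return decoded


# Hypothetical two-step task: the input token, the result of the first
# subtask, and the final answer appear in turn as depth increases.
vocab = ["input", "step1_result", "final_answer"]
unembed = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
hidden_by_layer = [
    [0.9, 0.1, 0.0],  # early layer
    [0.2, 0.7, 0.1],  # middle layer
    [0.1, 0.3, 0.8],  # late layer
]
decoded = logit_lens(hidden_by_layer, unembed, vocab)
print(decoded)
```

A layerwise execution pattern of the kind the abstract reports would show up as the intermediate result becoming the top token at middle layers before the final answer takes over.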