NI Sep 3, 2024
Three Pillars Towards Next-Generation Routing System
Lei Li, Mengxuan Zhang, Zizhuo Xu et al.
Routing results play an increasingly important role in transportation efficiency, but they can unintentionally cause traffic congestion, because traffic conditions and the routing system are disconnected components in the current routing paradigm. In this paper, we propose a next-generation routing paradigm that can reduce traffic congestion by accounting for the influence of routing results in real time. Specifically, we regard the routing results as the root cause of the future traffic flow, which in turn is the root cause of traffic conditions. To implement such a system, we identify three essential components: 1) traffic condition simulation, which establishes the relation between traffic flow and traffic condition with guaranteed accuracy; 2) future route management, which supports efficient simulation under dynamic route updates; 3) global routing optimization, which improves the efficiency of the overall transportation system. We present a preliminary design and experimental results, and discuss the corresponding challenges and research directions.
36.0 CL May 8
SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting
Weijie Shi, Qiang Xu, Fan Deng et al.
Speculative decoding accelerates LLM inference by drafting a tree of candidate continuations and verifying it in one target forward. Existing drafters fall into two camps with opposite weaknesses. Autoregressive drafters such as EAGLE-3 preserve dependence along each draft path but call the drafter once per tree depth, making drafting a non-trivial share of per-iteration latency. Parallel drafters cut drafter calls by predicting multiple future positions in one forward, but each position is predicted without seeing the others, producing paths the verifier rejects. In this paper, we propose SpecBlock, a block-iterative drafter that combines path dependence with cheap drafting. Each drafter forward produces K dependent positions and we call this a block. The draft tree grows through repeated block expansions. Two mechanisms explicitly carry path dependence to keep later draft positions accurate. Within each block, a layer-wise shift carries the previous position's hidden state into every decoder layer. Across blocks, each new block can start from any position of the previous block, inheriting its hidden state to extend the path. To spend verifier budget where acceptance is likely, a co-trained rank head replaces the fixed top-k tree by allocating per-position branching during drafting. To avoid training the drafter on prefixes it never produces at inference, a valid-prefix mask drops the loss at later positions once an earlier one is wrong. Beyond static drafting, a cost-aware bandit at deployment uses free verifier feedback to update the drafter selectively, only when the expected throughput gain exceeds the update cost. Experiments show that SpecBlock improves mean speedup by 8-13% over EAGLE-3 at 44-52% of its drafting cost, and cost-aware adaptation extends this lead to 11-19%.
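The valid-prefix mask described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `valid_prefix_mask` and `masked_mean_loss` are hypothetical, and the sketch assumes one reading of the abstract, namely that a position keeps its loss only while every earlier position in the block was predicted correctly (so the first wrong position still contributes, and everything after it is dropped).

```python
from typing import List


def valid_prefix_mask(correct: List[bool]) -> List[float]:
    """Return 1.0 for positions whose entire preceding prefix is correct, else 0.0.

    Assumption (not confirmed by the source): the first wrong position keeps
    its loss, because the prefix leading up to it is still valid.
    """
    mask = []
    prefix_ok = True  # the empty prefix before position 0 is always valid
    for is_correct in correct:
        mask.append(1.0 if prefix_ok else 0.0)
        # Once any position is wrong, all later prefixes are invalid.
        prefix_ok = prefix_ok and is_correct
    return mask


def masked_mean_loss(losses: List[float], correct: List[bool]) -> float:
    """Average the per-position losses over the positions the mask keeps."""
    mask = valid_prefix_mask(correct)
    kept = sum(mask)
    if kept == 0:
        return 0.0  # empty block: nothing to train on
    return sum(l * m for l, m in zip(losses, mask)) / kept


if __name__ == "__main__":
    # Position 1 is wrong, so positions 2 and 3 are dropped from the loss.
    print(valid_prefix_mask([True, False, True, True]))
    print(masked_mean_loss([0.1, 2.0, 0.5, 0.3], [True, False, True, True]))
```

In a real training loop the per-position losses would come from the drafter's token-level cross-entropy, and correctness would be judged against the target model's tokens; both are left abstract here.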
NA Aug 7, 2017
Time-space Finite Element Adaptive AMG for Multi-term Time Fractional Advection Diffusion Equations
Xiaoqiang Yue, Yehong Xu, Shi Shu et al.
In this study, we construct a time-space finite element (FE) scheme and provide cost-efficient approximations for one-dimensional multi-term time fractional advection diffusion equations on a bounded domain $\Omega$. First, a fully discrete scheme is obtained by the linear FE method in both the temporal and spatial directions, and several properties of the resulting matrix are established. Second, a condition number estimate is proved, and an adaptive algebraic multigrid (AMG) method is developed to reduce the computational cost and is analyzed in the classical framework. Finally, numerical experiments are carried out that reach the saturation error order in the $L^2(\Omega)$ norm, confirm the theoretical results, and show the predicted behavior of the proposed algorithm.