Yongjie Yang

LG
h-index1
6papers
42citations
Novelty61%
AI Score55

6 Papers

GTMay 6
When Graph Traversal Meets Structured Preferences: Unified Framework and Complexity Results

Guozhen Rong, Xin Li, Yongjie Yang

Preference restrictions have played a significant role in computational social choice. This paper studies a framework that connects preference restrictions with classical graph search paradigms. We model candidates as vertices of a graph and interpret the preference ordering of each voter as the outcome of traversing the graph according to a graph search. We focus on six fundamental paradigms: breadth-first search (BFS), depth-first search (DFS), breadth-first search (LexBFS), lexicographic depth-first (LexDFS), maximum cardinality search (MCS), and maximal neighborhood search (MNS). Within this framework, we study the problem of determining whether a given preference profile admits a graph support subject to structural restrictions, that is, whether there exists a graph such that each preference ordering can be generated by traversing the graph under the chosen paradigm. For all considered paradigms, we show that this problem is NP-hard when the graph support is required to have at most $k$ edges, where $k$ is a given integer. We further extend these hardness results to the case where the graph support is required to have maximum degree $k$. For DFS, we prove that recognizing whether a preference profile admits a tree support can be solved in polynomial time. Moreover, existing results imply polynomial-time solvability of the problem for all remaining graph traversals, except BFS and LexBFS, for which the complexity remains open.

LGMay 19
AR1-ZO: Topology-Aware Rank-1 Zeroth-Order Queries for High-Rank LoRA Fine-Tuning

Ziye Chen, Hongbin Lin, Chenyu Zhang et al.

Zeroth-order (ZO) optimization enables large-language-model fine-tuning without storing backpropagation activations, while LoRA supplies compact trainable adapters. Combining them creates a rank paradox: increasing LoRA rank improves adapter capacity, but standard two-point ZO either perturbs a rank-dependent number of coordinates or, under atomwise updates, can make the finite-difference signal unobservable. This paper shows that the bottleneck is a measurement-topology problem rather than a need for an external subspace. LoRA already decomposes into matched rank-$1$ atoms, each a complete factor-coordinate block of dimension $d_\text{out}+d_\text{in}$. Querying one atom per step keeps the stored adapter rank $r$ while removing $r$ from the single-query perturbation dimension. The naive atomwise query is still miscalibrated: if it inherits canonical LoRA scaling $α/r$, the active finite-difference signal shrinks as $1/r$ and the active finite-difference signal-to-noise ratio (FD-SNR) as $1/r^2$, producing directional collapse under a fixed residual evaluation-noise floor. AR1-ZO pairs alternating rank-$1$ atom queries with topology-aware scaling $γ=αr$, restoring rank-invariant active signal without auxiliary bases, activation hooks, curvature estimates, or extra forward queries. Theory proves atom minimality, rank-independent active query dimension, directional collapse and restoration, and the remaining rank dependence as an amortized coverage cost. Experiments on OPT and Qwen3 models validate the signal mechanism and show that AR1-ZO makes high-rank LoRA effective among matched-budget ZO methods under the standard two-forward-pass query budget.

CVApr 9
OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering

Yiduo Jia, Muzhi Zhu, Hao Zhong et al.

To extend the reinforcement learning post-training paradigm to omni-modal models for concurrently bolstering video-audio understanding and collaborative reasoning, we propose OmniJigsaw, a generic self-supervised framework built upon a temporal reordering proxy task. Centered on the chronological reconstruction of shuffled audio-visual clips, this paradigm strategically orchestrates visual and auditory signals to compel cross-modal integration through three distinct strategies: Joint Modality Integration, Sample-level Modality Selection, and Clip-level Modality Masking. Recognizing that the efficacy of such proxy tasks is fundamentally tied to puzzle quality, we design a two-stage coarse-to-fine data filtering pipeline, which facilitates the efficient adaptation of OmniJigsaw to massive unannotated omni-modal data. Our analysis reveals a ``bi-modal shortcut phenomenon'' in joint modality integration and demonstrates that fine-grained clip-level modality masking mitigates this issue while outperforming sample-level modality selection. Extensive evaluations on 15 benchmarks show substantial gains in video, audio, and collaborative reasoning, validating OmniJigsaw as a scalable paradigm for self-supervised omni-modal learning.

GTMar 13
On the Complexity of the Two-Stage Majoritarian Rule

Yongjie Yang

Sequential voting rules have played a crucial role in shaping decisions within parliamentary and legislative frameworks. After observing that the existing sequential rules fail several fundamental axioms, Horan and Sprumont [2022] proposed a sequential rule named two-stage majoritarian rule (TSMR). This paper examines this rule by investigating the complexity of {\sc{Agenda Control}}, {\sc{Coalition Manipulation}}, {\sc{Possible Winner}}, {\sc{Necessary Winner}}, and eight standard election control problems. Our study offers a comprehensive insight into the complexity landscape of these problems.

LGJun 17, 2025
Zeroth-Order Optimization is Secretly Single-Step Policy Optimization

Junbin Qiu, Zhengpeng Xie, Xiangda Yan et al.

Zeroth-Order Optimization (ZOO) provides powerful tools for optimizing functions where explicit gradients are unavailable or expensive to compute. However, the underlying mechanisms of popular ZOO methods, particularly those employing randomized finite differences, and their connection to other optimization paradigms like Reinforcement Learning (RL) are not fully elucidated. This paper establishes a fundamental and previously unrecognized connection: ZOO with finite differences is equivalent to a specific instance of single-step Policy Optimization (PO). We formally unveil that the implicitly smoothed objective function optimized by common ZOO algorithms is identical to a single-step PO objective. Furthermore, we show that widely used ZOO gradient estimators, are mathematically equivalent to the REINFORCE gradient estimator with a specific baseline function, revealing the variance-reducing mechanism in ZOO from a PO perspective.Built on this unified framework, we propose ZoAR (Zeroth-Order Optimization with Averaged Baseline and Query Reuse), a novel ZOO algorithm incorporating PO-inspired variance reduction techniques: an averaged baseline from recent evaluations and query reuse analogous to experience replay. Our theoretical analysis further substantiates these techniques reduce variance and enhance convergence. Extensive empirical studies validate our theory and demonstrate that ZoAR significantly outperforms other methods in terms of convergence speed and final performance. Overall, our work provides a new theoretical lens for understanding ZOO and offers practical algorithmic improvements derived from its connection to PO.

LGFeb 27, 2022
Short-term passenger flow prediction for multi-traffic modes: A Transformer and residual network based multi-task learning method

Yongjie Yang, Jinlei Zhang, Lixing Yang et al.

With the prevailing of mobility as a service (MaaS), it becomes increasingly important to manage multi-traffic modes simultaneously and cooperatively. As an important component of MaaS, short-term passenger flow prediction for multi-traffic modes has thus been brought into focus. It is a challenging problem because the spatiotemporal features of multi-traffic modes are critically complex. Moreover, the passenger flows of multi-traffic modes differentiate and fluctuate significantly. To solve these problems, this paper proposes a multitask learning-based model, called Res-Transformer, for short-term inflow prediction of multi-traffic modes (subway, taxi, and bus). Each traffic mode is treated as a single task in the model. The Res-Transformer consists of two parts: (1) several modified Transformer layers comprising the conv-Transformer layer and the multi-head attention mechanism, which helps to extract the spatial and temporal features of multi-traffic modes, (2) the structure of residual network is utilized to obtain the correlations of different traffic modes and prevent gradient vanishing, gradient explosion, and overfitting. The Res-Transformer model is evaluated on two large-scale real-world datasets from Beijing, China. One is the region of a traffic hub and the other is the region of a residential area. Experiments are conducted to compare the performance of the proposed model with several baseline models. Results prove the effectiveness and robustness of the proposed method. This paper can give critical insights into the short-term inflow prediction for multi-traffic modes.