Yiwei Zhou

h-index22

6papers

107citations

Novelty51%

AI Score49

Ranked #47,630 of 201,326 authors (top 24%)#9,470 in CL (top 29%)

6 Papers

75.6NAJun 4

Weak order one convergence of structure-preserving stochastic theta methods for stochastic differential algebraic equations with time-dependent singular matrices

Caiyuan Zhu, Ziheng Chen, Lin Chen et al.

This paper studies the weak convergence order of structure-preserving stochastic theta methods for a class of index-$1$ stochastic differential algebraic equations with time-dependent singular matrices. The singular matrix is allowed to vary in time but preserves a fixed differential-algebraic splitting, thereby extending the constant singular-matrix setting while retaining the projector structure required for constraint preservation. By exploiting the index-$1$ algebraic-differential decomposition of the exact solution, we establish an abstract weak convergence theorem for constraint-preserving one-step approximations and apply it to the stochastic theta method with $θ\in (0,1]$. Under global Lipschitz, linear growth, and suitable smoothness assumptions, the considered method is proved to be well posed, to preserve the algebraic constraints at all time levels, and to converge with weak order one. Numerical experiments are finally presented to confirm the structure-preserving property and the theoretical convergence order.

90.8MLJun 3

Deterministic Envelopes for Tamed SGLD: Decoupling Stochastic-Gradient Noise and Localizing Taming

Yiwei Zhou, Ziheng Chen

Stochastic-gradient Langevin algorithms often use tamed denominators to stabilize non-globally Lipschitz drifts. This paper shows that when the denominator depends on the same stochastic-gradient realization as the numerator, the taming step changes the stochastic oracle itself and can create a stationary bias even if the original stochastic gradient is unbiased. We propose a structure-preserving framework for designing tamed denominators. It fixes the denominator before the oracle noise is sampled and uses localized deterministic envelopes to avoid unnecessary taming in typical regions. These kernels keep the stabilizing effect of taming while avoiding the bias introduced by a gradient-dependent denominator. Our theory explains how the stationary error splits into the bias caused by oracle-dependent taming and the remaining error introduced by deterministic stabilization. Within this deterministic-envelope family, the analysis identifies a far-tail condition that explains the limitation of local soft envelopes and motivates a hybrid member: soft in the typical region, but protected by hard-tail control on rare excursions. Experiments confirm the predicted stationary distortions of random denominators, the bias reduction of deterministic-envelope designs, and the stabilizing effect of the hybrid construction.

CVMar 21, 2024Code

Few-Shot Adversarial Prompt Learning on Vision-Language Models

Yiwei Zhou, Xiaobo Xia, Zhiwei Lin et al.

The vulnerability of deep neural networks to imperceptible adversarial perturbations has attracted widespread attention. Inspired by the success of vision-language foundation models, previous efforts achieved zero-shot adversarial robustness by aligning adversarial visual features with text supervision. However, in practice, they are still unsatisfactory due to several issues, including heavy adaptation cost, suboptimal text supervision, and uncontrolled natural generalization capacity. In this paper, to address these issues, we propose a few-shot adversarial prompt framework where adapting input sequences with limited data makes significant adversarial robustness improvement. Specifically, we achieve this by providing adversarially correlated text supervision that is end-to-end learned from adversarial examples. We also propose a novel training objective that enhances the consistency of multi-modal features while encourages differentiated uni-modal features between natural and adversarial examples. The proposed framework gives access to learn adversarial text supervision, which provides superior cross-modal adversarial alignment and matches state-of-the-art zero-shot adversarial robustness with only 1% training data. Code is available at: https://github.com/lionel-w2/FAP.

CLJul 5, 2022

Entity Linking in Tabular Data Needs the Right Attention

Miltiadis Marios Katsakioris, Yiwei Zhou, Daniele Masato

Understanding the semantic meaning of tabular data requires Entity Linking (EL), in order to associate each cell value to a real-world entity in a Knowledge Base (KB). In this work, we focus on end-to-end solutions for EL on tabular data that do not rely on fact lookup in the target KB. Tabular data contains heterogeneous and sparse context, including column headers, cell values and table captions. We experiment with various models to generate a vector representation for each cell value to be linked. Our results show that it is critical to apply an attention mechanism as well as an attention mask, so that the model can only attend to the most relevant context and avoid information dilution. The most relevant context includes: same-row cells, same-column cells, headers and caption. Computational complexity, however, grows quadratically with the size of tabular data for such a complex model. We achieve constant memory usage by introducing a Tabular Entity Linking Lite model (TELL ) that generates vector representation for a cell based only on its value, the table headers and the table caption. TELL achieves 80.8% accuracy on Wikipedia tables, which is only 0.1% lower than the state-of-the-art model with quadratic memory usage.

CLMay 26, 2020

Generating Semantically Valid Adversarial Questions for TableQA

Yi Zhu, Yiwei Zhou, Menglin Xia

Adversarial attack on question answering systems over tabular data (TableQA) can help evaluate to what extent they can understand natural language questions and reason with tables. However, generating natural language adversarial questions is difficult, because even a single character swap could lead to huge semantic difference in human perception. In this paper, we propose SAGE (Semantically valid Adversarial GEnerator), a Wasserstein sequence-to-sequence model for TableQA white-box attack. To preserve meaning of original questions, we apply minimum risk training with SIMILE and entity delexicalization. We use Gumbel-Softmax to incorporate adversarial loss for end-to-end training. Our experiments show that SAGE outperforms existing local attack models on semantic validity and fluency while achieving a good attack success rate. Finally, we demonstrate that adversarial training with SAGE augmented data can improve performance and robustness of TableQA systems.

CLOct 15, 2017

Clickbait Detection in Tweets Using Self-attentive Network

Yiwei Zhou

Clickbait detection in tweets remains an elusive challenge. In this paper, we describe the solution for the Zingel Clickbait Detector at the Clickbait Challenge 2017, which is capable of evaluating each tweet's level of click baiting. We first reformat the regression problem as a multi-classification problem, based on the annotation scheme. To perform multi-classification, we apply a token-level, self-attentive mechanism on the hidden states of bi-directional Gated Recurrent Units (biGRU), which enables the model to generate tweets' task-specific vector representations by attending to important tokens. The self-attentive neural network can be trained end-to-end, without involving any manual feature engineering. Our detector ranked first in the final evaluation of Clickbait Challenge 2017.