Yixin Cheng

LG
h-index: 74
5 papers
67 citations
Novelty: 62%
AI Score: 39

5 Papers

CY · May 2, 2022
Improving Students' Academic Performance with AI and Semantic Technologies

Yixin Cheng

Artificial intelligence and semantic technologies are evolving and have been applied in various research areas, including the education domain. Higher education institutions strive to improve students' academic performance. Early intervention for at-risk students and a reasonable curriculum are vital for students' success. Prior research opted for deploying traditional machine learning models to predict students' performance. In terms of curriculum semantic analysis, a comprehensive systematic review of the use of semantic technologies in the Computer Science curriculum found that the technologies used to measure similarity have limitations in terms of accuracy and ambiguity in the representation of concepts, courses, etc. To fill these gaps, this study developed three implementations: predicting students' performance using marks from the previous semester, modeling a course representation semantically and computing similarity, and identifying the prerequisite relationship between two similar courses. For performance prediction, we used a combination of a Genetic Algorithm and Long Short-Term Memory (LSTM) on a dataset from a Brazilian university containing 248,730 records. For similarity measurement, we deployed BERT to encode the sentences and used cosine similarity to obtain the distance between courses. For prerequisite identification, TextRazor was applied to extract concepts from course descriptions, followed by SemRefD to measure the degree of prerequisite between two concepts. The outcomes of this study can be summarized as: (i) a breakthrough result that improves on Manrique's work by 2.5% in dropout prediction accuracy; (ii) uncovering the similarity between courses based on course descriptions; (iii) identifying the prerequisites among three compulsory courses of the School of Computing at ANU.
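
The similarity step described in this abstract (encode each course description, then compare the encodings by cosine similarity) can be illustrated with a minimal sketch. The vectors below are hypothetical 4-dimensional stand-ins for BERT sentence embeddings, and the course names and the `most_similar` helper are assumptions made for illustration, not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings; real BERT vectors have hundreds of dimensions.
course_embeddings = {
    "intro_programming": [0.9, 0.1, 0.3, 0.0],
    "data_structures": [0.8, 0.2, 0.4, 0.1],
    "art_history": [0.0, 0.9, 0.1, 0.7],
}


def most_similar(name, embeddings):
    """Return the other course whose embedding is closest to `name`."""
    target = embeddings[name]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in embeddings.items()
        if other != name
    }
    return max(scores, key=scores.get)
```

In practice the embeddings would come from a sentence encoder rather than being written by hand; only the comparison step is shown here.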

LG · Feb 14, 2024
Leveraging the Context through Multi-Round Interactions for Jailbreaking Attacks

Yixin Cheng, Markos Georgopoulos, Volkan Cevher et al.

Large Language Models (LLMs) are susceptible to Jailbreaking attacks, which aim to extract harmful information by subtly modifying the attack query. As defense mechanisms evolve, directly obtaining harmful information becomes increasingly challenging for Jailbreaking attacks. In this work, inspired by Chomsky's transformational-generative grammar theory and human practices of using indirect context to elicit harmful information, we focus on a new attack form, called Contextual Interaction Attack. We contend that the prior context, the information preceding the attack query, plays a pivotal role in enabling strong Jailbreaking attacks. Specifically, we propose a first multi-turn approach that leverages benign preliminary questions to interact with the LLM. Due to the autoregressive nature of LLMs, which use previous conversation rounds as context during generation, we guide the model's question-response pair to construct a context that is semantically aligned with the attack query to execute the attack. We conduct experiments on seven different LLMs and demonstrate the efficacy of this attack, which is black-box and can also transfer across LLMs. We believe this can lead to further developments and understanding of security in LLMs.

CV · Jan 31, 2024
Multilinear Operator Networks

Yixin Cheng, Grigorios G. Chrysos, Markos Georgopoulos et al.

Despite the remarkable capabilities of deep neural networks in image recognition, the dependence on activation functions remains a largely unexplored area and has yet to be eliminated. On the other hand, Polynomial Networks are a class of models that do not require activation functions but have yet to perform on par with modern architectures. In this work, we aim to close this gap and propose MONet, which relies solely on multilinear operators. The core layer of MONet, called Mu-Layer, captures multiplicative interactions of the elements of the input token. MONet captures high-degree interactions of the input elements, and we demonstrate the efficacy of our approach on a series of image recognition and scientific computing benchmarks. The proposed model outperforms prior polynomial networks and performs on par with modern architectures. We believe that MONet can inspire further research on models that use entirely multilinear operations.
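
To make the idea of multiplicative interactions without activation functions concrete, here is a minimal sketch of a generic degree-2 layer: two linear projections of the input combined by an elementwise product, followed by a third linear map. This is an illustrative assumption, not the paper's Mu-Layer definition; the weight matrices `A`, `B`, `C` and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def multiplicative_layer(x, A, B, C):
    """Compute C((A x) * (B x)), where * is the elementwise product.

    No activation function is applied; the only nonlinearity is the
    product of two linear projections, so each output is a degree-2
    polynomial in the entries of x.
    """
    left = matvec(A, x)
    right = matvec(B, x)
    hadamard = [l * r for l, r in zip(left, right)]
    return matvec(C, hadamard)


# Hypothetical weights chosen only to keep the arithmetic easy to follow.
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
C = [[1.0, 1.0]]
```

Because each output is a homogeneous degree-2 polynomial of the input, doubling `x` quadruples the output. Stacking such layers raises the degree of the interactions captured, which is the general mechanism behind polynomial networks.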

LG · May 8, 2025
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks

Yixin Cheng, Hongcheng Guo, Yangming Li et al.

Text watermarking aims to subtly embed statistical signals into text by controlling the Large Language Model (LLM)'s sampling process, enabling watermark detectors to verify that the output was generated by the specified model. The robustness of these watermarking algorithms has become a key factor in evaluating their effectiveness. Current text watermarking algorithms embed watermarks in high-entropy tokens to ensure text quality. In this paper, we reveal that this seemingly benign design can be exploited by attackers, posing a significant risk to the robustness of the watermark. We introduce a generic, efficient paraphrasing attack, the Self-Information Rewrite Attack (SIRA), which leverages this vulnerability by calculating the self-information of each token to identify potential pattern tokens and perform a targeted attack. Our work exposes a widely prevalent vulnerability in current watermarking algorithms. The experimental results show that SIRA achieves nearly 100% attack success rates on seven recent watermarking methods at a cost of only 0.88 USD per million tokens. Our approach does not require any access to the watermark algorithms or the watermarked LLM and can seamlessly transfer to any LLM as the attack model, even mobile-level models. Our findings highlight the urgent need for more robust watermarking.
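
The quantity this abstract relies on, the self-information of a token, is the standard information-theoretic measure I(x) = -log2 p(x): the less probable a token, the more bits it carries. The sketch below only illustrates that calculation; the token probabilities, the threshold, and the helper names are hypothetical, and the paper's actual pipeline is not reproduced here.

```python
import math


def self_information(prob):
    """Self-information in bits: I(x) = -log2 p(x)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def flag_high_information(tokens, probs, threshold_bits):
    """Return the tokens whose self-information meets the threshold."""
    return [
        tok
        for tok, p in zip(tokens, probs)
        if self_information(p) >= threshold_bits
    ]


# Made-up probabilities, as if assigned by some language model.
tokens = ["the", "quantum", "cat"]
probs = [0.5, 0.0625, 0.25]
```

With these made-up values, "the" carries 1 bit, "cat" 2 bits, and "quantum" 4 bits, so a 2-bit threshold flags the latter two.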

LG · May 22, 2024
A Study of Posterior Stability for Time-Series Latent Diffusion

Yangming Li, Yixin Cheng, Mihaela van der Schaar

Latent diffusion has demonstrated promising results in image generation and permits efficient sampling. However, this framework might suffer from the problem of posterior collapse when applied to time series. In this paper, we first show that posterior collapse will reduce latent diffusion to a variational autoencoder (VAE), making it less expressive. This highlights the importance of addressing this issue. We then introduce a principled method, the dependency measure, that quantifies the sensitivity of a recurrent decoder to input variables. Using this tool, we confirm that posterior collapse significantly affects time-series latent diffusion on real datasets, and we also discover a phenomenon termed dependency illusion in the case of shuffled time series. Finally, building on our theoretical and empirical studies, we introduce a new framework that extends latent diffusion and has a stable posterior. Extensive experiments on multiple real time-series datasets show that our new framework is free from posterior collapse and significantly outperforms previous baselines in time-series synthesis.