CLOct 12, 2024
Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial PerturbationSaiful Islam Salim, Rubin Yuchan Yang, Alexander Cooper et al.
While Large language model (LLM)-based programming assistants such as CoPilot and ChatGPT can help improve the productivity of professional software developers, they can also facilitate cheating in introductory computer programming courses. Assuming instructors have limited control over the industrial-strength models, this paper investigates the baseline performance of 5 widely used LLMs on a collection of introductory programming problems, examines adversarial perturbations to degrade their performance, and describes the results of a user study aimed at understanding the efficacy of such perturbations in hindering actual code generation for introductory programming assignments. The user study suggests that i) perturbations combinedly reduced the average correctness score by 77%, ii) the drop in correctness caused by these perturbations was affected based on their detectability.
58.6CYApr 13
Homoglyph-based Adversarial Perturbation of Introductory Computer Science Theory ProblemsAidan Alexander, Chitrangada Juneja, Napaluck Tontrasathien et al.
Different AI tools such as ChatGPT, Gemini, and Claude are becoming very popular. Although they are helpful for many day-to-day tasks, they can be used in unexpected ways. For example, the learning objectives of a course may not be achieved if students use these tools to solve their homework problems. This paper proposes a simple method to address this issue in the lazy student model. The method uses homoglyph-based adversarial perturbation to first modify the question without changing the semantic meaning of the question. Then a few characters are perturbed by their homoglyphs. Our experimental result shows the theoretical problems of introductory computer science courses can be effectively perturbed. We also propose an interactive tool to conveniently use our method.
AIJan 21, 2020
AutoMATES: Automated Model Assembly from Text, Equations, and SoftwareAdarsh Pyarelal, Marco A. Valenzuela-Escarcega, Rebecca Sharp et al.
Models of complicated systems can be represented in different ways - in scientific papers, they are represented using natural language text as well as equations. But to be of real use, they must also be implemented as software, thus making code a third form of representing models. We introduce the AutoMATES project, which aims to build semantically-rich unified representations of models from scientific code and publications to facilitate the integration of computational models from different domains and allow for modeling large, complicated systems that span multiple domains and levels of abstraction.
CROct 21, 2019
Representing and Reasoning about Dynamic CodeJesse Bartels, Jon Stephens, Saumya Debray
Dynamic code, i.e., code that is created or modified at runtime, is ubiquitous in today's world. The behavior of dynamic code can depend on the logic of the dynamic code generator in subtle and non-obvious ways, with significant security implications, e.g., JIT compiler bugs can lead to exploitable vulnerabilities in the resulting JIT-compiled code. Existing approaches to program analysis do not provide adequate support for reasoning about such behavioral relationships. This paper takes a first step in addressing this problem by describing a program representation and a new notion of dependency that allows us to reason about dependency and information flow relationships between the dynamic code generator and the generated dynamic code. Experimental results show that analyses based on these concepts are able to capture properties of dynamic code that cannot be identified using traditional program analyses.