Ali Abbasi

h-index15

3papers

153citations

Novelty62%

AI Score35

Ranked #103,304 of 194,257 authors (top 53%)#2,537 in CR (top 37%)

3 Papers

2.7CLOct 30, 2024

Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation

Chenyang An, Shima Imani, Feng Yao et al.

In the field of large language model (LLM)-based proof generation, despite extensive training on large datasets such as ArXiv, LLMs still exhibit only modest performance on proving tasks of moderate difficulty. We believe that this is partly due to the widespread presence of suboptimal ordering within the data for each proof used in training. For example, published proofs often follow a purely logical order, where each step logically proceeds from the previous steps based on the deductive rules. This order is designed to facilitate the verification of the proof's soundness, rather than to help people and models learn the discovery process of the proof. In proof generation, we argue that the optimal order for one training data sample occurs when the relevant intermediate supervision for a particular proof step in the proof is always positioned to the left of that proof step. We call such order the intuitively sequential order. We validate our claims using two tasks: intuitionistic propositional logic theorem-proving and digit multiplication. Our experiments verify the order effect and provide support for our explanations. We demonstrate that training is most effective when the proof is in the intuitively sequential order. Moreover, the order effect and the performance gap between models trained on different data orders can be substantial -- with an 11 percent improvement in proof success rate observed in the propositional logic theorem-proving task, between models trained on the optimal order compared to the worst order. Lastly, we define a common type of order issue in advanced math proofs and find that 17.3 percent of theorems with nontrivial proofs in the first two chapters of a widely used graduate-level mathematics textbook suffer from this issue. A detailed list of those proofs is provided in the appendix.

12.3CRNov 4, 2021Code

Nyx-Net: Network Fuzzing with Incremental Snapshots

Sergej Schumilo, Cornelius Aschermann, Andrea Jemmett et al.

Coverage-guided fuzz testing ("fuzzing") has become mainstream and we have observed lots of progress in this research area recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range of targets spanning servers, clients, games, and even Firefox's Inter-Process Communication (IPC) interface. Compared to state-of-the-art methods, Nyx-Net improves test throughput by up to 300x and coverage found by up to 70%. Additionally, Nyx-Net is able to find crashes in two of ProFuzzBench's targets that no other fuzzer found previously. When using Nyx-Net to play the game Super Mario, Nyx-Net shows speedups of 10-30x compared to existing work. Under some circumstances, Nyx-Net is even able play "faster than light": solving the level takes less wall-clock time than playing the level perfectly even once. Nyx-Net is able to find previously unknown bugs in servers such as Lighttpd, clients such as MySQL client, and even Firefox's IPC mechanism - demonstrating the strength and versatility of the proposed approach. Lastly, our prototype implementation was awarded a $20.000 bug bounty for enabling fuzzing on previously unfuzzable code in Firefox and solving a long-standing problem at Mozilla.

8.8CRJul 5, 2020

Challenges in Designing Exploit Mitigations for Deeply Embedded Systems

Ali Abbasi, Jos Wetzels, Thorsten Holz et al.

Memory corruption vulnerabilities have been around for decades and rank among the most prevalent vulnerabilities in embedded systems. Yet this constrained environment poses unique design and implementation challenges that significantly complicate the adoption of common hardening techniques. Combined with the irregular and involved nature of embedded patch management, this results in prolonged vulnerability exposure windows and vulnerabilities that are relatively easy to exploit. Considering the sensitive and critical nature of many embedded systems, this situation merits significant improvement. In this work, we present the first quantitative study of exploit mitigation adoption in 42 embedded operating systems, showing the embedded world to significantly lag behind the general-purpose world. To improve the security of deeply embedded systems, we subsequently present μArmor, an approach to address some of the key gaps identified in our quantitative analysis. μArmor raises the bar for exploitation of embedded memory corruption vulnerabilities, while being adoptable on the short term without incurring prohibitive extra performance or storage costs.