Mohannad Alhanahnah

SE
h-index: 10
Papers: 5
Citations: 35
Novelty: 55%
AI Score: 30

5 Papers

SE | Dec 16, 2022
Machine Learning Systems are Bloated and Vulnerable

Huaifeng Zhang, Fahmi Abdulqadir Ahmed, Dyako Fatih et al.

Today's software is bloated with both code and features that are not used by most users. This bloat is prevalent across the entire software stack, from operating systems and applications to containers. Containers are lightweight virtualization technologies used to package code and dependencies, providing portable, reproducible and isolated environments. For their ease of use, data scientists often utilize machine learning containers to simplify their workflow. However, this convenience comes at a cost: containers are often bloated with unnecessary code and dependencies, resulting in very large sizes. In this paper, we analyze and quantify bloat in machine learning containers. We develop MMLB, a framework for analyzing bloat in software systems, focusing on machine learning containers. MMLB measures the amount of bloat at both the container and package levels, quantifying the sources of bloat. In addition, MMLB integrates with vulnerability analysis tools and performs package dependency analysis to evaluate the impact of bloat on container vulnerabilities. Through experimentation with 15 machine learning containers from TensorFlow, PyTorch, and Nvidia, we show that bloat accounts for up to 80% of machine learning container sizes, increasing container provisioning times by up to 370% and exacerbating vulnerabilities by up to 99%.

AI | Feb 16, 2025
PEA: Enhancing LLM Performance on Computational-Reasoning Tasks

Zi Wang, Shiwei Weng, Mohannad Alhanahnah et al.

Large Language Models (LLMs) have exhibited remarkable capabilities across diverse domains, prompting investigations into their potential as generic reasoning engines. While recent studies have explored inference-time computation to enhance model performance on complex problems, current research lacks a formal framework to characterize the complexity of reasoning tasks. This study introduces the Predicate-Enumeration-Aggregation (PEA) framework, a formal approach to describe and solve a class of important reasoning tasks termed computational reasoning problems. The PEA framework decomposes these problems into predicate and enumeration components, using LLMs to synthesize programs based on specified predicates, enumeration, and aggregation rules. These synthesized programs are then executed to obtain solutions to the computational tasks. We demonstrate the framework's efficacy on benchmark tasks including Boolean satisfiability problems, game of 24, and planning problems. Empirical evaluation reveals that PEA substantially enhances the performance of underlying models on benchmark computational problems, yielding an average accuracy improvement of approximately 50%, coupled with increased efficiency.
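To make the predicate, enumeration, and aggregation roles concrete, here is a minimal hand-written sketch for the game of 24. It is an illustration only: in the paper the program is synthesized by an LLM, and the abstract does not give the synthesized code, so every name below (`enumerate_candidates`, `predicate`, `aggregate`, `solve_24`) is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Binary operators available in the game; division by zero yields None.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else None,
}


def enumerate_candidates(items):
    """Enumeration: yield (value, expression) for every way to combine all items."""
    if len(items) == 1:
        yield items[0]
        return
    for i, j in permutations(range(len(items)), 2):
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        (a, text_a), (b, text_b) = items[i], items[j]
        for symbol, fn in OPS.items():
            value = fn(a, b)
            if value is None:
                continue
            yield from enumerate_candidates(
                rest + [(value, f"({text_a} {symbol} {text_b})")]
            )


def predicate(candidate, target=24):
    """Predicate: does this candidate evaluate exactly to the target?"""
    return candidate[0] == target


def aggregate(matches):
    """Aggregation: keep the first satisfying candidate, or None if there is none."""
    return next(matches, None)


def solve_24(numbers):
    # Exact rational arithmetic avoids floating-point false negatives.
    items = [(Fraction(n), str(n)) for n in numbers]
    matches = (c for c in enumerate_candidates(items) if predicate(c))
    best = aggregate(matches)
    return None if best is None else best[1]
```

Calling `solve_24([4, 7, 8, 8])` returns one expression string that evaluates to 24, and `solve_24([1, 1, 1, 1])` returns `None` because no combination reaches the target. Swapping `aggregate` for a counting or collecting function changes the question asked without touching the enumeration.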

SE | Sep 6, 2021
Lightweight, Multi-Stage, Compiler-Assisted Application Specialization

Mohannad Alhanahnah, Rithik Jain, Vaibhav Rastogi et al.

Program debloating aims to enhance the performance and reduce the attack surface of bloated applications. Several techniques have been recently proposed to specialize programs. These approaches are either based on unsound strategies or demanding techniques, leading to unsafe results or a high-overhead debloating process. In this paper, we address these limitations by applying partial-evaluation principles to generate specialized applications. Our approach relies on a simple observation that an application typically consists of configuration logic, followed by the main logic of the program. The configuration logic specifies what functionality in the main logic should be executed. LMCAS performs partial interpretation to capture a precise program state of the configuration logic based on the supplied inputs. LMCAS then applies partial-evaluation optimizations to generate a specialized program by propagating the constants in the captured partial state, eliminating unwanted code, and preserving the desired functionalities. Our evaluation of LMCAS on commonly used benchmarks and real-world applications shows that it successfully removes unwanted features while preserving the functionality and robustness of the debloated programs, runs faster than prior tools, and reduces the attack surface of specialized programs. LMCAS runs 1500x, 4.6x, and 1.2x faster than the state-of-the-art debloating tools CHISEL, RAZOR, and OCCAM, respectively; achieves a 25% reduction in the binary size; and reduces the attack surface of code-reuse attacks by removing 51.7% of the total gadgets and eliminating 83% of known CVE vulnerabilities.
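The two steps named in the abstract, propagating captured constants and eliminating unwanted code, can be illustrated with a toy specializer over Python source. This is an analogy under stated assumptions, not LMCAS itself: the abstract does not say which program representation LMCAS operates on, and the class `_Specializer`, the function `specialize`, and the flags `COUNT_LINES` and `COUNT_WORDS` are all hypothetical. The sketch only folds `if` statements whose test is a bare configuration flag.

```python
import ast

# Hypothetical program: configuration flags select features in the main logic.
EXAMPLE_SOURCE = """
def run(text):
    out = []
    if COUNT_LINES:
        out.append(len(text.splitlines()))
    if COUNT_WORDS:
        out.append(len(text.split()))
    return out
"""


class _Specializer(ast.NodeTransformer):
    """Replace known configuration names with constants and prune dead branches."""

    def __init__(self, config):
        self.config = config

    def visit_Name(self, node):
        # Constant propagation: a read of a known flag becomes its captured value.
        if isinstance(node.ctx, ast.Load) and node.id in self.config:
            return ast.copy_location(ast.Constant(self.config[node.id]), node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)  # specialize the test and nested statements first
        if isinstance(node.test, ast.Constant):
            # Dead-code elimination: keep only the branch that can execute.
            if node.test.value:
                return node.body
            return node.orelse or ast.Pass()
        return node


def specialize(source, config):
    """Return source specialized for the given configuration values."""
    tree = _Specializer(config).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+
```

Running `specialize(EXAMPLE_SOURCE, {"COUNT_LINES": True, "COUNT_WORDS": False})` yields a version of `run` with no flag checks and no word-counting code left. A real partial evaluator would also fold compound conditions and track values through assignments, which this sketch deliberately omits.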

SE | Dec 14, 2020
Software Quality Assessment for Robot Operating System

Mohannad Alhanahnah

Robot Operating System (ROS) is widely used in academia and industry, and importantly is leveraged in safety-critical robotic systems. The quality of ROS software can affect the safety and security properties of robotics systems; therefore, guaranteeing reliability and quality is imperative. Source code static analysis is a key approach to formally perform software verification. We address two concerns in this paper: (1) conducting a systematic literature review to provide a complete picture of the existing methods that analyze different aspects of ROS software, and (2) performing an empirical study to evaluate software properties that can influence the functionality of ROS. We leverage PMD, an off-the-shelf static analysis tool, to conduct our empirical study over a set of ROS repositories implemented in Java. The survey analysis shows a significant shortcoming in the body of research, namely the lack of tailored analysis mechanisms for assessing ROS2 code, and reveals that the majority of research efforts are centered around ROS1. Our empirical study shows that the Java code of ROS2 does not suffer from serious issues and that the majority of the detected alerts are code style issues.

LG | Jul 1, 2020
Robust and Accurate Authorship Attribution via Program Normalization

Yizhen Wang, Mohannad Alhanahnah, Ke Wang et al.

Source code attribution approaches have achieved remarkable accuracy thanks to the rapid advances in deep learning. However, recent studies shed light on their vulnerability to adversarial attacks. In particular, they can be easily deceived by adversaries who attempt to either create a forgery of another author or to mask the original author. To address these emerging issues, we formulate this security challenge into a general threat model, the relational adversary, that allows an arbitrary number of semantics-preserving transformations to be applied to an input in any problem space. Our theoretical investigation shows the conditions for robustness and the trade-off between robustness and accuracy in depth. Motivated by these insights, we present a novel learning framework, normalize-and-predict (N&P), that in theory guarantees the robustness of any authorship-attribution approach. We conduct an extensive evaluation of N&P in defending two of the latest authorship-attribution approaches against state-of-the-art attack methods. Our evaluation demonstrates that N&P improves the accuracy on adversarial inputs by as much as 70% over the vanilla models. More importantly, N&P also increases robust accuracy to 45% higher than adversarial training while running over 40 times faster.
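The normalize-and-predict idea can be sketched as mapping every input to a canonical form before it reaches the model, so that inputs related by a semantics-preserving transformation receive the same prediction. The toy normalizer below handles only local-variable renaming, comment removal, and whitespace changes in Python source; the paper's actual normalizer and transformation set are not described in the abstract, and `normalize`, `normalize_and_predict`, and the `model` callable are hypothetical names.

```python
import ast


class _Renamer(ast.NodeTransformer):
    """Rename locally bound identifiers to v0, v1, ... in order of first visit."""

    def __init__(self, local_names):
        self.local_names = local_names
        self.mapping = {}

    def _canonical(self, name):
        if name not in self.local_names:
            return name  # leave builtins and other unbound names untouched
        return self.mapping.setdefault(name, f"v{len(self.mapping)}")

    def visit_Name(self, node):
        node.id = self._canonical(node.id)
        return node

    def visit_arg(self, node):
        self.generic_visit(node)
        node.arg = self._canonical(node.arg)
        return node


def normalize(source):
    """Return a canonical form of the source, stable under the handled transformations."""
    tree = ast.parse(source)  # parsing already discards comments and layout
    local_names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            local_names.add(node.id)
        elif isinstance(node, ast.arg):
            local_names.add(node.arg)
    tree = _Renamer(local_names).visit(tree)
    return ast.unparse(tree)  # requires Python 3.9+


def normalize_and_predict(model, source):
    """Apply any attribution model (a callable on source text) to the normalized input."""
    return model(normalize(source))
```

Two functions that differ only in variable names, comments, and spacing normalize to the same string, so any `model` passed to `normalize_and_predict` necessarily gives them the same output. Robustness then depends on how many attack transformations the normalizer covers, which is where the accuracy trade-off mentioned in the abstract enters.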