Yazan Boshmaf

CR
h-index65
7papers
131citations
Novelty38%
AI Score45

7 Papers

CRMay 26Code
Poison with Style: A Practical Poisoning Attack on Code Large Language Models

Khang Tran, Yazan Boshmaf, Issa Khalil et al.

Code Large Language Models (CLLMs) serve as the core of modern code agents, enabling developers to automate complex software development tasks. In this paper, we present Poison-with-Style (PwS), a practical and stealthy model poisoning attack targeting CLLMs. Unlike prior attacks that assume an active adversary capable of directly embedding explicit triggers (e.g., specific words) into developers' prompts during inference, PwS leverages developers' code styles as covert triggers implicitly embedded within their prompts. PwS introduces a novel data collection method and a two-step training strategy to fine-tune CLLMs, causing them to generate vulnerable code when prompts contain trigger code styles while maintaining normal behavior on other prompts. Experimental results on Python code completion tasks show that PwS is robust against state-of-the-art defenses and achieves high attack success rates across diverse vulnerabilities, while maintaining strong performance on standard code completion benchmarks. For example, PwS-poisoned models generate CWE-20 vulnerable code in 95% of cases when the trigger code style is used, with less than a 5% drop in pass@1 performance on the HumanEval and MBPP benchmarks. Our implementation and dataset are here: https://github.com/khangtran2020/pws.

CRApr 21, 2025Code
aiXamine: Simplified LLM Safety and Security

Fatih Deniz, Dorde Popovic, Yazan Boshmaf et al.

Evaluating Large Language Models (LLMs) for safety and security remains a complex task, often requiring users to navigate a fragmented landscape of ad hoc benchmarks, datasets, metrics, and reporting formats. To address this challenge, we present aiXamine, a comprehensive black-box evaluation platform for LLM safety and security. aiXamine integrates over 40 tests (i.e., benchmarks) organized into eight key services targeting specific dimensions of safety and security: adversarial robustness, code security, fairness and bias, hallucination, model and data privacy, out-of-distribution (OOD) robustness, over-refusal, and safety alignment. The platform aggregates the evaluation results into a single detailed report per model, providing a detailed breakdown of model performance, test examples, and rich visualizations. We used aiXamine to assess over 50 publicly available and proprietary LLMs, conducting over 2K examinations. Our findings reveal notable vulnerabilities in leading models, including susceptibility to adversarial attacks in OpenAI's GPT-4o, biased outputs in xAI's Grok-3, and privacy weaknesses in Google's Gemini 2.0. Additionally, we observe that open-source models can match or exceed proprietary models in specific services such as safety alignment, fairness and bias, and OOD robustness. Finally, we identify trade-offs between distillation strategies, model size, training methods, and architectural choices.

CYJul 9, 2019Code
Characterizing Bitcoin donations to open source software on GitHub

Yury Zhauniarovich, Yazan Boshmaf, Husam Al Jawaheri et al.

Web-based hosting services for version control, such as GitHub, have made it easier for people to develop, share, and donate money to software repositories. In this paper, we study the use of Bitcoin to make donations to open source repositories on GitHub. In particular, we analyze the amount and volume of donations over time, in addition to its relationship to the age and popularity of a repository. We scanned over three million repositories looking for donation addresses. We then extracted and analyzed their transactions from Bitcoin's public blockchain. Overall, we found a limited adoption of Bitcoin as a payment method for receiving donations, with nearly 44 thousand deposits adding up to only 8.3 million dollars in the last 10 years. We also found weak positive correlation between the amount of donations in dollars and the popularity of a repository, with highest correlation (r=0.013) associated with number of forks.

CRSep 17, 2018Code
BlockTag: Design and applications of a tagging system for blockchain analysis

Yazan Boshmaf, Husam Al Jawaheri, Mashael Al Sabah

Annotating blockchains with auxiliary data is useful for many applications. For example, e-crime investigations of illegal Tor hidden services, such as Silk Road, often involve linking Bitcoin addresses, from which money is sent or received, to user accounts and related online activities. We present BlockTag, an open-source tagging system for blockchains that facilitates such tasks. We describe BlockTag's design and present three analyses that illustrate its capabilities in the context of privacy research and law enforcement.

CLJan 18, 2025
Fanar: An Arabic-Centric Multimodal Generative AI Platform

Fanar Team, Ummar Abbas, Mohammad Shahmeer Ahmad et al.

We present Fanar, a platform for Arabic-centric multimodal generative AI systems, that supports language, speech and image generation tasks. At the heart of Fanar are Fanar Star and Fanar Prime, two highly capable Arabic Large Language Models (LLMs) that are best in the class on well established benchmarks for similar sized models. Fanar Star is a 7B (billion) parameter model that was trained from scratch on nearly 1 trillion clean and deduplicated Arabic, English and Code tokens. Fanar Prime is a 9B parameter model continually trained on the Gemma-2 9B base model on the same 1 trillion token set. Both models are concurrently deployed and designed to address different types of prompts transparently routed through a custom-built orchestrator. The Fanar platform provides many other capabilities including a customized Islamic Retrieval Augmented Generation (RAG) system for handling religious prompts, a Recency RAG for summarizing information about current or recent events that have occurred after the pre-training data cut-off date. The platform provides additional cognitive capabilities including in-house bilingual speech recognition that supports multiple Arabic dialects, voice and image generation that is fine-tuned to better reflect regional characteristics. Finally, Fanar provides an attribution service that can be used to verify the authenticity of fact based generated content. The design, development, and implementation of Fanar was entirely undertaken at Hamad Bin Khalifa University's Qatar Computing Research Institute (QCRI) and was sponsored by Qatar's Ministry of Communications and Information Technology to enable sovereign AI technology development.

CROct 27, 2019
Investigating MMM Ponzi scheme on Bitcoin

Yazan Boshmaf, Charitha Elvitigala, Husam Al Jawaheri et al.

Cybercriminals exploit cryptocurrencies to carry out illicit activities. In this paper, we focus on Ponzi schemes that operate on Bitcoin and perform an in-depth analysis of MMM, one of the oldest and most popular Ponzi schemes. Based on 423K transactions involving 16K addresses, we show that: (1) Starting Sep 2014, the scheme goes through three phases over three years. At its peak, MMM circulated more than 150M dollars a day, after which it collapsed by the end of Jun 2016. (2) There is a high income inequality between MMM members, with the daily Gini index reaching more than 0.9. The scheme also exhibits a zero-sum investment model, in which one member's loss is another member's gain. The percentage of victims who never made any profit has grown from 0% to 41% in five months, during which the top-earning scammer has made 765K dollars in profit. (3) The scheme has a global reach with 80 different member countries but a highly-asymmetrical flow of money between them. While India and Indonesia have the largest pairwise flow in MMM, members in Indonesia have received 12x more money than they have sent to their counterparts in India.

CRJan 23, 2018
Deanonymizing Tor hidden service users through Bitcoin transactions analysis

Husam Al Jawaheri, Mashael Al Sabah, Yazan Boshmaf et al.

With the rapid increase of threats on the Internet, people are continuously seeking privacy and anonymity. Services such as Bitcoin and Tor were introduced to provide anonymity for online transactions and Web browsing. Due to its pseudonymity model, Bitcoin lacks retroactive operational security, which means historical pieces of information could be used to identify a certain user. We investigate the feasibility of deanonymizing users of Tor hidden services who rely on Bitcoin as a payment method by exploiting public information leaked from online social networks, the Blockchain, and onion websites. This, for example, allows an adversary to link a user with @alice Twitter address to a Tor hidden service with private.onion address by finding at least one past transaction in the Blockchain that involves their publicly declared Bitcoin addresses. To demonstrate the feasibility of this deanonymization attack, we carried out a real-world experiment simulating a passive, limited adversary. We crawled 1.5K hidden services and collected 88 unique Bitcoin addresses. We then crawled 5B tweets and 1M BitcoinTalk forum pages and collected 4.2K and 41K unique Bitcoin addresses, respectively. Each user address was associated with an online identity along with its public profile information. By analyzing the transactions in the Blockchain, we were able to link 125 unique users to 20 Tor hidden services, including sensitive ones, such as The Pirate Bay and Silk Road. We also analyzed two case studies in detail to demonstrate the implications of the resulting information leakage on user anonymity. In particular, we confirm that Bitcoin addresses should always be considered exploitable, as they can be used to deanonymize users retroactively. This is especially important for Tor hidden service users who actively seek and expect privacy and anonymity.