CR, AI, LG | Apr 23, 2025

MAYA: Addressing Inconsistencies in Generative Password Guessing through a Unified Benchmark

William Corrias, Fabio De Gaspari, Dorjan Hitaj, Luigi V. Mancini

arXiv:2504.16651v43.61 citations | h-index: 13 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses a methodological problem for cybersecurity researchers by providing a standardized tool to benchmark generative password-guessing models, though it is incremental as it focuses on evaluation rather than new model development.

The paper tackles inconsistencies in evaluating generative models for password guessing by introducing MAYA, a unified benchmarking framework, and finds that sequential models outperform other approaches, with a multi-model attack achieving better results than individual models.

Recent advances in generative models have led to their application in password guessing, with the aim of replicating the complexity, structure, and patterns of human-created passwords. Despite their potential, inconsistencies and inadequate evaluation methodologies in prior research have hindered meaningful comparisons and a comprehensive, unbiased understanding of their capabilities. This paper introduces MAYA, a unified, customizable, plug-and-play benchmarking framework designed to facilitate the systematic characterization and benchmarking of generative password-guessing models in the context of trawling attacks. Using MAYA, we conduct a comprehensive assessment of six state-of-the-art approaches, which we re-implemented and adapted to ensure standardization. Our evaluation spans eight real-world password datasets and covers an exhaustive set of advanced testing scenarios, totaling over 15,000 compute hours. Our findings indicate that these models effectively capture different aspects of human password distribution and exhibit strong generalization capabilities. However, their effectiveness varies significantly with long and complex passwords. Through our evaluation, sequential models consistently outperform other generative architectures and traditional password-guessing tools, demonstrating unique capabilities in generating accurate and complex guesses. Moreover, the diverse password distributions learned by the models enable a multi-model attack that outperforms the best individual model. By releasing MAYA, we aim to foster further research, providing the community with a new tool to consistently and reliably benchmark generative password-guessing models. Our framework is publicly available at https://github.com/williamcorrias/MAYA-Password-Benchmarking.
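The abstract does not say how guesses are scored or how the multi-model attack merges the individual models' outputs. As a minimal sketch only, the following assumes that a model is scored by the fraction of unique test passwords it guesses within a fixed guess budget, and that the combined attack interleaves the models' guess lists round-robin while dropping duplicates. The function names (`match_rate`, `combine_round_robin`) and the toy password lists are hypothetical; they are not part of MAYA's API and are not results from the paper.

```python
from itertools import zip_longest


def match_rate(guesses, test_passwords):
    """Fraction of unique test passwords that appear among the guesses."""
    targets = set(test_passwords)
    if not targets:
        return 0.0
    return len(targets & set(guesses)) / len(targets)


def combine_round_robin(guess_lists, budget):
    """Interleave per-model guess lists, skipping duplicates, up to `budget` guesses.

    This merging strategy is an assumption made for illustration.
    """
    combined, seen = [], set()
    for tier in zip_longest(*guess_lists):
        for guess in tier:
            # None marks a model that has run out of guesses.
            if guess is None or guess in seen:
                continue
            if len(combined) >= budget:
                return combined
            seen.add(guess)
            combined.append(guess)
    return combined


# Toy, made-up data purely for illustration (not from the paper).
test_set = ["123456", "password", "qwerty", "iloveyou1", "dragon"]
model_a = ["123456", "password", "abc123", "letmein"]
model_b = ["qwerty", "dragon", "monkey", "iloveyou1"]

budget = 4
rate_a = match_rate(model_a[:budget], test_set)
rate_b = match_rate(model_b[:budget], test_set)
combined = combine_round_robin([model_a, model_b], budget)
rate_combined = match_rate(combined, test_set)

print(f"model A: {rate_a:.0%}, model B: {rate_b:.0%}, combined: {rate_combined:.0%}")
```

Because each toy model covers a different part of the test set, the interleaved list matches more passwords under the same budget than either model alone, which is the intuition behind the multi-model attack. A real evaluation would use far larger guess budgets and held-out leaked datasets.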
