CR AI | Nov 12, 2025

Enhancing Password Security Through a High-Accuracy Scoring Framework Using Random Forests

Muhammed El Mustaqeem Mazelan, Noor Hazlina Abdul, Nouar AlDahoul

arXiv:2511.09492v2 | 3.6 | h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses password-related security vulnerabilities by giving users more reliable password strength feedback, though it appears incremental because it builds on existing ML approaches.

The researchers tackled the problem of inaccurate password strength meters by developing a scoring system using machine learning models, with their Random Forest model achieving 99.12% accuracy on a test set of over 660,000 real-world passwords.

Password security plays a crucial role in cybersecurity, yet traditional password strength meters, which rely on static rules like character-type requirements, often fail. Such methods are easily bypassed by common password patterns (e.g., 'P@ssw0rd1!'), giving users a false sense of security. To address this, we implement and evaluate a password strength scoring system by comparing four machine learning models, Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Logistic Regression, on a dataset of over 660,000 real-world passwords. Our primary contribution is a novel hybrid feature engineering approach that captures nuanced vulnerabilities missed by standard metrics. We introduce features like leetspeak-normalized Shannon entropy to assess true randomness, pattern detection for keyboard walks and sequences, and character-level TF-IDF n-grams to identify frequently reused substrings from breached password datasets. Our RF model performed best, achieving 99.12% accuracy on a held-out test set. Crucially, the interpretability of the Random Forest model allows for feature importance analysis, providing a clear pathway to developing security tools that offer specific, actionable feedback to users. This study bridges the gap between predictive accuracy and practical usability, resulting in a high-performance scoring system that not only reduces password-based vulnerabilities but also empowers users to make more informed security decisions.
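
The hand-crafted features named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the substitution table `LEET_MAP`, the keyboard rows, the minimum run lengths, and all function names are assumptions chosen for illustration, and the TF-IDF n-gram features and the Random Forest itself are omitted.

```python
import math
from collections import Counter

# Hypothetical leetspeak table; the paper's actual mapping is not given here.
LEET_MAP = str.maketrans({
    "@": "a", "4": "a", "0": "o", "1": "l", "3": "e",
    "$": "s", "5": "s", "7": "t", "!": "i",
})

# Assumed keyboard rows (QWERTY layout) used for walk detection.
KEYBOARD_ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm", "1234567890")


def normalize_leet(password: str) -> str:
    """Lowercase the password and map common leetspeak symbols to letters."""
    return password.lower().translate(LEET_MAP)


def shannon_entropy(text: str) -> float:
    """Per-character Shannon entropy of the string, in bits."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def has_keyboard_walk(password: str, min_len: int = 4) -> bool:
    """True if the password contains min_len adjacent keys from one row,
    typed left-to-right or right-to-left (e.g. 'qwer', 'lkjh')."""
    lowered = password.lower()
    for row in KEYBOARD_ROWS:
        for direction in (row, row[::-1]):
            for i in range(len(direction) - min_len + 1):
                if direction[i:i + min_len] in lowered:
                    return True
    return False


def has_sequence(password: str, min_len: int = 3) -> bool:
    """True if min_len consecutive characters rise or fall by exactly one
    code point (e.g. 'abc', '321')."""
    lowered = password.lower()
    run_up = run_down = 1
    for prev, cur in zip(lowered, lowered[1:]):
        diff = ord(cur) - ord(prev)
        run_up = run_up + 1 if diff == 1 else 1
        run_down = run_down + 1 if diff == -1 else 1
        if max(run_up, run_down) >= min_len:
            return True
    return False


def extract_features(password: str) -> dict:
    """Collect the illustrative features into one record per password."""
    return {
        "length": len(password),
        "raw_entropy": shannon_entropy(password),
        "leet_entropy": shannon_entropy(normalize_leet(password)),
        "keyboard_walk": has_keyboard_walk(password),
        "sequence": has_sequence(password),
    }
```

Under this sketch, `normalize_leet("P@ssw0rd1!")` yields `"passwordli"`, so a later dictionary or n-gram check can see the underlying word, and a string such as `"a@a@"` drops from 1 bit of per-character entropy to 0 once `@` is folded onto `a`. A record from `extract_features` could be combined with n-gram features and passed to a classifier, but that step is left out here.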
