Shepherd: Enabling Automatic and Large-Scale Login Security Studies
This work addresses the challenge of conducting security studies on login-protected websites at scale, a capability that matters to both researchers and practitioners in cybersecurity. The contribution is incremental in that it builds on existing scanning techniques.
The authors address the problem of studying authentication weaknesses at scale by introducing Shepherd, a scanning framework that automatically logs in to websites and thereby enables large-scale scans of post-login aspects. They demonstrate its capabilities with a scan for session hijacking susceptibility. Starting from a set of unverified credentials, some of them invalid, Shepherd logged in and verified the login on 6,273 unknown sites (12.4% of the test set). From this test set, which the authors describe as biased, 2,579 sites (41.4%) were found vulnerable to simple session hijacking attacks.
More and more parts of the internet are hidden behind a login field. This poses a barrier to any study predicated on scanning the internet. Moreover, the authentication process itself may be a weak point. To study authentication weaknesses at scale, automated login capabilities are needed. In this work we introduce Shepherd, a scanning framework to automatically log in on websites. The Shepherd framework enables us to perform large-scale scans of post-login aspects of websites. Shepherd scans a website for login fields, attempts to submit credentials and evaluates whether login was successful. We illustrate Shepherd's capabilities by means of a scan for session hijacking susceptibility. In this study, we use a set of unverified website credentials, some of which will be invalid. Using this set, Shepherd is able to fully automatically log in and verify that it is indeed logged in on 6,273 unknown sites, or 12.4% of the test set. We found that from our (biased) test set, 2,579 sites, i.e., 41.4%, are vulnerable to simple session hijacking attacks.
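The first and last steps described above (scanning a page for login fields and judging whether a login succeeded) can be illustrated with a minimal sketch. This is not Shepherd's actual implementation. The names `LoginFormFinder`, `find_login_forms`, and `looks_logged_in` are hypothetical, and the heuristics are assumptions chosen for illustration: a form counts as a login form if it contains a password input, and a page counts as logged in if it has no login form and contains a logout keyword.

```python
from html.parser import HTMLParser


class LoginFormFinder(HTMLParser):
    """Collects every <form> on a page together with its <input> fields.

    Illustrative heuristic only; not the detection logic used by Shepherd.
    """

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per completed <form>
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "", "fields": []}
        elif tag == "input" and self._current is not None:
            self._current["fields"].append({
                "name": attrs.get("name") or "",
                # HTML treats a missing type attribute as a text input
                "type": (attrs.get("type") or "text").lower(),
            })

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def find_login_forms(html):
    """Return the forms that contain at least one password input."""
    finder = LoginFormFinder()
    finder.feed(html)
    finder.close()
    return [
        form for form in finder.forms
        if any(field["type"] == "password" for field in form["fields"])
    ]


def looks_logged_in(html):
    """Assumed success check: no login form left and a logout keyword present."""
    text = html.lower()
    has_logout = any(word in text for word in ("logout", "log out", "sign out"))
    return not find_login_forms(html) and has_logout
```

A scanner built along these lines would fetch a page, pass its HTML to `find_login_forms`, submit credentials to the reported form action, and then call `looks_logged_in` on the response. A keyword check like this one is easily fooled, for example by a page that mentions "log out" in help text, so a real study would need a stronger verification step.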