Analyzing the Availability of E-Mail Addresses for PyPI Libraries
For OSS maintainers and security researchers, this study quantifies the reachability of Python library maintainers, highlighting both high coverage and areas for improvement.
This paper analyzes the availability of valid e-mail addresses for 754,413 PyPI libraries and finds that 79.1% have at least one valid address, with PyPI being the primary source (76.5%). Across dependency chains, up to 97.7% of direct and 97.5% of transitive dependencies provide valid contact information, although over 793,000 invalid entries were identified, primarily due to missing fields.
Background: Open Source Software (OSS) libraries form the backbone of modern software systems, yet their long-term sustainability often depends on maintainers being reachable for support, coordination, and security reporting. Aims: In this paper, we empirically analyze the availability of contact information, specifically e-mail addresses, across 754,413 Python libraries on the Python Package Index (PyPI) and their associated GitHub repositories. Method: We examine where maintainers provide this information, assess its validity, and explore coverage across individual libraries and their dependency chains. Results: Our findings show that 79.1% of libraries include at least one valid e-mail address, with PyPI serving as the primary source (76.5%). When analyzing dependency chains, we observe that up to 97.7% of direct and 97.5% of transitive dependencies provide valid contact information. At the same time, we identify over 793,000 invalid entries, primarily due to missing fields. Conclusions: Our results indicate strong maintainer reachability, while highlighting opportunities for improvement, such as offering clearer guidance to maintainers during the packaging process and introducing opt-in validation mechanisms for existing e-mail addresses.
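To make the kind of measurement described in the Method and Results concrete, the following is a minimal sketch of a per-library reachability check. It is not the authors' pipeline. It assumes each library is represented by a metadata dictionary with `author_email` and `maintainer_email` keys (the field names used in PyPI package metadata), and it treats an address as "valid" if it passes a simple syntactic pattern. The abstract does not state the paper's validity criteria, so the regular expression and the helper names `extract_valid_emails` and `coverage` are illustrative assumptions.

```python
import re

# Assumed syntactic plausibility check; the paper's actual validity
# criteria are not given in the abstract and may be stricter.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Assumed metadata keys, following PyPI package metadata field names.
_EMAIL_FIELDS = ("author_email", "maintainer_email")


def extract_valid_emails(info):
    """Return plausible addresses found in a metadata dict, without duplicates."""
    found = []
    for field in _EMAIL_FIELDS:
        value = info.get(field) or ""  # missing or None counts as empty
        for addr in _EMAIL_RE.findall(value):
            if addr not in found:
                found.append(addr)
    return found


def coverage(libraries):
    """Share of libraries with at least one plausible address."""
    if not libraries:
        return 0.0
    reachable = sum(1 for info in libraries if extract_valid_emails(info))
    return reachable / len(libraries)


if __name__ == "__main__":
    sample = [
        {"author_email": "Alice <alice@example.org>, bob@example.org"},
        {"author_email": "", "maintainer_email": "UNKNOWN"},
    ]
    print(coverage(sample))
```

In this sketch a library with empty or placeholder fields contributes to the invalid count, which mirrors the abstract's observation that invalid entries stem primarily from missing fields. A syntactic check says nothing about whether a mailbox actually exists or is monitored.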