TNM: A Tool for Mining of Socio-Technical Data from Git Repositories
TNM addresses the lack of reusable software for mining collaboration data from version control systems, reducing the engineering burden on researchers working on socio-technical analysis.
Extracting socio-technical data from Git repositories is essential for researchers but requires substantial engineering work. The paper presents TNM, an open-source tool for this task that is fast, flexible, and easily extensible.
Networks of collaboration between engineers are reflected in the traces of developers' activity in version control systems (VCSs). Extracting data from Git repositories is an essential task for researchers and practitioners working on socio-technical analysis, but it requires substantial engineering work. Despite growing interest in analysing socio-technical data and applying it in practice, there are no flexible and easily reusable tools for retrieving socio-technical information from VCSs. In the absence of a common toolkit, the burden of mining diverts researchers from their core research questions. In this paper, we present TNM, an open-source tool for mining socio-technical data from Git repositories. TNM is fast, flexible, and easily extensible. TNM is available on GitHub: https://github.com/JetBrains-Research/tnm
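
To make the notion of "socio-technical data" concrete, the following minimal sketch derives one simple collaboration network from Git history: two developers are linked if they have changed the same file, and the edge weight counts the files they share. It illustrates the kind of data involved and is not TNM's implementation or API. The function name `coedit_network` and the `@`-prefixed log format are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import combinations


def coedit_network(log_text):
    """Build a weighted co-editing network from `git log` text.

    Assumes the text was produced by (format chosen for this sketch):
        git log --name-only --pretty=format:'@%ae'
    so each commit starts with a line '@<author email>' followed by the
    paths changed in that commit. Paths that themselves start with '@'
    would be misread as authors; a real miner needs a safer delimiter.
    """
    authors_by_file = defaultdict(set)
    author = None
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines separate commits
        if line.startswith("@"):
            author = line[1:]  # start of a new commit
        elif author is not None:
            authors_by_file[line].add(author)

    # Edge weight = number of files both developers have changed.
    weights = defaultdict(int)
    for authors in authors_by_file.values():
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return dict(weights)


if __name__ == "__main__":
    sample = (
        "@alice@example.com\nsrc/a.py\nsrc/b.py\n\n"
        "@bob@example.com\nsrc/a.py\n"
    )
    print(coedit_network(sample))
```

Even this toy version has to deal with log parsing, identity handling, and output formats, which hints at the engineering work that a reusable tool is meant to absorb.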