Enhancing Understandability and Transparency of Research Software: Tracing Research to Code
For researchers and reviewers, this tool addresses the problem of understanding complex research software by automating the mapping between paper ideas and code.
The authors propose an LLM-based tool to automatically generate traces linking research ideas in papers to their code implementations, aiming to reduce onboarding time for newcomers and aid conference reviewers in evaluating replication packages. Initial experiments show the tool produces useful mappings.
Modern research relies heavily on software. A significant challenge for researchers is understanding the complex software used in their specific fields. We target two scenarios in this context: the long onboarding time for newcomers and the evaluation of replication packages by conference reviewers. We hypothesize that both scenarios can be significantly improved when there is a clear link between a paper's ideas and the code that implements them. To save time and staff effort, we propose an LLM-based automation tool that takes a paper and the software implementing it as input and generates a trace mapping between research ideas and their locations in the code. Initial experiments show that the tool can generate useful mappings.
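To make the notion of a trace mapping concrete, the following is a minimal sketch of one possible representation, not the authors' implementation. The class names (`CodeLocation`, `TraceMapping`), the file paths, and the idea labels are all hypothetical and chosen only for illustration; the LLM step that would populate the mapping is omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodeLocation:
    """A span of source lines in one file (hypothetical shape)."""
    path: str
    start_line: int
    end_line: int


@dataclass
class TraceMapping:
    """Maps each research idea (a free-text label) to code locations."""
    links: dict[str, list[CodeLocation]] = field(default_factory=dict)

    def add(self, idea: str, location: CodeLocation) -> None:
        # Record one more location that implements the given idea.
        self.links.setdefault(idea, []).append(location)

    def locations_for(self, idea: str) -> list[CodeLocation]:
        # Forward lookup: where in the code is this idea implemented?
        return self.links.get(idea, [])

    def ideas_at(self, path: str, line: int) -> list[str]:
        # Reverse lookup: which ideas does this line of code belong to?
        return [
            idea
            for idea, locs in self.links.items()
            if any(
                loc.path == path and loc.start_line <= line <= loc.end_line
                for loc in locs
            )
        ]


# Invented example entries; in the proposed tool these would come from the LLM.
trace = TraceMapping()
trace.add("contrastive loss", CodeLocation("model/loss.py", 12, 40))
trace.add("data augmentation", CodeLocation("data/transforms.py", 5, 30))
trace.add("contrastive loss", CodeLocation("train.py", 88, 95))
```

A newcomer would mostly use the forward lookup (`locations_for`) to find where an idea from the paper lives in the code, while a reviewer reading a replication package could use the reverse lookup (`ideas_at`) to check which part of the paper a given piece of code corresponds to.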