Kazumasa Shimari

h-index25

5papers

2citations

Novelty23%

AI Score43

Ranked #82,771 of 201,326 authors (top 41%)#984 in SE (top 29%)

5 Papers

SEApr 30Code

A Longitudinal Analysis of Good First Issue Practices and Newcomer Pull Requests in Popular OSS Projects

Hirotatsu Hoshikawa, Hidetake Tanaka, Kazumasa Shimari et al.

Open-source software (OSS) projects rely on effective newcomer onboarding to sustain their communities. OSS projects widely adopt "good first issue" (GFI) labels to highlight beginner-friendly tasks. As development practices continue to evolve, understanding how these onboarding mechanisms change over time is important for both maintainers and researchers. This study analyzes 406,826 issues and 1,117 newcomer GFI pull requests across 37 popular GitHub repositories (30 of which use GFI labels) over a four-year period from July 2021 to June 2025. We find that while the proportion of issues with GFI labels remained stable during the first three years, it underwent a statistically significant decline beginning in January 2024, with substantial variation across projects not explained by repository age or programming language. Despite this supply-side decline, newcomer engagement with GFI issues remains stable at approximately 27%, suggesting that GFI labels maintain consistent attractiveness. Examining the outcomes of this engagement, we find that the merge rate of newcomer GFI pull requests declined from 61.9% to 42.2%. Initial pull request characteristics such as description length and code size show no significant association with merge outcomes, indicating that success is not predicted by the quantitative characteristics of the initial submission alone. Together, these findings reveal a widening gap between stable newcomer interest in GFIs and the declining availability and success of GFI-based onboarding, underscoring the need for maintainers to sustain both GFI labeling and review support.

AIFeb 19

How AI Coding Agents Communicate: A Study of Pull Request Description Characteristics and Human Review Responses

Kan Watanabe, Rikuto Tsuchida, Takahiro Monno et al.

The rapid adoption of large language models has led to the emergence of AI coding agents that autonomously create pull requests on GitHub. However, how these agents differ in their pull request description characteristics, and how human reviewers respond to them, remains underexplored. In this study, we conduct an empirical analysis of pull requests created by five AI coding agents using the AIDev dataset. We analyze agent differences in pull request description characteristics, including structural features, and examine human reviewer response in terms of review activity, response timing, sentiment, and merge outcomes. We find that AI coding agents exhibit distinct PR description styles, which are associated with differences in reviewer engagement, response time, and merge outcomes. We observe notable variation across agents in both reviewer interaction metrics and merge rates. These findings highlight the role of pull request presentation and reviewer interaction dynamics in human-AI collaborative software development.

SEApr 27

How Do Developers Use Migration Guides? A Case Study of Log4j

Takahiro Monno, Kazumasa Shimari, Tetsuya Kanda et al.

Migration guides are a form of software documentation that helps developers address breaking changes introduced in library version updates. Prior studies have examined documents such as release notes, API reference manuals, and patch notes. However, research that focuses specifically on migration guides remains limited. Improving the usability and coverage of migration guides is essential for helping developers resolve breaking changes efficiently. Yet, we still lack a clear understanding of how migration guides are currently provided and how developers use them in practice. To fill this gap, we first investigate whether libraries known to introduce incompatibilities provide migration guides. We then conduct a detailed case study on Log4j, a library that has experienced large-scale breaking updates in the past. We empirically analyze how developers refer to and use the official migration guide in real-world projects. We find that pull request authors most frequently reference the migration guide in the pull request description, and that most references (82.81\%) link to the entire guide rather than specific sections. We also find that developers use migration guides not only during major version updates but also during subsequent maintenance tasks, suggesting that the guides serve as a resource throughout the entire migration process.

CVOct 20, 2025

Round Outcome Prediction in VALORANT Using Tactical Features from Video Analysis

Nirai Hayakawa, Kazumasa Shimari, Kazuma Yamasaki et al.

Recently, research on predicting match outcomes in esports has been actively conducted, but much of it is based on match log data and statistical information. This research targets the FPS game VALORANT, which requires complex strategies, and aims to build a round outcome prediction model by analyzing minimap information in match footage. Specifically, based on the video recognition model TimeSformer, we attempt to improve prediction accuracy by incorporating detailed tactical features extracted from minimap information, such as character position information and other in-game events. This paper reports preliminary results showing that a model trained on a dataset augmented with such tactical event labels achieved approximately 81% prediction accuracy, especially from the middle phases of a round onward, significantly outperforming a model trained on a dataset with the minimap information itself. This suggests that leveraging tactical features from match footage is highly effective for predicting round outcomes in VALORANT.

SEJun 18, 2025

Uncovering Intention through LLM-Driven Code Snippet Description Generation

Yusuf Sulistyo Nugroho, Farah Danisha Salam, Brittany Reid et al.

Documenting code snippets is essential to pinpoint key areas where both developers and users should pay attention. Examples include usage examples and other Application Programming Interfaces (APIs), which are especially important for third-party libraries. With the rise of Large Language Models (LLMs), the key goal is to investigate the kinds of description developers commonly use and evaluate how well an LLM, in this case Llama, can support description generation. We use NPM Code Snippets, consisting of 185,412 packages with 1,024,579 code snippets. From there, we use 400 code snippets (and their descriptions) as samples. First, our manual classification found that the majority of original descriptions (55.5%) highlight example-based usage. This finding emphasizes the importance of clear documentation, as some descriptions lacked sufficient detail to convey intent. Second, the LLM correctly identified the majority of original descriptions as "Example" (79.75%), which is identical to our manual finding, showing a propensity for generalization. Third, compared to the originals, the produced description had an average similarity score of 0.7173, suggesting relevance but room for improvement. Scores below 0.9 indicate some irrelevance. Our results show that depending on the task of the code snippet, the intention of the document may differ from being instructions for usage, installations, or descriptive learning examples for any user of a library.