SE AI · Aug 13, 2025

Exploring the Potential of Large Language Models in Fine-Grained Review Comment Classification

Linh Nguyen, Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam

arXiv:2508.09832v1 · 1 citation · h-index: 22 · SCAM

Originality: Incremental advance

AI Analysis

This work addresses the need for scalable code review analytics in software development. It reduces reliance on extensive manual annotation, but the advance is incremental because it applies existing LLMs to a specific domain.

The authors classify code review comments into 17 categories using Large Language Models (LLMs). The LLMs outperformed a state-of-the-art deep learning model, particularly on low-frequency categories, where the previous approach struggled because of limited training data.

Code review is a crucial practice in software development. Because code review is now lightweight, it can surface a wide range of issues, some of them trivial. Research has investigated automated approaches to classify review comments to gauge the effectiveness of code reviews. However, previous studies have primarily relied on supervised machine learning, which requires extensive manual annotation to train the models effectively. To address this limitation, we explore the potential of using Large Language Models (LLMs) to classify code review comments. We assess the performance of LLMs in classifying 17 categories of code review comments. Our results show that LLMs can classify code review comments, outperforming the state-of-the-art approach using a trained deep learning model. In particular, LLMs achieve better accuracy in classifying the five most useful categories, which the state-of-the-art approach struggles with because of the small number of training examples. Rather than relying solely on a specific small training data distribution, our results show that LLMs provide balanced performance across high- and low-frequency categories. These results suggest that LLMs could offer a scalable solution for code review analytics to improve the effectiveness of the code review process.
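The claim about balanced performance across high- and low-frequency categories is easiest to check with per-category scores rather than a single overall accuracy. The sketch below is only an illustration of that idea, not the authors' evaluation code: the function names, the two category labels ("naming", "logic"), and the toy label lists are all assumptions made for the example, and the paper's own 17 categories and metrics are not reproduced here.

```python
from collections import Counter


def per_category_f1(gold, predicted):
    """Return {category: F1} from two parallel lists of labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for category g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true label but was missed
    scores = {}
    for cat in set(gold) | set(predicted):
        denom = 2 * tp[cat] + fp[cat] + fn[cat]
        scores[cat] = 2 * tp[cat] / denom if denom else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of per-category F1, so rare categories count equally."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: "naming" is frequent, "logic" is rare.
gold = ["naming", "naming", "naming", "naming", "logic", "logic"]
pred = ["naming", "naming", "naming", "logic", "logic", "naming"]

scores = per_category_f1(gold, pred)
for category in sorted(scores):
    print(f"{category}: F1 = {scores[category]:.3f}")
print(f"macro-F1 = {macro_f1(scores):.3f}")
```

Because the macro average weights every category equally, a classifier that does well only on frequent categories scores visibly lower than its overall accuracy would suggest.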
