SE, Jan 23
Revisiting the Role of Natural Language Code Comments in Code Translation
Monika Gupta, Ajay Meena, Anamitra Roy Choudhury et al.
The advent of large language models (LLMs) has ushered in a new era in automated code translation across programming languages. Since most code-specific LLMs are pretrained on well-commented code from large repositories like GitHub, it is reasonable to hypothesize that natural language code comments could aid in improving translation quality. Despite their potential relevance, comments are largely absent from existing code translation benchmarks, rendering their impact on translation quality inadequately characterised. In this paper, we present a large-scale empirical study evaluating the impact of comments on translation performance. Our analysis involves more than $80,000$ translations, with and without comments, of $1100+$ code samples from two distinct benchmarks covering pairwise translations between five different programming languages: C, C++, Go, Java, and Python. Our results provide strong evidence that code comments, particularly those that describe the overall purpose of the code rather than line-by-line functionality, significantly enhance translation accuracy. Based on these findings, we propose COMMENTRA, a code translation approach, and demonstrate that it can potentially double the performance of LLM-based code translation. To the best of our knowledge, our study is the first in terms of its comprehensiveness, scale, and language coverage on how to improve code translation accuracy using code comments.
CL, Nov 2, 2017
Hi, how can I help you?: Automating enterprise IT support help desks
Senthil Mani, Neelamadhav Gantayat, Rahul Aralikatte et al.
Question answering is one of the primary challenges of natural language understanding. In realizing such a system, providing complex, long answers is more challenging than factoid answering because the former requires context disambiguation. The methods explored in the literature can be broadly classified into three categories: 1) classification-based, 2) knowledge-graph-based, and 3) retrieval-based. Individually, none of them addresses the needs of an enterprise-wide assistance system for the IT support and maintenance domain. In this domain, the variance of answers is large, ranging from factoids to structured operating procedures; the knowledge is spread across heterogeneous data sources such as application-specific documentation and ticket management systems; and no single technique for general-purpose assistance scales to such a landscape. To address this, we have built a cognitive platform with capabilities adapted for this domain. Further, we have built a general-purpose question answering system on this platform that can be instantiated for multiple products and technologies in the support domain. The system uses a novel hybrid answering model that orchestrates across a deep learning classifier, a knowledge-graph-based context disambiguation module, and a sophisticated bag-of-words search system. This orchestration performs context switching for a given question and also hands the question off smoothly to a human expert if none of the automated techniques can provide a confident answer. The system has been deployed across 675 internal enterprise IT support and maintenance projects.
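The orchestration with human hand-off described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Answer` type, the `orchestrate` function, the fixed try-in-order policy, the 0.7 confidence threshold, and the two toy answerers that stand in for the classifier and the bag-of-words search.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1]
    source: str


# Hypothetical interface: an answerer returns an Answer, or None if it has nothing.
Answerer = Callable[[str], Optional[Answer]]


def orchestrate(question: str, answerers: List[Answerer], threshold: float = 0.7) -> Answer:
    """Return the first answer whose confidence meets the threshold.

    If no automated answerer is confident enough, hand off to a human.
    The ordering policy and threshold are illustrative assumptions.
    """
    for answerer in answerers:
        answer = answerer(question)
        if answer is not None and answer.confidence >= threshold:
            return answer
    return Answer(text="Routing to a human expert.", confidence=0.0, source="human")


def toy_classifier(question: str) -> Optional[Answer]:
    # Stand-in for a trained classifier: a single keyword rule.
    if "password" in question.lower():
        return Answer("Use the self-service reset portal.", 0.9, "classifier")
    return None


def toy_search(question: str) -> Optional[Answer]:
    # Stand-in for bag-of-words search: word overlap against a tiny index.
    index = {"restart the vpn client": "Restart the VPN client from the system tray."}
    words = set(question.lower().split())
    best: Optional[Answer] = None
    for key, text in index.items():
        key_words = set(key.split())
        score = len(words & key_words) / len(key_words)
        if best is None or score > best.confidence:
            best = Answer(text, score, "search")
    return best


pipeline = [toy_classifier, toy_search]
print(orchestrate("How do I reset my password?", pipeline).source)
print(orchestrate("how to restart the vpn client", pipeline).source)
print(orchestrate("printer is on fire", pipeline).source)
```

A real system would likely score all components and arbitrate among them rather than stopping at the first confident answer. The sketch only shows the fallback shape, in which a question reaches a human expert when no automated component is confident.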
CY, Sep 2, 2013
A Case-Study on Teaching Undergraduate-Level Software Engineering Course Using Inverted-Classroom, Large-Group, Real-Client and Studio-Based Instruction Model
Ashish Sureka, Monika Gupta, Dipto Sarkar et al.
We present a case study on teaching an undergraduate-level course on Software Engineering (second year and fifth semester of a bachelor's program in Computer Science) at a State University (New Delhi, India) using a novel instruction model. Our approach has four main elements: inverted or flipped classroom, studio-based learning, real-client projects and deployment, and large teams with peer evaluation. We present our motivation and approach, the challenges encountered, pedagogical benefits, findings (both positive and negative), and recommendations. Our motivation was to teach Software Engineering through active learning (significantly increasing engagement and collaboration with the instructor and other students in the class), teamwork, a balance between theory and practice, the technical and managerial skills encountered in the real world, and problem-based learning (through an intensive semester-long project). We conducted a detailed survey (anonymous, optional, and online) and present the results of the student responses. The survey results reveal that for nearly every student (class size: 89) the instruction model was new and interesting, and that it had a positive impact on motivation in addition to meeting the learning outcomes of the course.