SE | LG | Dec 22, 2021

End to End Software Engineering Research

arXiv:2112.11858v1 | 4 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses software engineering researchers and practitioners by providing a more automated approach to metric prediction, though it appears incremental as it applies an existing end-to-end paradigm to a new domain.

The paper proposes an end-to-end learning framework that predicts software engineering metrics, such as defects and code quality, directly from raw source code. According to the authors, this removes the need for domain experts and makes it possible to extract new knowledge. The paper also describes a dataset of 5M files from 15k projects constructed for this purpose.

End-to-end learning is machine learning that starts from raw data and predicts a desired concept, with all intermediate steps done automatically. In the software engineering context, we see it as starting from the source code and predicting process metrics. This framework can be used for predicting defects, code quality, productivity, and more. End-to-end learning improves over feature-based machine learning by not requiring domain experts and by being able to extract new knowledge. We describe a dataset of 5M files from 15k projects constructed for this goal. The dataset is constructed in a way that enables not only predicting concepts but also investigating their causes.
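To make the "raw source code in, process metric out" idea concrete, here is a minimal sketch in Python. It is not the authors' method or dataset. The toy files, the binary "defect-prone" labels, and the names `encode`, `train`, and `predict` are all hypothetical. Hashed character trigrams with logistic regression stand in for the learned representation a real end-to-end system would more likely obtain from a neural model. The point is only that no hand-crafted metric (complexity, churn, and so on) is computed along the way.

```python
import math
import zlib

N_BUCKETS = 512  # size of the hashed trigram space (arbitrary assumption)

def encode(source, n=3):
    """Turn raw source text into L2-normalised hashed character n-gram counts."""
    counts = {}
    for i in range(len(source) - n + 1):
        bucket = zlib.crc32(source[i:i + n].encode("utf-8")) % N_BUCKETS
        counts[bucket] = counts.get(bucket, 0.0) + 1.0
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {b: c / norm for b, c in counts.items()}

def predict(weights, bias, source):
    """Return the predicted probability that the file has label 1."""
    z = bias + sum(weights[b] * v for b, v in encode(source).items())
    return 1.0 / (1.0 + math.exp(-z))

def train(files, labels, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * N_BUCKETS
    bias = 0.0
    for _ in range(epochs):
        for source, label in zip(files, labels):
            error = predict(weights, bias, source) - label
            for b, v in encode(source).items():
                weights[b] -= lr * error * v
            bias -= lr * error
    return weights, bias

# Hypothetical toy corpus: label 1 marks files we pretend are defect-prone.
FILES = [
    "try:\n    run()\nexcept:\n    pass\n",
    "try:\n    load()\nexcept:\n    pass\n",
    "def add(a, b):\n    return a + b\n",
    "def mul(a, b):\n    return a * b\n",
]
LABELS = [1, 1, 0, 0]

weights, bias = train(FILES, LABELS)
risky = predict(weights, bias, "try:\n    save()\nexcept:\n    pass\n")
clean = predict(weights, bias, "def sub(a, b):\n    return a - b\n")
print(f"risky={risky:.2f} clean={clean:.2f}")
```

In a feature-based setup, a domain expert would instead decide up front which metrics to compute for each file. Here the only inputs are the file text and its label, which is the property the abstract attributes to end-to-end learning.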

Code Implementations: 1 repo