LG · ML · Nov 4, 2019

Learning based Methods for Code Runtime Complexity Prediction

arXiv:1911.01155v1 · 20 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of providing real-time feedback on code efficiency for developers, with potential applications in automated grading and static analysis tools, though it is incremental as it builds on existing methods for a known bottleneck.

The paper frames code runtime complexity prediction as a machine learning task, introduces a new annotated dataset, CoRCoD, and establishes baselines using feature engineering and code embeddings, which it reports as state-of-the-art results.

Predicting the runtime complexity of a program is an arduous task. Even for humans, predicting time complexity with high fidelity for arbitrary code requires subtle analysis and comprehensive knowledge of algorithms. By Turing's proof that the halting problem is undecidable, determining the complexity of arbitrary code is impossible in general. Nevertheless, an approximate solution can give developers real-time feedback on the efficiency of their code. In this work, we model this problem as a machine learning task and check its feasibility with thorough analysis. Because no open-source dataset exists for this task, we propose our own annotated dataset, CoRCoD: Code Runtime Complexity Dataset, extracted from online judges. We establish baselines using two different approaches, feature engineering and code embeddings, achieve state-of-the-art results, and compare their performance. Such solutions can be useful in applications such as automatically grading coding assignments and IDE-integrated tools for static code analysis.
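To make the feature-engineering idea in the abstract concrete, here is a minimal sketch of extracting hand-crafted structural features from source code. It is an illustration only: the feature names (`num_loops`, `max_loop_depth`, `num_ifs`, `num_calls`), the helper `loop_features`, and the choice of Python's `ast` module are assumptions made for this example, not the paper's actual feature set or tooling.

```python
import ast

# Node types treated as loops in this sketch (comprehensions are ignored).
LOOP_NODES = (ast.For, ast.While)


def loop_features(source: str) -> dict:
    """Return a few hypothetical structural features for a Python snippet."""
    tree = ast.parse(source)

    def max_depth(node, depth=0):
        # Depth grows by one each time we descend into a loop node.
        here = depth + 1 if isinstance(node, LOOP_NODES) else depth
        children = [max_depth(c, here) for c in ast.iter_child_nodes(node)]
        return max([here] + children)

    nodes = list(ast.walk(tree))
    return {
        "num_loops": sum(isinstance(n, LOOP_NODES) for n in nodes),
        "max_loop_depth": max_depth(tree),
        "num_ifs": sum(isinstance(n, ast.If) for n in nodes),
        "num_calls": sum(isinstance(n, ast.Call) for n in nodes),
    }


# A doubly nested loop, the kind of structure that often signals quadratic time.
SAMPLE = """
def count_pairs(xs):
    total = 0
    for a in xs:
        for b in xs:
            if a < b:
                total += 1
    return total
"""

features = loop_features(SAMPLE)
```

A feature dictionary like this could be turned into a vector and passed to any standard classifier that predicts a complexity class label; which classifier and which labels to use are left open here, since the text above does not specify them.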

