Complexity-based code embeddings
The work addresses the need for better code representations for programming-competition submissions, though it appears incremental because it builds on existing complexity-based methods.
The paper represents source code as numerical embeddings by analyzing program behavior across inputs and fitting complexity functions to the measured metrics, and it reports an average F1-score on a multi-label dataset with 11 classes built from Codeforces code snippets.
This paper presents a generic method for transforming the source code of various algorithms into numerical embeddings, by dynamically analysing the behaviour of computer programs against different inputs and by tailoring multiple generic complexity functions to the analysed metrics. The algorithm embeddings are based on r-Complexity. Using the proposed code embeddings, we present an implementation of the XGBoost algorithm that achieves an average F1-score on a multi-label dataset with 11 classes, built using real-world code snippets submitted for programming competitions on the Codeforces platform.
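To make the idea of fitting complexity functions to measured metrics concrete, here is a minimal sketch under stated assumptions. It is not the paper's r-Complexity formulation: the candidate growth functions in `CANDIDATES`, the helper names `fit_scale`, `embed`, and `best_fit`, and the least-squares fit of a single scale factor `r` are all illustrative choices. The input is assumed to be a list of input sizes and the corresponding value of one measured metric (for example, an instruction count).

```python
import math

# Hypothetical candidate growth functions g(n); the paper's actual family
# of complexity functions is not specified in this abstract.
CANDIDATES = {
    "constant": lambda n: 1.0,
    "log": lambda n: math.log2(n),
    "linear": lambda n: float(n),
    "linearithmic": lambda n: n * math.log2(n),
    "quadratic": lambda n: float(n) ** 2,
}


def fit_scale(sizes, values, g):
    """Least-squares fit of values ~ r * g(n).

    Returns (r, relative residual). Assumes sizes >= 1, at least one
    size > 1, and that values are not all zero.
    """
    basis = [g(n) for n in sizes]
    r = sum(b * v for b, v in zip(basis, values)) / sum(b * b for b in basis)
    err = math.sqrt(sum((v - r * b) ** 2 for b, v in zip(basis, values)))
    norm = math.sqrt(sum(v * v for v in values))
    return r, err / norm


def embed(sizes, values):
    """Concatenate (r, residual) for every candidate into one flat vector."""
    features = []
    for g in CANDIDATES.values():
        features.extend(fit_scale(sizes, values, g))
    return features


def best_fit(sizes, values):
    """Name of the candidate with the smallest relative residual."""
    return min(
        CANDIDATES,
        key=lambda name: fit_scale(sizes, values, CANDIDATES[name])[1],
    )


if __name__ == "__main__":
    # Synthetic measurements of a metric that grows quadratically.
    sizes = [16, 32, 64, 128, 256]
    values = [3.0 * n * n for n in sizes]
    print(best_fit(sizes, values))
    print(embed(sizes, values))
```

A vector produced this way, computed per metric and concatenated, is the kind of fixed-length numerical input that a gradient-boosted classifier such as XGBoost could consume. The exact feature layout used in the paper may differ.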