LG · AI · Jan 1

Complexity-based code embeddings

arXiv:2601.00924v1 · 2 citations · h-index: 20
Originality: Incremental advance
AI Analysis

The paper addresses the need for better code representations for programming-competition submissions, though it appears incremental because it builds on existing complexity-based methods.

The paper tackles the problem of representing source code as numerical embeddings by analyzing program behavior across inputs and applying complexity functions, achieving an average F1-score on a multi-label dataset with 11 classes from Codeforces code snippets.

This paper presents a generic method for transforming the source code of various algorithms into numerical embeddings, by dynamically analysing the behaviour of computer programs against different inputs and by tailoring multiple generic complexity functions for the analysed metrics. The algorithm embeddings used are based on r-Complexity. Using the proposed code embeddings, we present an implementation of the XGBoost algorithm that achieves an average F1-score on a multi-label dataset with 11 classes, built using real-world code snippets submitted for programming competitions on the Codeforces platform.
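
To make the idea in the abstract concrete, here is a minimal sketch of a complexity-based embedding. It is not the paper's implementation. It assumes a toy instrumented program (`count_pair_comparisons`), a single measured metric (step count), and a hypothetical set of candidate complexity functions g(n). For each candidate it fits a constant r so that the metric is approximately r * g(n), then concatenates each constant and its relative fit error into one vector. All names below are illustrative.

```python
import math

# Hypothetical candidate complexity functions g(n).
# The set actually used in the paper is not stated in this summary.
CANDIDATES = {
    "log n": lambda n: math.log2(n),
    "n": lambda n: float(n),
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: float(n * n),
}


def fit_constant(sizes, values, g):
    """Least-squares fit of values ~ r * g(n); returns (r, relative residual)."""
    gs = [g(n) for n in sizes]
    r = sum(v * x for v, x in zip(values, gs)) / sum(x * x for x in gs)
    sse = sum((v - r * x) ** 2 for v, x in zip(values, gs))
    norm = sum(v * v for v in values)
    return r, math.sqrt(sse / norm)


def embed(sizes, values):
    """Concatenate (r, residual) for every candidate into one flat vector."""
    vec = []
    for g in CANDIDATES.values():
        vec.extend(fit_constant(sizes, values, g))
    return vec


def best_fit(sizes, values):
    """Name of the candidate with the smallest relative residual."""
    return min(
        CANDIDATES,
        key=lambda name: fit_constant(sizes, values, CANDIDATES[name])[1],
    )


def count_pair_comparisons(n):
    """Toy instrumented program: counts the steps of an all-pairs scan."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps


# Run the program on several input sizes and record the metric.
sizes = [16, 32, 64, 128, 256]
values = [count_pair_comparisons(n) for n in sizes]
embedding = embed(sizes, values)
```

In this sketch the vector has two entries per candidate function. A real pipeline would presumably measure several metrics per program and pass the resulting vectors to a classifier such as XGBoost; that step is omitted here.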

