SEApr 14Code
Social Life of Code: Modeling Evolution through Code Embedding and Opinion DynamicsYulong He, Nikita Verbin, Sergey Kovalchuk
Software repositories provide a detailed record of software evolution by capturing developer interactions through code-related activities such as pull requests and modifications. To better understand the underlying dynamics of codebase evolution, we introduce a novel approach that integrates semantic code embeddings with opinion dynamics theory, offering a quantitative framework to analyze collaborative development processes. Our approach begins by encoding code snippets into high-dimensional vector representations using state-of-the-art code embedding models, preserving both syntactic and semantic features. These embeddings are then processed using Principal Component Analysis (PCA) for dimensionality reduction, with data normalized to ensure comparability. We model temporal evolution using the Expressed-Private Opinion (EPO) model to derive trust matrices and track opinion trajectories across development cycles. These opinion trajectories reflect the underlying dynamics of consensus formation, influence propagation, and evolving alignment (or divergence) within developer communities -- revealing implicit collaboration patterns and knowledge-sharing mechanisms that are otherwise difficult to observe. By bridging software engineering and computational social science, our method provides a principled way to quantify software evolution, offering new insights into developer influence, consensus formation, and project sustainability. We evaluate our approach on data from three prominent open-source GitHub repositories, demonstrating its ability to reveal interpretable behavioral trends and variations in developer interactions. The results highlight the utility of our framework in improving open-source project maintenance through data-driven analysis of collaboration dynamics.
SEFeb 5Code
ArkTS-CodeSearch: A Open-Source ArkTS Dataset for Code RetrievalYulong He, Artem Ermakov, Sergey Kovalchuk et al.
ArkTS is a core programming language in the OpenHarmony ecosystem, yet research on ArkTS code intelligence is hindered by the lack of public datasets and evaluation benchmarks. This paper presents a large-scale ArkTS dataset constructed from open-source repositories, targeting code retrieval and code evaluation tasks. We design a single-search task, where natural language comments are used to retrieve corresponding ArkTS functions. ArkTS repositories are crawled from GitHub and Gitee, and comment-function pairs are extracted using tree-sitter-arkts, followed by cross-platform deduplication and statistical analysis of ArkTS function types. We further evaluate existing open-source code embedding models on the single-search task and perform fine-tuning using both ArkTS and TypeScript training datasets, resulting in a high-performing model for ArkTS code understanding. This work establishes the first systematic benchmark for ArkTS code retrieval. Both the dataset and our fine-tuned model are available at https://huggingface.co/hreyulog/embedinggemma_arkts and https://huggingface.co/datasets/hreyulog/arkts-code-docstring .
AIApr 4Code
Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Process and DualJudgeYulong He, Ivan Smirnov, Dmitry Fedrushkov et al.
Effective evaluation of large language models (LLMs) remains a critical bottleneck, as conventional direct scoring often yields inconsistent and opaque judgments. In this work, we adapt the Analytic Hierarchy Process (AHP) to LLM-based evaluation and, more importantly, propose a confidence-aware Fuzzy AHP (FAHP) extension that models epistemic uncertainty via triangular fuzzy numbers modulated by LLM-generated confidence scores. Systematically validated on JudgeBench, our structured approach decomposes assessments into explicit criteria and incorporates uncertainty-aware aggregation, producing more calibrated judgments. Extensive experiments demonstrate that both crisp and fuzzy AHP consistently outperform direct scoring across model scales and dataset splits, with FAHP showing superior stability in uncertain comparison scenarios. Building on these insights, we propose \textbf{DualJudge}, a hybrid framework inspired by Dual-Process Theory that adaptively fuses holistic direct scores with structured AHP outputs via consistency-aware weighting. DualJudge achieves state-of-the-art performance, underscoring the complementary strengths of intuitive and deliberative evaluation paradigms. These results establish uncertainty-aware structured reasoning as a principled pathway toward more reliable LLM assessment. Code is available at https://github.com/hreyulog/AHP_llm_judge.
AIMay 26, 2025
Style2Code: A Style-Controllable Code Generation Framework with Dual-Modal Contrastive Representation LearningDutao Zhang, Nicolas Rafael Arroyo Arias, YuLong He et al.
Controllable code generation, the ability to synthesize code that follows a specified style while maintaining functionality, remains a challenging task. We propose a two-stage training framework combining contrastive learning and conditional decoding to enable flexible style control. The first stage aligns code style representations with semantic and structural features. In the second stage, we fine-tune a language model (e.g., Flan-T5) conditioned on the learned style vector to guide generation. Our method supports style interpolation and user personalization via lightweight mixing. Compared to prior work, our unified framework offers improved stylistic control without sacrificing code correctness. This is among the first approaches to combine contrastive alignment with conditional decoding for style-guided code generation.