Roman Pakhomov

h-index: 10
2 papers

2 Papers

53.5 · LG · May 8
SDG-MoE: Signed Debate Graph Mixture-of-Experts

Stepan Kulibaba, Kirill Labzin, Artem Dzhalilov et al.

Sparse MoE models achieve a good balance between capacity and compute by routing each token to a small subset of experts. However, in most MoE architectures, once a token is routed, the selected experts process it independently and their outputs are combined via a weighted sum. This leaves open whether enabling communication among them could improve performance. While prior work has raised this question, direct interaction among the active routed experts remains underexplored. In this paper, we propose SDG-MoE (Signed Debate Graph Mixture-of-Experts), a novel architecture that adds a lightweight, iterative deliberation step before final aggregation. SDG-MoE introduces three components: (i) two learned interaction matrices over the active experts, a support graph $A^+$ and a critique graph $A^-$, capturing reinforcing and corrective influences; (ii) a signed message-passing step that updates expert representations before aggregation; and (iii) a disagreement-gated Friedkin-Johnsen-style anchoring that controls deliberation strength while preventing expert drift. Together, these enable a structured deliberation process where interaction strength scales with disagreement and specialization is preserved. We also provide a theoretical analysis establishing stability conditions on expert states and showing that deliberation adds only low-order overhead over the active set. In controlled three-seed pretraining experiments, SDG-MoE improves validation perplexity over both an unsigned graph communication baseline and vanilla MoE, outperforming the strongest baseline by 19.8%, and gives the best external perplexity on WikiText-103, C4, and Paloma among the compared systems.
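The abstract does not give the update equations, so the following is only a minimal sketch of what a signed, disagreement-gated deliberation step with Friedkin-Johnsen-style anchoring could look like. It uses the standard Friedkin-Johnsen form, x(t+1) = λ W x(t) + (1 − λ) x(0), with the signed matrix W = A⁺ − A⁻. The function names, the gate λ = λ_max · d / (1 + d), and the disagreement measure d are all assumptions made for illustration, not the authors' definitions.

```python
def disagreement(states):
    """Mean squared distance of the expert states from their centroid (assumed measure)."""
    k, dim = len(states), len(states[0])
    centroid = [sum(s[c] for s in states) / k for c in range(dim)]
    return sum((s[c] - centroid[c]) ** 2 for s in states for c in range(dim)) / k


def deliberate(h0, a_pos, a_neg, steps=2, lam_max=0.5):
    """Run a few signed message-passing steps over the active experts.

    h0     : list of k expert output vectors (the anchors)
    a_pos  : k x k support weights (A+), a_neg : k x k critique weights (A-)
    Returns the updated expert states and the gate value used.
    """
    k, dim = len(h0), len(h0[0])
    d = disagreement(h0)
    lam = lam_max * d / (1.0 + d)  # hypothetical gate: 0 when experts agree, below lam_max otherwise
    h = [row[:] for row in h0]
    for _ in range(steps):
        updated = []
        for i in range(k):
            # Signed message: support pulls toward expert j, critique pushes away.
            msg = [
                sum((a_pos[i][j] - a_neg[i][j]) * h[j][c] for j in range(k))
                for c in range(dim)
            ]
            # Anchoring: blend the message with the expert's original output.
            updated.append([(1.0 - lam) * h0[i][c] + lam * msg[c] for c in range(dim)])
        h = updated
    return h, lam


# Toy usage with two active experts that disagree.
experts = [[1.0, 0.0], [0.0, 1.0]]
support = [[0.5, 0.5], [0.5, 0.5]]
critique = [[0.0, 0.0], [0.0, 0.0]]
states, gate = deliberate(experts, support, critique)
```

In this sketch the gate vanishes when the experts already agree, so the states stay at their anchors and the layer reduces to ordinary weighted aggregation. If the absolute row sums of A⁺ − A⁻ are at most 1 and λ < 1, each step is a contraction, which is one simple way to keep expert states bounded.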

AI · Aug 13, 2025
KompeteAI: Accelerated Autonomous Multi-Agent System for End-to-End Pipeline Generation for Machine Learning Problems

Stepan Kulibaba, Artem Dzhalilov, Roman Pakhomov et al.

Recent Large Language Model (LLM)-based AutoML systems demonstrate impressive capabilities but face significant limitations such as constrained exploration strategies and a severe execution bottleneck. Exploration is hindered by one-shot methods lacking diversity and Monte Carlo Tree Search (MCTS) approaches that fail to recombine strong partial solutions. The execution bottleneck arises from lengthy code validation cycles that stifle iterative refinement. To overcome these challenges, we introduce KompeteAI, a novel AutoML framework with dynamic solution space exploration. Unlike previous MCTS methods that treat ideas in isolation, KompeteAI introduces a merging stage that composes top candidates. We further expand the hypothesis space by integrating Retrieval-Augmented Generation (RAG), sourcing ideas from Kaggle notebooks and arXiv papers to incorporate real-world strategies. KompeteAI also addresses the execution bottleneck via a predictive scoring model and an accelerated debugging method, assessing solution potential using early-stage metrics to avoid costly full-code execution. This approach accelerates pipeline evaluation 6.9 times. KompeteAI outperforms leading methods (e.g., RD-agent, AIDE, and ML-Master) by an average of 3% on the primary AutoML benchmark, MLE-Bench. Additionally, we propose Kompete-bench to address limitations in MLE-Bench; KompeteAI also achieves state-of-the-art results on it.
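The idea of scoring candidates from early-stage metrics before paying for full execution can be sketched as follows. This is an illustration only: `screen_candidates`, `predict_score`, `run_full`, and the toy numbers are hypothetical stand-ins, and the abstract does not describe how KompeteAI's predictive scoring model is actually built.

```python
def screen_candidates(candidates, predict_score, run_full, keep=2):
    """Rank candidates by a cheap predicted score and fully execute only the top `keep`."""
    ranked = sorted(candidates, key=predict_score, reverse=True)
    return {name: run_full(name) for name in ranked[:keep]}


# Toy usage: made-up early-stage scores for four hypothetical pipelines.
early_scores = {"gbm": 0.81, "mlp": 0.64, "linear": 0.52, "gbm+mlp": 0.84}
full_runs = []


def run_full(name):
    """Stand-in for an expensive full pipeline execution; records each call."""
    full_runs.append(name)
    return early_scores[name]


results = screen_candidates(list(early_scores), early_scores.get, run_full, keep=2)
```

Only the two highest-ranked candidates reach `run_full` here, so the cost of full execution scales with `keep` rather than with the number of generated candidates.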