AIMay 30
LLM-Driven Co-Evolutionary Automated Heuristic Design for Bi-Component Coupled Combinatorial OptimizationMingen Kuang, Xudong Deng, Xi Lin et al.
While Large Language Models (LLMs) have recently shown promise in Automated Heuristic Design (AHD), existing methods typically generate and evolve heuristics as a single operator or search strategy, limiting their ability to model strong coupling among multiple decision substructures in problems such as the Traveling Thief Problem (TTP) and the Traveling Purchaser Problem (TPP). In this work, we propose CoEvo-AHD, an LLM-driven dual-population co-evolutionary framework for automated heuristic design in coupled combinatorial optimization. Unlike prior methods that evolve individual heuristics in isolation, CoEvo-AHD leverages LLMs to co-evolve two closely related operator populations. A cooperative evaluation mechanism explicitly captures interactions between route and selection operators, while pairwise scoring and synergistic joint crossover help discover complementary operator logic for joint improvement across coupled decision subspaces. We further design a tool-invocation environment library that encapsulates frequently used core operations, such as local-search delta computation, into callable functions, enabling LLM-generated operators to use standardized interfaces instead of reimplementing inefficient and error-prone problem-specific loops. Experiments on TTP and TPP show that CoEvo-AHD automatically discovers cooperative heuristic combinations and achieves competitive solution quality against traditional heuristics.
NEMar 21Code
Decoupling Numerical and Structural Parameters: An Empirical Study on Adaptive Genetic Algorithms via Deep Reinforcement Learning for the Large-Scale TSPHongyu Wang, Yuhan Jing, Yibing Shi et al.
Proper parameter configuration is a prerequisite for the success of Evolutionary Algorithms (EAs). While various adaptive strategies have been proposed, it remains an open question whether all control dimensions contribute equally to algorithmic scalability. To investigate this, we categorize control variables into numerical parameters (e.g., crossover and mutation rates) and structural parameters (e.g., population size and operator switching), hypothesizing that they play distinct roles. This paper presents an empirical study utilizing a dual-level Deep Reinforcement Learning (DRL) framework to decouple and analyze the impact of these two dimensions on the Traveling Salesman Problem (TSP). We employ a Recurrent PPO agent to dynamically regulate these parameters, treating the DRL model as a probe to reveal evolutionary dynamics. Experimental results confirm the effectiveness of this approach: the learned policies outperform static baselines, reducing the optimality gap by approximately 45% on the largest tested instance (rl5915). Building on this validated framework, our ablation analysis reveals a fundamental insight: while numerical tuning offers local refinement, structural plasticity is the decisive factor in preventing stagnation and facilitating escape from local optima. These findings suggest that future automated algorithm design should prioritize dynamic structural reconfiguration over fine-grained probability adjustment. To facilitate reproducibility, the source code is available at https://github.com/StarDream1314/DRLGA-TSP
CLApr 3Code
JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token EfficiencyAichen Cai, Anmeng Zhang, Anyu Li et al.
We introduce JoyAI-LLM Flash, an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency in the sub-50B parameter regime. JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline, including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale reinforcement learning (RL) across diverse environments. To improve token efficiency, JoyAI-LLM Flash strategically balances \emph{thinking} and \emph{non-thinking} cognitive modes and introduces FiberPO, a novel RL algorithm inspired by fibration theory that decomposes trust-region maintenance into global and local components, providing unified multi-scale stability control for LLM policy optimization. To enhance architectural sparsity, the model comprises 48B total parameters while activating only 2.7B parameters per forward pass, achieving a substantially higher sparsity ratio than contemporary industry leading models of comparable scale. To further improve inference throughput, we adopt a joint training-inference co-design that incorporates dense Multi-Token Prediction (MTP) and Quantization-Aware Training (QAT). We release the checkpoints for both JoyAI-LLM-48B-A3B Base and its post-trained variants on Hugging Face to support the open-source community.
OCJul 29, 2024
On the Effects of Smoothing Rugged Landscape by Different Toy Problems: A Case Study on UBQPWei Wang, Jialong Shi, Jianyong Sun et al.
The hardness of the Unconstrained Binary Quadratic Program (UBQP) problem is due its rugged landscape. Various algorithms have been proposed for UBQP, including the Landscape Smoothing Iterated Local Search (LSILS). Different from other UBQP algorithms, LSILS tries to smooth the rugged landscape by building a convex combination of the original UBQP and a toy UBQP. In this paper, our study further investigates the impact of smoothing rugged landscapes using different toy UBQP problems, including a toy UBQP with matrix ^Q1 (construct by "+/-1"), a toy UBQP with matrix ^Q2 (construct by "+/-i") and a toy UBQP with matrix ^Q3 (construct randomly). We first assess the landscape flatness of the three toy UBQPs. Subsequently, we test the efficiency of LSILS with different toy UBQPs. Results reveal that the toy UBQP with ^Q1 (construct by "+/-1") exhibits the flattest landscape among the three, while the toy UBQP with ^Q3 (construct randomly) presents the most non-flat landscape. Notably, LSILS using the toy UBQP with ^Q2 (construct by "+/-i") emerges as the most effective, while ^Q3 (construct randomly) has the poorest result. These findings contribute to a detailed understanding of landscape smoothing techniques in optimizing UBQP.
CLFeb 3
MIRROR: A Multi-Agent Framework with Iterative Adaptive Revision and Hierarchical Retrieval for Optimization Modeling in Operations ResearchYifan Shi, Jialong Shi, Jiayi Wang et al.
Operations Research (OR) relies on expert-driven modeling-a slow and fragile process ill-suited to novel scenarios. While large language models (LLMs) can automatically translate natural language into optimization models, existing approaches either rely on costly post-training or employ multi-agent frameworks, yet most still lack reliable collaborative error correction and task-specific retrieval, often leading to incorrect outputs. We propose MIRROR, a fine-tuning-free, end-to-end multi-agent framework that directly translates natural language optimization problems into mathematical models and solver code. MIRROR integrates two core mechanisms: (1) execution-driven iterative adaptive revision for automatic error correction, and (2) hierarchical retrieval to fetch relevant modeling and coding exemplars from a carefully curated exemplar library. Experiments show that MIRROR outperforms existing methods on standard OR benchmarks, with notable results on complex industrial datasets such as IndustryOR and Mamo-ComplexLP. By combining precise external knowledge infusion with systematic error correction, MIRROR provides non-expert users with an efficient and reliable OR modeling solution, overcoming the fundamental limitations of general-purpose LLMs in expert optimization tasks.
AIJan 29
TIDE: Tuning-Integrated Dynamic Evolution for LLM-Based Automated Heuristic DesignChentong Chen, Mengyuan Zhong, Ye Fan et al.
Although Large Language Models have advanced Automated Heuristic Design, treating algorithm evolution as a monolithic text generation task overlooks the coupling between discrete algorithmic structures and continuous numerical parameters. Consequently, existing methods often discard promising algorithms due to uncalibrated constants and suffer from premature convergence resulting from simple similarity metrics. To address these limitations, we propose TIDE, a Tuning-Integrated Dynamic Evolution framework designed to decouple structural reasoning from parameter optimization. TIDE features a nested architecture where an outer parallel island model utilizes Tree Similarity Edit Distance to drive structural diversity, while an inner loop integrates LLM-based logic generation with a differential mutation operator for parameter tuning. Additionally, a UCB-based scheduler dynamically prioritizes high-yield prompt strategies to optimize resource allocation. Extensive experiments across nine combinatorial optimization problems demonstrate that TIDE discovers heuristics that significantly outperform state-of-the-art baselines in solution quality while achieving improved search efficiency and reduced computational costs.
AIApr 28, 2025
Fitness Landscape of Large Language Model-Assisted Automated Algorithm SearchFei Liu, Qingfu Zhang, Jialong Shi et al.
Using Large Language Models (LLMs) in an evolutionary or other iterative search framework have demonstrated significant potential in automated algorithm design. However, the underlying fitness landscape, which is critical for understanding its search behavior, remains underexplored. In this paper, we illustrate and analyze the fitness landscape of LLM-assisted Algorithm Search (LAS) using a graph-based approach, where nodes represent algorithms and edges denote transitions between them. We conduct extensive evaluations across six algorithm design tasks and six commonly-used LLMs. Our findings reveal that LAS landscapes are highly multimodal and rugged, particularly in combinatorial optimization tasks, with distinct structural variations across tasks and LLMs. Moreover, we adopt four different methods for algorithm similarity measurement and study their correlations to algorithm performance and operator behaviour. These insights not only deepen our understanding of LAS landscapes but also provide practical insights for designing more effective LAS methods.
OCJan 6, 2024
A New Parallel Cooperative Landscape Smoothing Algorithm and Its Applications on TSP and UBQPWei Wang, Jialong Shi, Jianyong Sun et al.
Combinatorial optimization problem (COP) is difficult to solve because of the massive number of local optimal solutions in his solution space. Various methods have been put forward to smooth the solution space of COPs, including homotopic convex (HC) transformation for the traveling salesman problem (TSP). This paper extends the HC transformation approach to unconstrained binary quadratic programming (UBQP) by proposing a method to construct a unimodal toy UBQP of any size. We theoretically prove the unimodality of the constructed toy UBQP. After that, we apply this unimodal toy UBQP to smooth the original UBQP by using the HC transformation framework and empirically verify the smoothing effects. Subsequently, we introduce an iterative algorithmic framework incorporating HC transformation, referred as landscape smoothing iterated local search (LSILS). Our experimental analyses, conducted on various UBQP instances show the effectiveness of LSILS. Furthermore, this paper proposes a parallel cooperative variant of LSILS, denoted as PC-LSILS and apply it to both the UBQP and the TSP. Our experimental findings highlight that PC-LSILS improves the smoothing performance of the HC transformation, and further improves the overall performance of the algorithm.
AIAug 18, 2025
HiFo-Prompt: Prompting with Hindsight and Foresight for LLM-based Automatic Heuristic DesignChentong Chen, Mengyuan Zhong, Jianyong Sun et al.
LLM-based Automatic Heuristic Design (AHD) within Evolutionary Computation (EC) frameworks has shown promising results. However, its effectiveness is hindered by the use of static operators and the lack of knowledge accumulation mechanisms. We introduce HiFo-Prompt, a framework that guides LLMs with two synergistic prompting strategies: Foresight and Hindsight. Foresight-based prompts adaptively steer the search based on population dynamics, managing the exploration-exploitation trade-off. In addition, hindsight-based prompts mimic human expertise by distilling successful heuristics from past generations into fundamental, reusable design principles. This dual mechanism transforms transient discoveries into a persistent knowledge base, enabling the LLM to learn from its own experience. Empirical results demonstrate that HiFo-Prompt significantly outperforms state-of-the-art LLM-based AHD methods, generating higher-quality heuristics while achieving substantially faster convergence and superior query efficiency.
NENov 12, 2019
Multi-objectivization Inspired Metaheuristics for the Sum-of-the-Parts Combinatorial Optimization ProblemsJialong Shi, Jianyong Sun, Qingfu Zhang
Multi-objectivization is a term used to describe strategies developed for optimizing single-objective problems by multi-objective algorithms. This paper focuses on multi-objectivizing the sum-of-the-parts combinatorial optimization problems, which include the traveling salesman problem, the unconstrained binary quadratic programming and other well-known combinatorial optimization problem. For a sum-of-the-parts combinatorial optimization problem, we propose to decompose its original objective into two sub-objectives with controllable correlation. Based on the decomposition method, two new multi-objectivization inspired single-objective optimization techniques called non-dominance search and non-dominance exploitation are developed, respectively. Non-dominance search is combined with two metaheuristics, namely iterated local search and iterated tabu search, while non-dominance exploitation is embedded within the iterated Lin-Kernighan metaheuristic. The resultant metaheuristics are called ILS+NDS, ITS+NDS and ILK+NDE, respectively. Empirical studies on some TSP and UBQP instances show that with appropriate correlation between the sub-objectives, there are more chances to escape from local optima when new starting solution is selected from the non-dominated solutions defined by the decomposed sub-objectives. Experimental results also show that ILS+NDS, ITS+NDS and ILK+NDE all significantly outperform their counterparts on most of the test instances.
NEMay 14, 2019
Homotopic Convex Transformation: A New Landscape Smoothing Method for the Traveling Salesman ProblemJialong Shi, Jianyong Sun, Qingfu Zhang et al.
This paper proposes a novel landscape smoothing method for the symmetric Traveling Salesman Problem (TSP). We first define the Homotopic Convex (HC) transformation of a TSP as a convex combination of a well-constructed simple TSP and the original TSP. The simple TSP, called the convex-hull TSP, is constructed by transforming a known local or global optimum. We observe that controlled by the coefficient of the convex combination, with local or global optimum, (i) the landscape of the HC transformed TSP is smoothed in terms that its number of local optima is reduced compared to the original TSP; (ii) the fitness distance correlation of the HC transformed TSP is increased. Further, we observe that the smoothing effect of the HC transformation depends highly on the quality of the used optimum. A high-quality optimum leads to a better smoothing effect than a low-quality optimum. We then propose an iterative algorithmic framework in which the proposed HC transformation is combined within a heuristic TSP solver. It works as an escaping scheme from local optima aiming to improve the global search ability of the combined heuristic. Case studies using the 3-Opt and the Lin-Kernighan local search as the heuristic solver show that the resultant algorithms significantly outperform their counterparts and two other smoothing-based TSP heuristic solvers on most of the test instances with up to 20,000 cities.
AISep 22, 2017
EB-GLS: An Improved Guided Local Search Based on the Big Valley StructureJialong Shi, Qingfu Zhang, Edward Tsang
Local search is a basic building block in memetic algorithms. Guided Local Search (GLS) can improve the efficiency of local search. By changing the guide function, GLS guides a local search to escape from locally optimal solutions and find better solutions. The key component of GLS is its penalizing mechanism which determines which feature is selected to penalize when the search is trapped in a locally optimal solution. The original GLS penalizing mechanism only makes use of the cost and the current penalty value of each feature. It is well known that many combinatorial optimization problems have a big valley structure, i.e., the better a solution is, the more the chance it is closer to a globally optimal solution. This paper proposes to use big valley structure assumption to improve the GLS penalizing mechanism. An improved GLS algorithm called Elite Biased GLS (EB-GLS) is proposed. EB-GLS records and maintains an elite solution as an estimate of the globally optimal solutions, and reduces the chance of penalizing the features in this solution. We have systematically tested the proposed algorithm on the symmetric traveling salesman problem. Experimental results show that EB-GLS is significantly better than GLS.