Rafał Skinderowicz

4papers

186citations

Novelty53%

AI Score26

Ranked #166,408 of 201,326 authors (top 83%)#638 in NE (top 56%)

4 Papers

NEMar 4, 2022

Improving Ant Colony Optimization Efficiency for Solving Large TSP Instances

Rafał Skinderowicz

Ant Colony Optimization (ACO) is a family of nature-inspired metaheuristics often applied to finding approximate solutions to difficult optimization problems. Despite being significantly faster than exact methods, the ACOs can still be prohibitively slow, especially if compared to basic problem-specific heuristics. As recent research has shown, it is possible to significantly improve the performance through algorithm refinements and careful parallel implementation benefiting from multi-core CPUs and dedicated accelerators. In this paper, we present a novel ACO variant, namely the Focused ACO (FACO). One of the core elements of the FACO is a mechanism for controlling the number of differences between a newly constructed and a selected previous solution. The mechanism results in a more focused search process, allowing to find improvements while preserving the quality of the existing solution. An additional benefit is a more efficient integration with a problem-specific local search. Computational study based on a range of the Traveling Salesman Problem instances shows that the FACO outperforms the state-of-the-art ACOs when solving large TSP instances. Specifically, the FACO required less than an hour of an 8-core commodity CPU time to find high-quality solutions (within 1% from the best-known results) for TSP Art Instances ranging from 100000 to 200000 nodes.

NEJan 18, 2020

Implementing a GPU-based parallel MAX-MIN Ant System

Rafał Skinderowicz

The MAX-MIN Ant System (MMAS) is one of the best-known Ant Colony Optimization (ACO) algorithms proven to be efficient at finding satisfactory solutions to many difficult combinatorial optimization problems. The slow-down in Moore's law, and the availability of graphics processing units (GPUs) capable of conducting general-purpose computations at high speed, has sparked considerable research efforts into the development of GPU-based ACO implementations. In this paper, we discuss a range of novel ideas for improving the GPU-based parallel MMAS implementation, allowing it to better utilize the computing power offered by two subsequent Nvidia GPU architectures. Specifically, based on the weighted reservoir sampling algorithm we propose a novel parallel implementation of the node selection procedure, which is at the heart of the MMAS and other ACO algorithms. We also present a memory-efficient implementation of another key-component -- the tabu list structure -- which is used in the ACO's solution construction stage. The proposed implementations, combined with the existing approaches, lead to a total of six MMAS variants, which are evaluated on a set of Traveling Salesman Problem (TSP) instances ranging from 198 to 3,795 cities. The results show that our MMAS implementation is competitive with state-of-the-art GPU-based and multi-core CPU-based parallel ACO implementations: in fact, the times obtained for the Nvidia V100 Volta GPU were up to 7.18x and 21.79x smaller, respectively. The fastest of the proposed MMAS variants is able to generate over 1 million candidate solutions per second when solving a 1,002-city instance. Moreover, we show that, combined with the 2-opt local search heuristic, the proposed parallel MMAS finds high-quality solutions for the TSP instances with up to 18,512 nodes.

AIMay 2, 2017

An improved Ant Colony System for the Sequential Ordering Problem

Rafał Skinderowicz

It is not rare that the performance of one metaheuristic algorithm can be improved by incorporating ideas taken from another. In this article we present how Simulated Annealing (SA) can be used to improve the efficiency of the Ant Colony System (ACS) and Enhanced ACS when solving the Sequential Ordering Problem (SOP). Moreover, we show how the very same ideas can be applied to improve the convergence of a dedicated local search, i.e. the SOP-3-exchange algorithm. A statistical analysis of the proposed algorithms both in terms of finding suitable parameter values and the quality of the generated solutions is presented based on a series of computational experiments conducted on SOP instances from the well-known TSPLIB and SOPLIB2006 repositories. The proposed ACS-SA and EACS-SA algorithms often generate solutions of better quality than the ACS and EACS, respectively. Moreover, the EACS-SA algorithm combined with the proposed SOP-3-exchange-SA local search was able to find 10 new best solutions for the SOP instances from the SOPLIB2006 repository, thus improving the state-of-the-art results as known from the literature. Overall, the best known or improved solutions were found in 41 out of 48 cases.

DCMay 9, 2016

The GPU-based Parallel Ant Colony System

Rafał Skinderowicz

The Ant Colony System (ACS) is, next to Ant Colony Optimization (ACO) and the MAX-MIN Ant System (MMAS), one of the most efficient metaheuristic algorithms inspired by the behavior of ants. In this article we present three novel parallel versions of the ACS for the graphics processing units (GPUs). To the best of our knowledge, this is the first such work on the ACS which shares many key elements of the ACO and the MMAS, but differences in the process of building solutions and updating the pheromone trails make obtaining an efficient parallel version for the GPUs a difficult task. The proposed parallel versions of the ACS differ mainly in their implementations of the pheromone memory. The first two use the standard pheromone matrix, and the third uses a novel selective pheromone memory. Computational experiments conducted on several Travelling Salesman Problem (TSP) instances of sizes ranging from 198 to 2392 cities showed that the parallel ACS on Nvidia Kepler GK104 GPU (1536 CUDA cores) is able to obtain a speedup up to 24.29x vs the sequential ACS running on a single core of Intel Xeon E5-2670 CPU. The parallel ACS with the selective pheromone memory achieved speedups up to 16.85x, but in most cases the obtained solutions were of significantly better quality than for the sequential ACS.