Speeding Up OPFython with Numba
This work addresses the performance issues that OPFython users face when handling large datasets, but it is incremental, as it applies an existing optimization tool to a specific algorithm.
The paper tackled the slow performance of the Python-based Optimum-Path Forest (OPF) implementation by using Numba to accelerate its NumPy-based calculations, which outperformed the naive Python version and sped up its distance calculation.
A graph-inspired classifier, known as Optimum-Path Forest (OPF), has proven to be a state-of-the-art algorithm comparable to Logistic Regressors and Support Vector Machines in a wide variety of tasks. Recently, its Python-based version, denoted as OPFython, has been proposed to provide a friendlier framework and a faster prototyping environment. Nevertheless, Python-based algorithms are slower than their C-based counterparts, which hurts their performance when confronted with large amounts of data. Therefore, this paper proposed a simple yet highly efficient speed-up using the Numba package, which accelerates NumPy-based calculations and aims to increase the algorithm's overall performance. Experimental results showed that the proposed approach achieved better results than the naïve Python-based OPF and sped up its distance measurement calculation.
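To illustrate the kind of speed-up described above, the following is a minimal sketch of a Numba-compiled distance calculation. It is not OPFython's actual code: the function names `euclidean_distance` and `pairwise_distances` are hypothetical, and the Euclidean metric is assumed only for illustration. The fallback decorator is also an assumption, added so the sketch still runs (uncompiled) where Numba is not installed.

```python
import math

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: without Numba, run the same code uncompiled.
    def njit(func):
        return func


@njit
def euclidean_distance(x, y):
    """Euclidean distance between two 1-D feature vectors (hypothetical helper)."""
    total = 0.0
    for i in range(x.shape[0]):
        diff = x[i] - y[i]
        total += diff * diff
    return math.sqrt(total)


@njit
def pairwise_distances(samples):
    """Symmetric matrix of distances between all rows of a 2-D array."""
    n = samples.shape[0]
    out = np.zeros((n, n), dtype=np.float64)
    for i in range(n):
        # Fill only the upper triangle and mirror it, since d(i, j) == d(j, i).
        for j in range(i + 1, n):
            d = euclidean_distance(samples[i], samples[j])
            out[i, j] = d
            out[j, i] = d
    return out


if __name__ == "__main__":
    points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
    print(pairwise_distances(points))
```

The explicit loops are deliberate: Numba compiles them to machine code on the first call, so the first invocation includes compilation time and only later calls reflect the accelerated speed. Any timing comparison against a pure-Python version should therefore exclude that first call.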