Jiří Kubalík

LG
h-index24
7papers
158citations
Novelty51%
AI Score31

7 Papers

LGMay 31, 2022
SymFormer: End-to-end symbolic regression using transformer-based architecture

Martin Vastl, Jonáš Kulhánek, Jiří Kubalík et al.

Many real-world problems can be naturally described by mathematical formulas. The task of finding formulas from a set of observed inputs and outputs is called symbolic regression. Recently, neural networks have been applied to symbolic regression, among which the transformer-based ones seem to be the most promising. After training the transformer on a large number of formulas (in the order of days), the actual inference, i.e., finding a formula for new, unseen data, is very fast (in the order of seconds). This is considerably faster than state-of-the-art evolutionary methods. The main drawback of transformers is that they generate formulas without numerical constants, which have to be optimized separately, so yielding suboptimal results. We propose a transformer-based approach called SymFormer, which predicts the formula by outputting the individual symbols and the corresponding constants simultaneously. This leads to better performance in terms of fitting the available data. In addition, the constants provided by SymFormer serve as a good starting point for subsequent tuning via gradient descent to further improve the performance. We show on a set of benchmarks that SymFormer outperforms two state-of-the-art methods while having faster inference.

NEFeb 1, 2023
Toward Physically Plausible Data-Driven Models: A Novel Neural Network Approach to Symbolic Regression

Jiří Kubalík, Erik Derner, Robert Babuška

Many real-world systems can be described by mathematical models that are human-comprehensible, easy to analyze and help explain the system's behavior. Symbolic regression is a method that can automatically generate such models from data. Historically, symbolic regression has been predominantly realized by genetic programming, a method that evolves populations of candidate solutions that are subsequently modified by genetic operators crossover and mutation. However, this approach suffers from several deficiencies: it does not scale well with the number of variables and samples in the training data - models tend to grow in size and complexity without an adequate accuracy gain, and it is hard to fine-tune the model coefficients using just genetic operators. Recently, neural networks have been applied to learn the whole analytic model, i.e., its structure and the coefficients, using gradient-based optimization algorithms. This paper proposes a novel neural network-based symbolic regression method that constructs physically plausible models based on even very small training data sets and prior knowledge about the system. The method employs an adaptive weighting scheme to effectively deal with multiple loss function terms and an epoch-wise learning process to reduce the chance of getting stuck in poor local optima. Furthermore, we propose a parameter-free method for choosing the model with the best interpolation and extrapolation performance out of all the models generated throughout the whole learning process. We experimentally evaluate the approach on four test systems: the TurtleBot 2 mobile robot, the magnetic manipulation system, the equivalent resistance of two resistors in parallel, and the longitudinal force of the anti-lock braking system. The results clearly show the potential of the method to find parsimonious models that comply with the prior knowledge provided.

NEApr 23, 2025
Neuro-Evolutionary Approach to Physics-Aware Symbolic Regression

Jiří Kubalík, Robert Babuška

Symbolic regression is a technique that can automatically derive analytic models from data. Traditionally, symbolic regression has been implemented primarily through genetic programming that evolves populations of candidate solutions sampled by genetic operators, crossover and mutation. More recently, neural networks have been employed to learn the entire analytical model, i.e., its structure and coefficients, using regularized gradient-based optimization. Although this approach tunes the model's coefficients better, it is prone to premature convergence to suboptimal model structures. Here, we propose a neuro-evolutionary symbolic regression method that combines the strengths of evolutionary-based search for optimal neural network (NN) topologies with gradient-based tuning of the network's parameters. Due to the inherent high computational demand of evolutionary algorithms, it is not feasible to learn the parameters of every candidate NN topology to full convergence. Thus, our method employs a memory-based strategy and population perturbations to enhance exploitation and reduce the risk of being trapped in suboptimal NNs. In this way, each NN topology can be trained using only a short sequence of backpropagation iterations. The proposed method was experimentally evaluated on three real-world test problems and has been shown to outperform other NN-based approaches regarding the quality of the models obtained.

NEMar 5, 2025
Towards Resilient and Sustainable Global Industrial Systems: An Evolutionary-Based Approach

Václav Jirkovský, Jiří Kubalík, Petr Kadera et al.

This paper presents a new complex optimization problem in the field of automatic design of advanced industrial systems and proposes a hybrid optimization approach to solve the problem. The problem is multi-objective as it aims at finding solutions that minimize CO2 emissions, transportation time, and costs. The optimization approach combines an evolutionary algorithm and classical mathematical programming to design resilient and sustainable global manufacturing networks. Further, it makes use of the OWL ontology for data consistency and constraint management. The experimental validation demonstrates the effectiveness of the approach in both single and double sourcing scenarios. The proposed methodology, in general, can be applied to any industry case with complex manufacturing and supply chain challenges.

ROJul 20, 2020
An Integrated Approach to Goal Selection in Mobile Robot Exploration

Miroslav Kulich, Jiří Kubalík, Libor Přeučil

This paper deals with the problem of autonomous navigation of a mobile robot in an unknown 2D environment to fully explore the environment as efficiently as possible. We assume a terrestrial mobile robot equipped with a ranging sensor with a limited range and 360 degrees field of view. The key part of the exploration process is formulated as the d-Watchman Route Problem which consists of two coupled tasks - candidate goals generation and finding an optimal path through a subset of goals - which are solved in each exploration step. The latter has been defined as a constrained variant of the Generalized Traveling Salesman Problem and solved using an evolutionary algorithm. An evolutionary algorithm that uses an indirect representation and the nearest neighbor based constructive procedure was proposed to solve this problem. Individuals evolved in this evolutionary algorithm do not directly code the solutions to the problem. Instead, they represent sequences of instructions to construct a feasible solution. The problems with efficiently generating feasible solutions typically arising when applying traditional evolutionary algorithms to constrained optimization problems are eliminated this way. The proposed exploration framework was evaluated in a simulated environment on three maps and the time needed to explore the whole environment was compared to state-of-the-art exploration methods. Experimental results show that our method outperforms the compared ones in environments with a low density of obstacles by up to 12.5%, while it is slightly worse in office-like environments by 4.5% at maximum. The framework has also been deployed on a real robot to demonstrate the applicability of the proposed solution with real hardware.

LGMar 27, 2019
Constructing Parsimonious Analytic Models for Dynamic Systems via Symbolic Regression

Erik Derner, Jiří Kubalík, Nicola Ancona et al.

Developing mathematical models of dynamic systems is central to many disciplines of engineering and science. Models facilitate simulations, analysis of the system's behavior, decision making and design of automatic control algorithms. Even inherently model-free control techniques such as reinforcement learning (RL) have been shown to benefit from the use of models, typically learned online. Any model construction method must address the tradeoff between the accuracy of the model and its complexity, which is difficult to strike. In this paper, we propose to employ symbolic regression (SR) to construct parsimonious process models described by analytic equations. We have equipped our method with two different state-of-the-art SR algorithms which automatically search for equations that fit the measured data: Single Node Genetic Programming (SNGP) and Multi-Gene Genetic Programming (MGGP). In addition to the standard problem formulation in the state-space domain, we show how the method can also be applied to input-output models of the NARX (nonlinear autoregressive with exogenous input) type. We present the approach on three simulated examples with up to 14-dimensional state space: an inverted pendulum, a mobile robot, and a bipedal walking robot. A comparison with deep neural networks and local linear regression shows that SR in most cases outperforms these commonly used alternative methods. We demonstrate on a real pendulum system that the analytic model found enables a RL controller to successfully perform the swing-up task, based on a model constructed from only 100 data samples.

LGMar 22, 2019
Symbolic Regression Methods for Reinforcement Learning

Jiří Kubalík, Erik Derner, Jan Žegklitz et al.

Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.