A natural approach to studying schema processing
This work addresses a theoretical bottleneck in evolutionary computation by providing tools for analyzing schema processing, although its contribution to the understanding of GAs is incremental.
The authors study schema processing in Genetic Algorithms (GAs) by developing a mathematical method that identifies all schemata present in a population, which yields a natural way to test the Building Block Hypothesis. They find that, for most crossover methods, approximately 25-35% of the building blocks in a generation result from combining the previous generation's building blocks, but that increased combination does not improve GA efficiency.
The Building Block Hypothesis (BBH) states that adaptive systems combine good partial solutions (so-called building blocks) to find increasingly better solutions. It is thought that Genetic Algorithms (GAs) implement the BBH. However, for GAs, building blocks are semi-theoretical objects in that they are thought to be exploited only implicitly, via the selection and crossover operations of a GA. In the current work, we discover a mathematical method to identify the complete set of schemata present in a given population of a GA; this reveals a natural way to study schema processing (and thus the BBH). We demonstrate how this approach can be used both theoretically and experimentally. Theoretically, we show that the search space for good schemata is a complete lattice and that each generation samples a complete sub-lattice of this search space. In addition, we show that combining schemata can only explore a subset of the search space. Experimentally, we compare how well different crossover methods combine building blocks. We find that, for most crossover methods, approximately 25-35% of building blocks in a generation result from the combination of the previous generation's building blocks. We also find that an increase in the combination of building blocks does not lead to an increase in the efficiency of a GA. To complement this article, we introduce an open-source Python package called schematax, which allows one to calculate the schemata present in a population using the methods described in this article.
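To make the idea of "the schemata present in a population" concrete, the following is a minimal sketch under stated assumptions, not the authors' method and not the schematax API. It assumes fixed-length strings, the usual `*` wildcard, and that the schema shared by two strings keeps every position where they agree and puts a wildcard elsewhere. The function names `meet` and `schemata_closure` are hypothetical and chosen only for illustration. Under these assumptions, the schemata of a population are obtained by closing the population under this pairwise operation.

```python
WILDCARD = "*"


def meet(a: str, b: str) -> str:
    """Most specific schema matching both a and b (assumed definition).

    Positions where a and b agree are kept; all others become wildcards.
    """
    if len(a) != len(b):
        raise ValueError("schemata must have equal length")
    return "".join(x if x == y else WILDCARD for x, y in zip(a, b))


def schemata_closure(population):
    """Close a population of equal-length strings under `meet`.

    Illustrative only: the result is every schema shared by some
    non-empty subset of the population, under the assumptions above.
    """
    schemata = set(population)
    frontier = set(population)
    while frontier:
        new = set()
        # Combine each recently added schema with everything found so far.
        for s in frontier:
            for t in schemata:
                m = meet(s, t)
                if m not in schemata:
                    new.add(m)
        schemata |= new
        frontier = new
    return schemata


if __name__ == "__main__":
    for schema in sorted(schemata_closure(["110", "100", "011"])):
        print(schema)
```

For the three-string population in the usage example, the closure contains the three individuals plus `1*0`, `*1*`, and `***`. The loop only ever adds schemata, and there are finitely many schemata of a given length, so it terminates; the number of schemata can still grow quickly with population size, so this sketch is intended for small examples.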