Designing Complex Experiments by Applying Unsupervised Machine Learning
This work addresses the problem of creating economically and operationally sound experimental designs for researchers and practitioners in fields with complex experiments, though it is incremental in applying existing ML methods to a specific domain.
The paper tackles the challenge of designing complex experiments with many factors and constraints by applying a beta variational autoencoder (beta-VAE) to reduce the number of required trials from a full factorial design, achieving a significant decrease while maintaining objectives.
Design of experiments (DOE) is playing an essential role in learning and improving a variety of objects and processes. The article discusses the application of unsupervised machine learning to support the pragmatic designs of complex experiments. Complex experiments are characterized by having a large number of factors, mixed-level designs, and may be subject to constraints that eliminate some unfeasible trials for various reasons. Having such attributes, it is very challenging to design pragmatic experiments that are economically, operationally, and timely sound. It means a significant decrease in the number of required trials from a full factorial design, while still attempting to achieve the defined objectives. A beta variational autoencoder (beta-VAE) has been applied to represent trials of the initial full factorial design after filtering out unfeasible trials on the low dimensional latent space. Regarding visualization and interpretability, the paper is limited to 2D representations. Beta-VAE supports (1) orthogonality of the latent space dimensions, (2) isotropic multivariate standard normal distribution of the representation on the latent space, (3) disentanglement of the latent space representation by levels of factors, (4) propagation of the applied constraints of the initial design into the latent space, and (5) generation of trials by decoding latent space points. Having an initial design representation on the latent space with such properties, it allows for the generation of pragmatic design of experiments (G-DOE) by specifying the number of trials and their pattern on the latent space, such as square or polar grids. Clustering and aggregated gradient metrics have been shown to guide grid specification.