REDS: Rule Extraction for Discovering Scenarios
REDS addresses the challenge of efficient scenario discovery in data spaces produced by simulations, particularly when each simulation is computationally expensive. The contribution appears incremental, since it builds on existing subgroup discovery methods.
The paper tackles the problem of scenario discovery from simulations by proposing REDS, a new procedure that uses an intermediate machine learning model to label data for subgroup discovery, reducing the number of required simulations by 50-75% on average in experiments.
Scenario discovery is the process of finding areas of interest, known as scenarios, in data spaces resulting from simulations. For instance, one might search for conditions, i.e., inputs of the simulation model, where the system is unstable. Subgroup discovery methods are commonly used for scenario discovery. They find scenarios in the form of hyperboxes, which are easy to comprehend. Given a computational budget, results tend to get worse as the number of inputs of the simulation model and the cost of simulations increase. We propose a new procedure for scenario discovery from few simulations, dubbed REDS. A key ingredient is using an intermediate machine learning model to label data for subsequent use by conventional subgroup discovery methods. We provide statistical arguments why this is an improvement. In our experiments, REDS reduces the number of simulations required by 50-75% on average, depending on the quality measure. It is also useful as a semi-supervised subgroup discovery method and for discovering better scenarios from third-party data, when a simulation model is not available.
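The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. Everything named here is an assumption made for illustration: `simulate` is a hypothetical stand-in for an expensive simulation model, a k-nearest-neighbour vote stands in for the intermediate machine learning model, and a simplified peeling routine in the spirit of PRIM stands in for the conventional subgroup discovery method. Inputs are assumed to lie in the unit square.

```python
import random

def simulate(x):
    """Hypothetical stand-in for an expensive simulation: 1 = interesting outcome."""
    return 1 if x[0] > 0.6 and x[1] > 0.6 else 0

def knn_label(train_X, train_y, x, k=5):
    """Intermediate model (an assumption): majority vote of the k nearest neighbours."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    return 1 if 2 * sum(y for _, y in nearest) > k else 0

def peel(X, y, alpha=0.05, min_support=0.05):
    """Simplified PRIM-style peeling; returns a hyperbox as [[low, high], ...]."""
    dims = range(len(X[0]))
    box = [[0.0, 1.0] for _ in dims]
    inside = list(range(len(X)))
    while len(inside) > max(1, min_support * len(X)):
        current = sum(y[i] for i in inside) / len(inside)
        best = None  # (mean label, dimension, side, bound, kept indices)
        for d in dims:
            vals = sorted(X[i][d] for i in inside)
            cut = max(1, int(alpha * len(vals)))
            # Candidate peels: drop the lowest or the highest `cut` points in dimension d.
            for side, bound in ((0, vals[cut]), (1, vals[-cut - 1])):
                if side == 0:
                    keep = [i for i in inside if X[i][d] >= bound]
                else:
                    keep = [i for i in inside if X[i][d] <= bound]
                mean = sum(y[i] for i in keep) / len(keep)
                if best is None or mean > best[0]:
                    best = (mean, d, side, bound, keep)
        if best[0] <= current:
            break  # no candidate peel raises the share of interesting points
        _, d, side, bound, inside = best
        box[d][side] = bound
    return box

def reds_sketch(n_sim=200, n_new=2000, seed=0):
    rng = random.Random(seed)
    # Step 1: run only a few expensive simulations.
    sims = [[rng.random(), rng.random()] for _ in range(n_sim)]
    labels = [simulate(x) for x in sims]
    # Step 2: sample many new inputs and label them with the intermediate model.
    new_X = [[rng.random(), rng.random()] for _ in range(n_new)]
    new_y = [knn_label(sims, labels, x) for x in new_X]
    # Step 3: run subgroup discovery on the model-labelled data.
    return peel(new_X, new_y)

box = reds_sketch()
print(box)
```

The point of the sketch is the data flow: the subgroup discovery step sees 2000 model-labelled points even though only 200 simulations were run. The sample sizes, `alpha`, and `min_support` are arbitrary illustrative values, not settings from the paper.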