Reducing Instability in Synthetic Data Evaluation with a Super-Metric in MalDataGen
This work addresses a domain-specific challenge for researchers and practitioners in Android malware detection by providing a more reliable evaluation tool. The contribution is incremental, since it builds on existing metrics rather than proposing new ones.
The paper tackles the unstable and non-standardized evaluation of synthetic Android malware data by introducing a Super-Metric that aggregates eight metrics into a single weighted score. In experiments with ten generative models and five datasets, the Super-Metric is more stable and consistent than traditional metrics and correlates more strongly with classifier performance.
Evaluating the quality of synthetic data remains a persistent challenge in the Android malware domain due to instability and the lack of standardization among existing metrics. This work integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions, producing a single weighted score. Experiments involving ten generative models and five balanced datasets demonstrate that the Super-Metric is more stable and consistent than traditional metrics, exhibiting stronger correlations with the actual performance of classifiers.
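To make the aggregation idea concrete, the following is a minimal sketch of a weighted score computed over metrics grouped by dimension. It is not the MalDataGen implementation: the function name `super_metric`, the placeholder dimension names, the assumption that every metric is already normalized to [0, 1] with higher meaning better, and the choice to average metrics within a dimension before weighting are all illustrative assumptions. The text above does not specify the actual metrics, weights, or normalization.

```python
from typing import Dict, List


def super_metric(scores: Dict[str, List[float]], weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension mean scores (hypothetical scheme).

    scores  -- maps a dimension name to its metric values, each assumed
               to be normalized to [0, 1] with higher meaning better.
    weights -- maps the same dimension names to non-negative weights.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")

    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    weighted_sum = 0.0
    for dimension, values in scores.items():
        if not values:
            raise ValueError(f"dimension {dimension!r} has no metric values")
        # Assumption: metrics inside one dimension contribute equally.
        dimension_mean = sum(values) / len(values)
        weighted_sum += weights[dimension] * dimension_mean

    # Dividing by the total weight keeps the result in [0, 1]
    # even when the weights do not sum to one.
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Placeholder values for illustration only, not experimental results.
    example_scores = {
        "dimension_a": [0.8, 0.6],
        "dimension_b": [1.0, 0.5],
        "dimension_c": [0.4, 0.6],
        "dimension_d": [0.9, 0.7],
    }
    example_weights = {name: 0.25 for name in example_scores}
    print(super_metric(example_scores, example_weights))
```

Collapsing several metrics into one number makes generative models directly comparable, but the ranking then depends on the chosen weights, so any such scheme should report them alongside the score.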