NA · May 25, 2016
Many physical laws are ridge functions
Paul G. Constantine, Zachary del Rosario, Gianluca Iaccarino
A ridge function is a function of several variables that is constant along certain directions in its domain. Using classical dimensional analysis, we show that many physical laws are ridge functions; this fact yields insight into the structure of physical laws and motivates further study into ridge functions and their properties. We also connect dimensional analysis to modern subspace-based techniques for dimension reduction, including active subspaces in deterministic approximation and sufficient dimension reduction in statistical regression.
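As a minimal sketch of the definition above, the snippet below builds a ridge function of the form f(x) = g(a · x) and checks that it does not change when the input moves along a direction orthogonal to a. The profile g = tanh, the vector `a`, and the test points are hypothetical choices made for illustration; none of them comes from the paper.

```python
import math

# Hypothetical direction vector; any nonzero vector would do.
A = (1.0, 2.0, -1.0)

def ridge(x, a=A):
    """Ridge function f(x) = g(a . x), with g = tanh chosen for illustration."""
    t = sum(ai * xi for ai, xi in zip(a, x))
    return math.tanh(t)

# A base point and a direction w orthogonal to A:
# 1*2 + 2*(-1) + (-1)*0 = 0.
x = (0.3, -0.1, 0.5)
w = (2.0, -1.0, 0.0)

# Move x along w; the inner product a . x is unchanged, so f is too.
shifted = tuple(xi + 0.7 * wi for xi, wi in zip(x, w))

difference = abs(ridge(x) - ridge(shifted))
print(difference)  # zero up to floating-point rounding
```

The function depends on its three inputs only through the single combination a · x, which is what makes it constant along every direction orthogonal to a.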
ML · Nov 6, 2019
Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization
Zachary del Rosario, Matthias Rupp, Yoolhee Kim et al.
Discovering novel materials can be greatly accelerated by iterative machine learning-informed proposal of candidates, known as active learning. However, standard global-scope error metrics for model quality are not predictive of discovery performance, and can be misleading. We introduce the notion of Pareto shell-scope error to help judge the suitability of a model for proposing material candidates. Further, through synthetic cases and a thermoelectric dataset, we probe the relation between acquisition function fidelity and active learning performance. Results suggest novel diagnostic tools, as well as new insights for acquisition function design.
NA · Aug 4, 2017
Data-driven dimensional analysis: algorithms for unique and relevant dimensionless groups
Paul G. Constantine, Zachary del Rosario, Gianluca Iaccarino
Classical dimensional analysis has two limitations: (i) the computed dimensionless groups are not unique, and (ii) the analysis does not measure relative importance of the dimensionless groups. We propose two algorithms for estimating unique and relevant dimensionless groups assuming the experimenter can control the system's independent variables and evaluate the corresponding dependent variable; e.g., computer experiments provide such a setting. The first algorithm is based on a response surface constructed from a set of experiments. The second algorithm uses many experiments to estimate finite differences over a range of the independent variables. Both algorithms are semi-empirical because they use experimental data to complement the dimensional analysis. We derive the algorithms by combining classical semi-empirical modeling with active subspaces, which, given a probability density on the independent variables, yield unique and relevant dimensionless groups. The connection between active subspaces and dimensional analysis also reveals that all empirical models are ridge functions, which are functions that are constant along low-dimensional subspaces in their domain. We demonstrate the proposed algorithms on the well-studied example of viscous pipe flow, in both the turbulent and laminar cases. The results include a new set of two dimensionless groups for turbulent pipe flow that are ordered by relevance to the system; the precise notion of relevance is closely tied to the derivative-based global sensitivity metric from Sobol' and Kucherenko.
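Limitation (i) can be illustrated with a short sketch of classical dimensional analysis for pipe flow. A product of powers of the variables is dimensionless exactly when its exponent vector lies in the null space of the dimension matrix, and any combination of null-space vectors is again dimensionless, so the groups are not unique. The variable set (density, velocity, diameter, viscosity, roughness) and the helper name `units_of` are assumptions made for illustration; this is not either of the paper's two algorithms.

```python
# Dimension matrix (assumed variable set, for illustration only).
# Rows: mass, length, time.
# Columns: density rho, velocity U, diameter D, viscosity mu, roughness eps.
DIM = [
    [1, 0, 0, 1, 0],     # mass
    [-3, 1, 1, -1, 1],   # length
    [0, -1, 0, -1, 0],   # time
]

def units_of(exponents):
    """Return the (mass, length, time) exponents of prod(variable**exponent)."""
    return [sum(d * e for d, e in zip(row, exponents)) for row in DIM]

# Reynolds number rho*U*D/mu and relative roughness eps/D.
reynolds = [1, 1, 1, -1, 0]
rel_rough = [0, 0, -1, 0, 1]

# Any linear combination of the exponent vectors is also dimensionless,
# e.g. Re * (eps/D)**2, which is why classical groups are not unique.
mixed = [r + 2 * k for r, k in zip(reynolds, rel_rough)]

for name, v in [("Re", reynolds), ("eps/D", rel_rough), ("mixed", mixed)]:
    print(name, units_of(v))  # each prints [0, 0, 0]
```

Classical analysis stops at identifying this null space; choosing a particular basis and ranking its members by relevance is where the data-driven step described above comes in.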