CateCom: a practical data-centric approach to categorization of computational models
The work addresses the need for well-organized structured data to facilitate AI/ML applications in data-driven science, although the contribution is incremental because it builds on existing object-oriented concepts for categorization.
The paper tackles the problem of organizing diverse computational models for structured data storage. It presents CateCom, an open-source collaborative framework that applies object-oriented design to describe models uniquely while covering the majority of widely used ones, and it includes example database schemas and a description of their deployment in software.
The advent of data-driven science in the 21st century brought about the need for well-organized structured data and associated infrastructure able to facilitate applications of Artificial Intelligence and Machine Learning. We present an effort aimed at organizing the diverse landscape of physics-based and data-driven computational models in order to facilitate the storage of associated information as structured data. We apply object-oriented design concepts and outline the foundations of an open-source collaborative framework that is: (1) capable of uniquely describing the approaches in structured data, (2) flexible enough to cover the majority of widely used models, and (3) able to utilize collective intelligence through community contributions. We present example database schemas and corresponding data structures and explain how these are deployed in software at the time of this writing.
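To make the idea of describing a model uniquely as structured data more concrete, the following is a minimal sketch in Python. It is not the CateCom schema: the class names (`Category`, `ModelDescription`), the slash-joined `path` identifier, the category slugs, and the `functional` parameter are all hypothetical, chosen only to illustrate how an object-oriented category hierarchy could map to a storable record.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class Category:
    """One level in a hypothetical category hierarchy (general to specific)."""
    slug: str
    name: str


@dataclass
class ModelDescription:
    """Hypothetical structured record describing one computational model."""
    categories: list  # Category objects, ordered from most general to most specific
    parameters: dict = field(default_factory=dict)

    @property
    def path(self) -> str:
        # Assumption: a slash-joined chain of slugs acts as the unique identifier.
        return "/".join(c.slug for c in self.categories)

    def to_json(self) -> str:
        # Serialize to a JSON document suitable for storage in a document database.
        record = {
            "path": self.path,
            "categories": [asdict(c) for c in self.categories],
            "parameters": self.parameters,
        }
        return json.dumps(record, sort_keys=True)


# Illustrative example: a physics-based model categorized down to a subtype.
example = ModelDescription(
    categories=[
        Category("pb", "physics-based"),
        Category("dft", "density functional theory"),
        Category("gga", "generalized gradient approximation"),
    ],
    parameters={"functional": "pbe"},
)

print(example.path)
print(example.to_json())
```

In this sketch, two models that share the same category chain and parameters serialize to the same record, which is one simple way to obtain a unique description. A community-maintained framework would presumably validate such records against shared schemas rather than rely on ad hoc classes.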