Extensible Generic Data Management Software
This work addresses the problem of software re-use for data management across multiple domains. It appears incremental, however, because it builds on the existing iRODS system rather than introducing a fundamentally new approach.
The paper tackles the challenge of creating reusable, generic data management software by identifying three extensibility mechanisms (computer-actionable rules, micro-services, and middleware servers) drawn from collaborations with 25 science and engineering domains. Together, these mechanisms support capabilities such as data grids and archives.
Extensibility mechanisms constitute a form of knowledge capture that is essential for software re-use. The Data Intensive Cyber Environments (DICE) group has collaborated with twenty-five science and engineering domains on the application of the iRODS policy-based data management system. These collaborations indicate that three types of extensibility mechanisms are sufficient to capture the domain knowledge needed for interaction with domain resources: computer-actionable rules that control management policies; computer-executable micro-services that encapsulate operations or interaction protocols; and middleware servers that apply standard operations at remote locations. These mechanisms enable the creation of generic data management software that is capable of implementing collections, data grids for sharing data, digital libraries for publishing data, processing pipelines, and archives.
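
To make the division of labor between rules and micro-services concrete, the following is a minimal sketch in Python. It is not iRODS code and does not use the iRODS rule language or API: every name in it (`PolicyEngine`, `Rule`, `compute_checksum`, `replicate`, the `put` event, the `site-b` location) is a hypothetical stand-in chosen for illustration. It assumes that a rule pairs a triggering event and a condition with an ordered chain of named micro-services, and it does not model the third mechanism, middleware servers.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration; none of these names come from iRODS.
Context = Dict[str, Any]
MicroService = Callable[[Context], None]


def compute_checksum(ctx: Context) -> None:
    """Micro-service: store a SHA-256 digest of the object's bytes."""
    ctx["checksum"] = hashlib.sha256(ctx["data"]).hexdigest()


def replicate(ctx: Context) -> None:
    """Micro-service: record a replica at a made-up remote site."""
    ctx["replicas"].append("site-b")


@dataclass
class Rule:
    """A policy: when `event` occurs and `condition` holds, run `actions`."""
    event: str
    condition: Callable[[Context], bool]
    actions: List[str]  # names of registered micro-services, run in order


@dataclass
class PolicyEngine:
    """Generic core: knows nothing about any domain until extended."""
    services: Dict[str, MicroService] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)

    def register(self, name: str, service: MicroService) -> None:
        self.services[name] = service

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def fire(self, event: str, ctx: Context) -> List[str]:
        """Apply every matching rule; return the micro-services that ran."""
        executed: List[str] = []
        for rule in self.rules:
            if rule.event == event and rule.condition(ctx):
                for name in rule.actions:
                    self.services[name](ctx)
                    executed.append(name)
        return executed


# Domain knowledge enters only through registration, not engine changes.
engine = PolicyEngine()
engine.register("checksum", compute_checksum)
engine.register("replicate", replicate)
engine.add_rule(
    Rule(
        event="put",
        condition=lambda ctx: ctx["collection"].startswith("/archive"),
        actions=["checksum", "replicate"],
    )
)

ctx = {"collection": "/archive/climate", "data": b"sample", "replicas": []}
executed = engine.fire("put", ctx)
```

In this sketch the engine itself stays unchanged when a new domain is added: only the registered micro-services and rules differ, which is one plausible reading of how extensibility mechanisms capture domain knowledge while the core software remains generic.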