Fabio Andrijauskas

DC
3 papers
6 citations
Novelty 17%
AI Score 36

3 Papers

34.7 DC May 14
Open Science Data Federation -- operation and monitoring

Fabio Andrijauskas, Derek Weitzel, Frank Wuerthwein

Extensive data processing is becoming commonplace in many fields of science. Distributing data to processing sites and giving collaborators efficient ways to share that data have become essential. The Open Science Data Federation (OSDF) builds upon the successful StashCache project to create a global data access network. The OSDF extends StashCache with new data origins and caches, access methods, monitoring, and accounting mechanisms. Additionally, the OSDF has become an integral part of the U.S. national cyberinfrastructure landscape due to the sharing requirements of recent NSF solicitations, which the OSDF is uniquely positioned to enable. The OSDF continues to be used by many research collaborations and individual users, who pull the data to many research infrastructures and projects.

24.9 DC May 14
Using the Open Science Data Federation for data distribution: Big Bear Solar Observatory use case

Sydney Montiel, Alexsandra Guadarrama, Fabio Andrijauskas

Extensive data processing is now standard in many scientific fields, so distributing data efficiently to processing sites and enabling seamless sharing have become crucial. The Open Science Data Federation (OSDF) builds on the success of the StashCache project to establish a global data distribution network. By expanding StashCache, the OSDF adds data origins and caches (20 origins and 30 caches), which improves accessibility and performance, along with new access methods and monitoring and accounting mechanisms. Additionally, the OSDF has become essential to the US national cyberinfrastructure landscape due to the sharing requirements of recent NSF solicitations. One use case for the OSDF is access to data from the Big Bear Solar Observatory (BBSO). Integrating the BBSO data into the OSDF provided standard and reliable data access, and the OSDF caches make the data available locally worldwide. With the BBSO data in the OSDF, it was possible to create a pipeline that applies image processing techniques to all BBSO images from anywhere on the planet.

2.4 IR May 13
Benchmarking the Open Science Data Federation services to develop XRootD best practices

Fabio Andrijauskas, Igor Sfiligoi, Frank Würthwein

Research has become dependent on processing power and storage, and data sharing is one crucial aspect. The Open Science Data Federation (OSDF) project aims to create a global scientific data distribution network based on the Pelican Platform. OSDF relies on the XRootD and Pelican projects. The OSDF team therefore needs to understand the limits of XRootD under various configuration options, including transfer rate limits, proper buffer configuration, and the effect of storage type. We have thus executed a set of benchmarks to produce recommendations to share with the XRootD and Pelican teams. This work describes the tests performed on National Research Platform (NRP) hosts and their results. The tests cover various file sizes and numbers of parallel streams and use clients at various distances from the server host. We also used several standalone clients (wget, curl, pelican) and the native HTCondor file transfer mechanisms. Applying this methodology makes it possible to track how XRootD and the Pelican layer perform in different scenarios.
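
The abstract describes a benchmark matrix over file sizes and parallel streams. As a rough illustration only, and not the authors' actual harness, such a sweep could be sketched as below. The names `run_benchmark`, `transfer`, and `fake_transfer` are hypothetical; `fake_transfer` only copies bytes in memory and stands in for a real client invocation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_benchmark(transfer, file_sizes, stream_counts):
    """Time `transfer` for every (file size, parallel streams) combination.

    `transfer` is a hypothetical callable taking a size in bytes and
    returning the number of bytes it actually moved.
    """
    results = []
    for size, streams in product(file_sizes, stream_counts):
        start = time.perf_counter()
        # Launch `streams` concurrent transfers of the same size.
        with ThreadPoolExecutor(max_workers=streams) as pool:
            moved = sum(pool.map(transfer, [size] * streams))
        elapsed = time.perf_counter() - start
        results.append({
            "size_bytes": size,
            "streams": streams,
            "bytes_moved": moved,
            "seconds": elapsed,
            # Aggregate throughput across all streams, in bytes per second.
            "throughput_Bps": moved / elapsed if elapsed > 0 else float("inf"),
        })
    return results


def fake_transfer(size_bytes):
    """Placeholder for a real client call; copies bytes in memory only."""
    data = bytes(size_bytes)
    return len(bytearray(data))


if __name__ == "__main__":
    for row in run_benchmark(fake_transfer, [1024, 1024 * 1024], [1, 4]):
        print(row)
```

In a real measurement, `fake_transfer` would be replaced by a wrapper around one of the clients named in the abstract, and the sweep would be repeated from clients at different network distances from the server.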