AI Technical Considerations: Data Storage, Cloud usage and AI Pipeline
This chapter offers practical advice for AI researchers and practitioners, especially in medical domains, but its contribution is incremental: it synthesizes existing technical concepts rather than introducing new methods.
The chapter addresses the technical challenges of managing large-scale data and computational resources for AI, particularly in medical imaging, by providing guidance on designing data storage, cloud usage, and AI pipelines to comply with standards and legal restrictions.
Artificial intelligence (AI), especially deep learning, requires vast amounts of data for training, testing, and validation. Collecting these data and the corresponding annotations requires imaging biobanks that provide standardized access to them. Such biobanks must be carefully designed and implemented according to current standards and guidelines, and must comply with current legal restrictions. However, a well-built imaging data collection is not by itself sufficient to train, validate, and deploy AI: resource demands are high and call for a careful hybrid implementation of AI pipelines, both on-premises and in the cloud. This chapter aims to support readers who have to make technical decisions about the AI environment by providing technical background on the concepts and implementation aspects involved in data storage, cloud usage, and AI pipelines.