AIJun 27, 2025
AI Model Passport: Data and System Traceability Framework for Transparent AI in HealthVarvara Kalokyri, Nikolaos S. Tachos, Charalampos N. Kalantzopoulos et al.
The increasing integration of Artificial Intelligence (AI) into health and biomedical systems necessitates robust frameworks for transparency, accountability, and ethical compliance. Existing frameworks often rely on human-readable, manual documentation which limits scalability, comparability, and machine interpretability across projects and platforms. They also fail to provide a unique, verifiable identity for AI models to ensure their provenance and authenticity across systems and use cases, limiting reproducibility and stakeholder trust. This paper introduces the concept of the AI Model Passport, a structured and standardized documentation framework that acts as a digital identity and verification tool for AI models. It captures essential metadata to uniquely identify, verify, trace and monitor AI models across their lifecycle - from data acquisition and preprocessing to model design, development and deployment. In addition, an implementation of this framework is presented through AIPassport, an MLOps tool developed within the ProCAncer-I EU project for medical imaging applications. AIPassport automates metadata collection, ensures proper versioning, decouples results from source scripts, and integrates with various development environments. Its effectiveness is showcased through a lesion segmentation use case using data from the ProCAncer-I dataset, illustrating how the AI Model Passport enhances transparency, reproducibility, and regulatory readiness while reducing manual effort. This approach aims to set a new standard for fostering trust and accountability in AI-driven healthcare solutions, aspiring to serve as the basis for developing transparent and regulation compliant AI systems across domains.
LGDec 16, 2025
Synthetic Data Blueprint (SDB): A modular framework for the statistical, structural, and graph-based evaluation of synthetic tabular dataVasileios C. Pezoulas, Nikolaos S. Tachos, Eleni Georga et al.
In the rapidly evolving era of Artificial Intelligence (AI), synthetic data are widely used to accelerate innovation while preserving privacy and enabling broader data accessibility. However, the evaluation of synthetic data remains fragmented across heterogeneous metrics, ad-hoc scripts, and incomplete reporting practices. To address this gap, we introduce Synthetic Data Blueprint (SDB), a modular Pythonic based library to quantitatively and visually assess the fidelity of synthetic tabular data. SDB supports: (i) automated feature-type detection, (ii) distributional and dependency-level fidelity metrics, (iii) graph- and embedding-based structure preservation scores, and (iv) a rich suite of data visualization schemas. To demonstrate the breadth, robustness, and domain-agnostic applicability of the SDB, we evaluated the framework across three real-world use cases that differ substantially in scale, feature composition, statistical complexity, and downstream analytical requirements. These include: (i) healthcare diagnostics, (ii) socioeconomic and financial modelling, and (iii) cybersecurity and network traffic analysis. These use cases reveal how SDB can address diverse data fidelity assessment challenges, varying from mixed-type clinical variables to high-cardinality categorical attributes and high-dimensional telemetry signals, while at the same time offering a consistent, transparent, and reproducible benchmarking across heterogeneous domains.
IVJul 6, 2021
A new smart-cropping pipeline for prostate segmentation using deep learning networksDimitrios G. Zaridis, Eugenia Mylona, Nikolaos S. Tachos et al.
Prostate segmentation from magnetic resonance imaging (MRI) is a challenging task. In recent years, several network architectures have been proposed to automate this process and alleviate the burden of manual annotation. Although the performance of these models has achieved promising results, there is still room for improvement before these models can be used safely and effectively in clinical practice. One of the major challenges in prostate MR image segmentation is the presence of class imbalance in the image labels where the background pixels dominate over the prostate. In the present work we propose a DL-based pipeline for cropping the region around the prostate from MRI images to produce a more balanced distribution of the foreground pixels (prostate) and the background pixels and improve segmentation accuracy. The effect of DL-cropping for improving the segmentation performance compared to standard center-cropping is assessed using five popular DL networks for prostate segmentation, namely U-net, U-net+, Res Unet++, Bridge U-net and Dense U-net. The proposed smart-cropping outperformed the standard center cropping in terms of segmentation accuracy for all the evaluated prostate segmentation networks. In terms of Dice score, the highest improvement was achieved for the U-net+ and ResU-net++ architectures corresponding to 8.9% and 8%, respectively.