46.9 CR May 28
An Organization-Scoped LLM Agent Runtime Architecture for Regulated Cybersecurity Operations
George Fatouros, Georgios Makridis, George Kousiouris et al.
Regulated cybersecurity workflows lack a runtime substrate that enforces organization-level scope across retrieval, tool calls, memory, findings, reports, and audit while remaining model-agnostic and locally deployable. Recent large language model (LLM) agent systems report strong results on isolated cybersecurity tasks, yet they do not by themselves define an auditable platform architecture for regulated security operations centre (SOC) and compliance workflows, where a single analyst may trigger actions that bind the organization, and where the runtime must integrate with existing SIEM/XDR stacks as a primary source of context and alert-driven triggers rather than operate as a standalone analytical layer. This paper proposes an organization-scoped LLM agent runtime architecture for financial cybersecurity. The contribution is a typed Security Context that is created at every entry point, including SIEM/XDR notifications ingested as first-class triggers, and enforced at every component boundary, combined with a shared Runtime Core, logical specialist subagents, a governed Tool Adapter Layer exposing SIEM/XDR query, enrichment, and response primitives under uniform policy and audit, structured findings with evidence references, tiered human-in-the-loop (HITL) gates, and append-only audit. Model Context Protocol (MCP), extended telemetry, digital twins for pentesting, graph retrieval, and federated knowledge sharing are treated as optional extension paths rather than mandatory runtime assumptions. We describe an implementable slice as the architecture's testability surface, and we propose a falsifiable evaluation plan with metric-level pass criteria for architecture readiness, security-policy enforcement, evidence traceability, output quality, and operational observability.
HC Jul 9, 2024
Evaluating Human-AI Collaboration: A Review and Methodological Framework
George Fragiadakis, Christos Diou, George Kousiouris et al.
The use of artificial intelligence (AI) in working environments with individuals, known as Human-AI Collaboration (HAIC), has become essential in a variety of domains, boosting decision-making, efficiency, and innovation. Despite HAIC's wide potential, evaluating its effectiveness remains challenging due to the complex interaction of the components involved. This paper provides a detailed analysis of existing HAIC evaluation approaches and develops a fresh paradigm for evaluating these systems more effectively. Our framework includes a structured decision tree that helps select relevant metrics based on distinct HAIC modes (AI-Centric, Human-Centric, and Symbiotic). By including both quantitative and qualitative metrics, the framework seeks to represent HAIC's dynamic and reciprocal nature, enabling the assessment of its impact and success. The framework's practicality can be examined by applying it in an array of domains, including manufacturing, healthcare, finance, and education, each of which has unique challenges and requirements. Our hope is that this study will facilitate further research on the systematic evaluation of HAIC in real-world applications.
SE Feb 19, 2022 Code
Combining Node-RED and Openwhisk for Pattern-based Development and Execution of Complex FaaS Workflows
George Kousiouris, Szymon Ambroziak, Domenico Costantino et al.
Modern cloud computing advances have been pressing application modernization over the last 15 years, stressing the need for application redesign towards the use of more distributed and ephemeral resources. From the initial IaaS and PaaS approaches, to microservices and now to the serverless model (and especially the Function as a Service approach), new challenges arise constantly for application developers. This paper presents a design and development environment that aims to ease application evolution and migration to the new FaaS model, based on the widely used Node-RED open source tool. The goal of the environment is to enable more user-friendly and abstract function and workflow creation for complex FaaS applications. To this end, it bypasses workflow description and function reuse limitations of the current FaaS platforms, by providing an extendable, pattern-enriched palette of ready-made, reusable functionality that can be combined in arbitrary ways. The environment embeds seamless DevOps processes for generating the deployable artefacts (i.e. functions and images) of the FaaS platform (Openwhisk). Annotation mechanisms are also available for the developer to dictate diverse execution options or management guidelines towards the deployment and operation stacks. The evaluation is based on case studies of indicative scenarios, including creating, registering and executing functions and flows based on the Node-RED runtime, embedding of existing legacy code in a FaaS environment, parallelizing a workload, collecting data at the edge and creating function orchestrators to accompany the application. For the latter, a detailed argument is provided as to why this process should not be constrained by the "double billing" principle of FaaS.
71.7 AI May 3
CyberAId: AI-Driven Cybersecurity for Financial Service Providers
George Fatouros, Georgios Makridis, John Soldatos et al.
European financial institutions face mounting regulatory pressure while their security operations centres remain constrained not by data or staffing but by reasoning capacity: enterprise SIEMs cover only a fraction of MITRE ATT&CK techniques, two-thirds of SOC teams cannot keep pace with alert volumes, and the majority of breaches are preceded by alerts that are generated but never investigated. Frontier large language models now achieve state-of-the-art results on isolated cybersecurity tasks (one-day vulnerability exploitation, code-level patching, intrusion detection), yet no narrow win constitutes a platform that can compose across functions, persist multi-tenant state, map findings to regulatory regimes and survive an audit. This position paper argues that the right unit of construction is a hybrid multi-agent system in which specialised LLM subagents reason over classical SIEM/XDR telemetry rather than replacing it, share accumulated agent state across institutions through privacy-preserving federation, and can connect to complementary capability packs such as quantum-based authentication, digital twins for adversarial validation, and eBPF-based kernel telemetry. We present CyberAId, a model-agnostic, on-premise-deployable platform in which a Main Agent coordination layer, a Reporting capability, and specialist subagents operate within a shared runtime under bounded human-in-the-loop autonomy, organised around four falsifiable design principles, and aligned with relevant regulations. CyberAId will be validated on four representative financial use cases (client impersonation, anti-money-laundering for payment service providers, retail-banking incident response, and high-frequency-trading resilience), and we propose skill-based agent adaptation as the most promising research direction for turning each deployment into a contribution to a continuously refined collective defence.
AI Apr 16, 2025
Towards Conversational AI for Human-Machine Collaborative MLOps
George Fatouros, Georgios Makridis, George Kousiouris et al.
This paper presents a Large Language Model (LLM)-based conversational agent system designed to enhance human-machine collaboration in Machine Learning Operations (MLOps). We introduce the Swarm Agent, an extensible architecture that integrates specialized agents to create and manage ML workflows through natural language interactions. The system leverages a hierarchical, modular design incorporating a KubeFlow Pipelines (KFP) Agent for ML pipeline orchestration, a MinIO Agent for data management, and a Retrieval-Augmented Generation (RAG) Agent for domain-specific knowledge integration. Through iterative reasoning loops and context-aware processing, the system enables users with varying technical backgrounds to discover, execute, and monitor ML pipelines; manage datasets and artifacts; and access relevant documentation, all via intuitive conversational interfaces. Our approach addresses the accessibility gap in complex MLOps platforms like Kubeflow, making advanced ML tools broadly accessible while maintaining the flexibility to extend to other platforms. The paper describes the architecture and implementation details, and demonstrates how this conversational MLOps assistant reduces complexity and lowers barriers to entry for users across diverse technical skill levels.