Fernando Koch

h-index: 21

Papers: 5

Citations: 8

Novelty: 39%

AI Score: 37

Ranked #116,587 of 201,326 authors (top 58%) · #666 in DC (top 58%)

5 Papers

DC · Nov 30, 2025

Joint Partitioning and Placement of Foundation Models for Real-Time Edge AI

Aladin Djuhera, Fernando Koch, Alecio Binotto

Inference over large-scale foundation models within heterogeneous edge environments necessitates a fundamentally reconfigurable orchestration substrate. Static partitioning of model layers presumes temporal stability across compute and network resources, which is misaligned with the volatility of real-world deployments. We introduce a framework in which both the spatial placement and internal segmentation of foundation models are elevated to runtime-resolved constructs. The orchestration problem is formalized as a constrained optimization over layer-wise assignments, subject to evolving latency, utilization, and privacy gradients. The framework implements reactive inference composition responsive to infrastructural fluctuations by integrating model-aware capacity profiling with dynamic graph re-partitioning and reallocation. We introduce architectural and algorithmic components, along with a representative use case in 6G multi-access edge computing.
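One way to read "constrained optimization over layer-wise assignments" is as a binary assignment program. The formulation below is only an illustrative sketch, not the authors' formalization, and every symbol in it is an assumption introduced here: x_{l,n} marks layer l as placed on node n, w_l is the compute cost of layer l, s_n(t) the time-varying speed of node n, a_l the activation size passed from layer l to layer l+1, b_{m,n}(t) the link rate between nodes m and n, u_l the resource demand of layer l, U_n(t) the available capacity of node n, and P(t) the set of (layer, node) pairs permitted by the current privacy policy.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{l=1}^{L}\sum_{n=1}^{N} x_{l,n}\,\frac{w_l}{s_n(t)}
  + \sum_{l=1}^{L-1}\sum_{m \neq n} x_{l,m}\,x_{l+1,n}\,\frac{a_l}{b_{m,n}(t)} \\
\text{s.t.}\quad & \sum_{n=1}^{N} x_{l,n} = 1 \quad \forall l, \\
& \sum_{l=1}^{L} x_{l,n}\,u_l \le U_n(t) \quad \forall n, \\
& x_{l,n} = 0 \quad \forall (l,n) \notin \mathcal{P}(t), \\
& x_{l,n} \in \{0,1\}.
\end{aligned}
```

The first term is compute latency and the second is the transfer latency paid whenever consecutive layers sit on different nodes. Because s_n, b_{m,n}, U_n, and P all depend on t, the optimal assignment can change at runtime, which is what would motivate re-partitioning and reallocation.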

CR · Mar 26

Understanding AI Methods for Intrusion Detection and Cryptographic Leakage

Reza Zilouchian, Micheal Chavez, Fernando Koch

We investigate the role of artificial intelligence in cybersecurity by evaluating how machine learning techniques can detect malicious network activity and identify potential information leakage in cryptographic implementations. We conduct a series of experiments using the NSL-KDD and CIC-IDS datasets to evaluate intrusion detection performance across controlled and shifted data environments. Our results demonstrate that AI models can achieve near-perfect detection accuracy within stable network environments. However, their performance declines when exposed to fluctuating or previously unseen traffic patterns. We also observe that learned models identify patterns consistent with side-channel leakage, suggesting that AI can assist in uncovering implementation-level vulnerabilities.

CY · May 29, 2025

SafeCOMM: A Study on Safety Degradation in Fine-Tuned Telecom Large Language Models

Aladin Djuhera, Swanand Ravindra Kadhe, Farhan Ahmed et al.

Fine-tuning large language models (LLMs) on telecom datasets is a common practice to adapt general-purpose models to the telecom domain. However, little attention has been paid to how this process may compromise model safety. Recent research has shown that even benign fine-tuning can degrade the safety alignment of LLMs, causing them to respond to harmful or unethical user queries. In this paper, we investigate this issue by fine-tuning LLMs on three representative telecom datasets and show that safety degrades even for light telecom domain adaptation. To this end, we introduce TeleHarm, the first telecom-specific red-teaming benchmark, which we use alongside established Direct-Harm and HexPhi datasets to systematically assess harmful behavior. We further extend our analysis to publicly available TeleLLMs that were continually pre-trained on large telecom corpora, revealing that safety alignment is severely lacking, primarily due to the omission of safety-focused instruction tuning. To address these issues, we evaluate three realignment defenses: SafeInstruct, SafeLoRA, and SafeMERGE. We show that, across all settings, the proposed defenses can effectively restore safety without compromising telecom task performance, leading to Safe teleCOMMunication (SafeCOMM) models. Our work serves as both a diagnostic study and practical guide for safety realignment in telecom-tuned LLMs, underscoring the need for safety-aware instruction and fine-tuning in the telecom domain.

DC · Mar 19, 2025

Intelligent Orchestration of Distributed Large Foundation Model Inference at the Edge

Fernando Koch, Aladin Djuhera, Alecio Binotto

Large Foundation Models (LFMs), including multi-modal and generative models, promise to unlock new capabilities for next-generation Edge AI applications. However, performing inference with LFMs in resource-constrained and heterogeneous edge environments, such as Multi-access Edge Computing (MEC), presents significant challenges for workload orchestration due to time-varying network, compute, and storage conditions. In particular, current split inference strategies, which partition LFM layers across nodes, are not designed to adapt to fluctuating workloads, dynamic bandwidth conditions, or evolving privacy constraints in high-utilization MEC environments. In this work, we propose a novel adaptive split inference orchestration framework that elevates both the placement and partitioning of LFM layers to runtime-tunable variables. Specifically, our framework enables real-time, quality-of-service (QoS)-aware management of inference workloads by extending conventional orchestrators with three key services: (1) Capacity-aware workload distribution, which continuously profiles node resources and selects an optimal subset of MEC nodes; (2) Dynamic partition migration, which transparently relocates pre-cut LFM segments in response to changes in utilization or network conditions; (3) Real-time reconfiguration, which dynamically re-splits LFM layers to balance latency, throughput, and privacy. We formalize the joint placement-partitioning problem, outline a reference architecture and algorithmic workflow, and discuss applicability in representative smart city, V2X, and industrial edge scenarios.

MA · Feb 21, 2025

Multi-Agent Architecture in Distributed Environment Control Systems: vision, challenges, and opportunities

Natasha Astudillo, Fernando Koch

The increasing demand for energy-efficient solutions in large-scale infrastructure, particularly data centers, requires advanced control strategies to optimize environmental management systems. We propose a multi-agent architecture for distributed control of air-cooled chiller systems in data centers. Our vision employs autonomous agents to monitor and regulate local operational parameters and optimize system-wide efficiency. We demonstrate how this approach improves the responsiveness, operational robustness, and energy efficiency of the system, contributing to the broader goal of sustainable infrastructure management.