NI Aug 21, 2010
Control and Optimization Meet the Smart Power Grid - Scheduling of Power Demands for Optimal Energy Management
Iordanis Koutsopoulos, Leandros Tassiulas
The smart power grid aims at harnessing information and communication technologies to enhance reliability and enforce sensible use of energy. Its realization is driven by the fundamental goal of effective management of demand load. In this work, we envision a scenario with real-time communication between the operator and consumers. The grid operator controller receives requests for power demands from consumers, each with its own power requirement, duration, and deadline by which it must be completed. The objective is to devise a power demand task scheduling policy that minimizes the grid operational cost over a time horizon. The operational cost is a convex function of instantaneous power consumption and reflects the fact that each additional unit of power needed to serve demands is more expensive as demand load increases. First, we study the off-line demand scheduling problem, where parameters are fixed and known. Next, we devise a stochastic model for the case when demands are generated continually and scheduling decisions are taken online, and we focus on long-term average cost. We present two instances of power consumption control based on observing current consumption. In the first, the controller may choose to serve a new demand request upon arrival or to postpone it to the end of its deadline. In the second, the controller has the additional option to activate one of the postponed demands when an active demand terminates. For both instances, the optimal policies are threshold based. We derive a lower performance bound over all policies, which is asymptotically tight as deadlines increase. We propose the Controlled Release threshold policy and prove it is asymptotically optimal. The policy activates a new demand request if the current power consumption is less than a threshold; otherwise, the request is queued. Queued demands are scheduled when their deadline expires or when the consumption drops below the threshold.
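The threshold rule described above can be illustrated with a small discrete-time simulation. This is only a sketch under stated assumptions, not the authors' model: time is slotted, `deadline` is interpreted as the latest slot in which a task may be started, queued demands are released in FIFO order, and the quadratic cost is just one example of a convex cost. The names `Demand`, `controlled_release`, and `quadratic_cost` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Demand:
    arrival: int   # slot in which the request arrives
    power: float   # power drawn in every slot while the task is active
    duration: int  # number of slots the task runs once started
    deadline: int  # ASSUMPTION: latest slot in which the task may be started


def controlled_release(
    demands: List[Demand], threshold: float, horizon: int
) -> Tuple[List[float], List[Optional[int]]]:
    """Simulate a threshold rule; return per-slot load and start slots."""
    load = [0.0] * horizon
    starts: List[Optional[int]] = [None] * len(demands)
    queue: List[int] = []  # indices of postponed demands, FIFO (assumption)

    def activate(i: int, t: int) -> None:
        starts[i] = t
        for s in range(t, min(t + demands[i].duration, horizon)):
            load[s] += demands[i].power

    for t in range(horizon):
        # New arrivals join the queue and are considered in this same slot.
        queue.extend(i for i, d in enumerate(demands) if d.arrival == t)
        # Demands whose deadline has expired are started unconditionally.
        for i in [i for i in queue if demands[i].deadline <= t]:
            activate(i, t)
        queue = [i for i in queue if starts[i] is None]
        # Release queued demands while consumption is below the threshold.
        while queue and load[t] < threshold:
            activate(queue.pop(0), t)
    return load, starts


def quadratic_cost(load: List[float]) -> float:
    """One example of a convex cost of instantaneous consumption."""
    return sum(x * x for x in load)


demands = [Demand(0, 2.0, 2, 3), Demand(0, 2.0, 2, 3)]
smoothed, smoothed_starts = controlled_release(demands, threshold=2.0, horizon=6)
greedy, greedy_starts = controlled_release(demands, threshold=float("inf"), horizon=6)
print(smoothed, quadratic_cost(smoothed))
print(greedy, quadratic_cost(greedy))
```

In this toy instance the threshold defers the second task until the first one ends, which flattens the load profile and lowers the convex cost relative to serving both requests on arrival.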
NI Sep 11, 2012
Competition and Regulation in Wireless Services Markets
Omer Korcak, George Iosifidis, Tansu Alpcan et al.
We consider a wireless services market where a set of operators compete for a large common pool of users. The latter have a reservation utility of U0 units or, equivalently, an alternative option to satisfy their communication needs. The operators must satisfy these minimum requirements in order to attract the users. We model the users' decisions and interaction as an evolutionary game and the competition among the operators as a noncooperative price game, which is proved to be a potential game. For each set of prices selected by the operators, the evolutionary game attains a different stationary point. We show that the outcomes of both games depend on the reservation utility of the users and the amount of spectrum W the operators have at their disposal. We express the market welfare and the revenue of the operators as functions of these two parameters. Accordingly, we consider the scenario where a regulating agency is able to intervene and change the outcome of the market by tuning W and/or U0. Different regulators may have different objectives and criteria according to which they intervene. We analyze the various possible regulation methods and discuss their requirements, implications and impact on the market.
NI Dec 13, 2011
Incentive Mechanisms for Hierarchical Spectrum Markets
George Iosifidis, Anil Kumar Chorppath, Tansu Alpcan et al.
In this paper, we study spectrum allocation mechanisms in hierarchical multi-layer markets, which are expected to proliferate in the near future based on the current spectrum policy reform proposals. We consider a setting where a state agency sells spectrum channels to Primary Operators (POs), who subsequently resell them to Secondary Operators (SOs) through auctions. We show that these hierarchical markets do not result in the socially efficient spectrum allocation that the agency aims for, owing to the lack of coordination among the entities in different layers and the inherently selfish revenue-maximizing strategy of POs. In order to reconcile these opposing objectives, we propose an incentive mechanism which aligns the strategy and the actions of the POs with the objective of the agency, and thus leads to system performance improvement in terms of social welfare. This pricing-based scheme constitutes a method for hierarchical market regulation. A basic component of the proposed incentive mechanism is a novel auction scheme which enables POs to allocate their spectrum by balancing their derived revenue and the welfare of the SOs.
DC May 5, 2025
Large Language Model Partitioning for Low-Latency Inference at the Edge
Dimitrios Kafetzis, Ramin Khalili, Iordanis Koutsopoulos
Large Language Models (LLMs) based on autoregressive, decoder-only Transformers generate text one token at a time, where a token represents a discrete unit of text. As each newly produced token is appended to the partial output sequence, the length grows and so does the memory and compute load, due to the expanding key-value caches, which store intermediate representations of all previously generated tokens in the multi-head attention (MHA) layer. As this iterative process steadily increases memory and compute demands, layer-based partitioning in resource-constrained edge environments often results in memory overload or high inference latency. To address this and reduce inference latency, we propose a resource-aware Transformer architecture partitioning algorithm, where the partitioning decision is updated at regular intervals during token generation. The approach is myopic in that it is based on instantaneous information about device resource availability and network link bandwidths. When first executed, the algorithm places blocks on devices, and in later executions, it migrates these blocks among devices so that the sum of migration delay and inference delay remains low. Our approach partitions the decoder at the attention head level, co-locating each attention head with its key-value cache and allowing dynamic migrations whenever resources become tight. By allocating different attention heads to different devices, we exploit parallel execution of attention heads and thus achieve substantial reductions in inference delays. Our experiments show that in small-scale settings (3-5 devices), the proposed method achieves within 15 to 20 percent of an exact optimal solver's latency, while in larger-scale tests it achieves notable improvements in inference speed and memory usage compared to state-of-the-art layer-based partitioning approaches.
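To make the idea of head-level placement concrete, here is a minimal greedy sketch. It is not the algorithm proposed in the paper: it ignores migration delay, link bandwidth, and cache growth over time, and simply assigns each attention head (together with its key-value cache) to the device that would finish it earliest while still having enough free memory. All names and the cost model (`head_flops / dev_speed`) are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def place_heads(
    head_mem: List[float],
    head_flops: List[float],
    dev_mem: List[float],
    dev_speed: List[float],
) -> Tuple[Dict[int, int], float]:
    """Greedy head-to-device placement (illustrative, hypothetical cost model).

    Heads are considered in order of decreasing compute. Each head goes to
    the feasible device with the earliest resulting finish time. Returns the
    placement and the largest per-device busy time, a rough latency proxy
    when heads run in parallel across devices.
    """
    free = list(dev_mem)            # remaining memory per device
    busy = [0.0] * len(dev_mem)     # accumulated compute time per device
    placement: Dict[int, int] = {}

    for h in sorted(range(len(head_mem)), key=lambda i: -head_flops[i]):
        feasible = [d for d in range(len(free)) if free[d] >= head_mem[h]]
        if not feasible:
            raise ValueError(f"no device has enough memory for head {h}")
        best = min(feasible, key=lambda d: busy[d] + head_flops[h] / dev_speed[d])
        placement[h] = best
        free[best] -= head_mem[h]   # head and its KV cache are co-located
        busy[best] += head_flops[h] / dev_speed[best]

    return placement, max(busy)


placement, latency = place_heads(
    head_mem=[1.0, 1.0, 1.0, 1.0],
    head_flops=[1.0, 1.0, 1.0, 1.0],
    dev_mem=[2.0, 2.0],
    dev_speed=[1.0, 1.0],
)
print(placement, latency)
```

Re-running such a placement at regular intervals with updated memory figures would mimic the periodic re-partitioning described above, although a faithful version would also have to charge for moving a head and its cache between devices.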
LG Mar 10, 2025
Joint Explainability-Performance Optimization With Surrogate Models for AI-Driven Edge Services
Foivos Charalampakos, Thomas Tsouparopoulos, Iordanis Koutsopoulos
Explainable AI (XAI) is a crucial component for edge services, as it ensures reliable decision making based on complex AI models. Surrogate models are a prominent approach to XAI in which human-interpretable models, such as a linear regression model, are trained to approximate a complex (black-box) model's predictions. This paper delves into the balance between the predictive accuracy of complex AI models and their approximation by surrogate ones, advocating that both these models benefit from being learned simultaneously. We derive a joint (bi-level) training scheme for both models and we introduce a new algorithm based on multi-objective optimization (MOO) to simultaneously minimize both the complex model's prediction error and the error between its outputs and those of the surrogate. Our approach leads to improvements that exceed 99% in the approximation of the black-box model through the surrogate one, as measured by the metric of Fidelity, for a compromise of less than 3% absolute reduction in the black-box model's predictive accuracy, compared to single-task and multi-task learning baselines. By improving Fidelity, we can derive more trustworthy explanations of the complex model's outcomes from the surrogate, enabling reliable AI applications for intelligent services at the network edge.
DC Apr 22, 2025
Collaborative Split Federated Learning with Parallel Training and Aggregation
Yiannis Papageorgiou, Yannis Thomas, Alexios Filippakopoulos et al.
Federated learning (FL) operates based on model exchanges between the server and the clients, and it suffers from significant client-side computation and communication burden. Split federated learning (SFL) arises as a promising solution by splitting the model into two parts that are trained sequentially: the clients train the first part of the model (client-side model) and transmit it to the server, which trains the second (server-side model). Existing SFL schemes, though, still exhibit long training delays and significant communication overhead, especially when clients of different computing capability participate. Thus, we propose Collaborative-Split Federated Learning (C-SFL), a novel scheme that splits the model into three parts, namely the model parts trained at the computationally weak clients, the ones trained at the computationally strong clients, and the ones at the server. Unlike existing works, C-SFL enables parallel training and aggregation of the model's parts at the clients and at the server, resulting in reduced training delays and communication overhead while improving the model's accuracy. Experiments verify the multiple gains of C-SFL against the existing schemes.
LG Apr 11, 2025
Explainability and Continual Learning meet Federated Learning at the Network Edge
Thomas Tsouparopoulos, Iordanis Koutsopoulos
As edge devices become more capable and pervasive in wireless networks, there is growing interest in leveraging their collective compute power for distributed learning. However, optimizing learning at the network edge entails unique challenges, particularly when moving beyond conventional settings and objectives. While Federated Learning (FL) has emerged as a key paradigm for distributed model training, critical challenges persist. First, existing approaches often overlook the trade-off between predictive accuracy and interpretability. Second, they struggle to integrate inherently explainable models such as decision trees because their non-differentiable structure makes them not amenable to backpropagation-based training algorithms. Lastly, they lack meaningful mechanisms for continual Machine Learning (ML) model adaptation through Continual Learning (CL) in resource-limited environments. In this paper, we pave the way for a set of novel optimization problems that emerge in distributed learning at the network edge with wirelessly interconnected edge devices, and we identify key challenges and future directions. Specifically, we discuss how Multi-objective optimization (MOO) can be used to address the trade-off between predictive accuracy and explainability when using complex predictive models. Next, we discuss the implications of integrating inherently explainable tree-based models into distributed learning settings. Finally, we investigate how CL strategies can be effectively combined with FL to support adaptive, lifelong learning when limited-size buffers are used to store past data for retraining. Our approach offers a cohesive set of tools for designing privacy-preserving, adaptive, and trustworthy ML solutions tailored to the demands of edge computing and intelligent services.
LG Feb 20, 2022
Personalized Federated Learning with Exact Stochastic Gradient Descent
Sotirios Nikoloutsopoulos, Iordanis Koutsopoulos, Michalis K. Titsias
We propose a Stochastic Gradient Descent (SGD)-type algorithm for Personalized Federated Learning which can be particularly attractive for mobile energy-limited regimes due to its low per-client computational cost. The model to be trained includes a set of common weights for all clients, and a set of personalized weights that are specific to each client. At each optimization round, randomly selected clients perform multiple full gradient-descent updates over their client-specific weights towards optimizing the loss function on their own datasets, without updating the common weights. This procedure is energy-efficient since it has low computational cost per client. At the final update of each round, each client computes the joint gradient over both the client-specific and the common weights and returns the gradient of the common weights to the server, which allows an exact SGD step over the full set of weights to be performed in a distributed manner. For the overall optimization scheme, we rigorously prove convergence, even in non-convex settings such as those encountered when training neural networks, with a rate of $\mathcal{O} \left (\frac{1}{\sqrt{T}} \right )$ with respect to communication rounds $T$. In practice, the proposed algorithm, PFLEGO, exhibits substantially lower per-round wall-clock time, used as a proxy for energy. Our theoretical guarantees translate to superior performance in practice against baselines such as FedAvg and FedPer, as evaluated on several multi-class classification datasets, in particular, Omniglot, CIFAR-10, MNIST, Fashion-MNIST, and EMNIST.
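The round structure described above (several local updates on the personalized weights only, then one joint gradient whose common-weight part is sent to the server) can be sketched on a toy problem. This is an illustration under strong assumptions, not the paper's implementation: the model is the scalar regression `y_hat = w * x + b_i` with one common weight `w` and one personalized weight `b_i` per client, the loss is mean squared error, and the server simply averages the returned gradients. The function names are hypothetical.

```python
import random
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # list of (x, y) pairs


def client_round(w: float, b: float, data: Dataset, inner_steps: int, lr: float):
    """Run one client's work for a round; return (new b, gradient w.r.t. w)."""
    n = len(data)
    # Full-gradient updates on the personalized weight only; w stays fixed.
    for _ in range(inner_steps - 1):
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        b -= lr * grad_b
    # Final update: joint gradient over both the personalized and common weight.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    b -= lr * grad_b
    return b, grad_w  # only the common-weight gradient goes to the server


def train(clients: List[Dataset], rounds: int, clients_per_round: int,
          inner_steps: int, lr: float, seed: int = 0):
    rng = random.Random(seed)
    w = 0.0                       # common weight, kept at the server
    b = [0.0] * len(clients)      # personalized weights, kept at the clients
    for _ in range(rounds):
        selected = rng.sample(range(len(clients)), clients_per_round)
        grads = []
        for i in selected:
            b[i], g = client_round(w, b[i], clients[i], inner_steps, lr)
            grads.append(g)
        # ASSUMPTION: the server averages the returned gradients.
        w -= lr * sum(grads) / len(grads)
    return w, b


# Two clients share the slope 2 but have different offsets (+1 and -1).
xs = [-1.0, 0.0, 1.0]
clients = [[(x, 2 * x + 1.0) for x in xs], [(x, 2 * x - 1.0) for x in xs]]
w, b = train(clients, rounds=200, clients_per_round=2, inner_steps=3, lr=0.1)
print(w, b)
```

On this toy data the common weight converges to the shared slope while each personalized weight converges to its client's offset, which is the division of labor the abstract describes.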