Dimitrios Sikeridis

h-index: 1
Papers: 2

2 Papers

LG | Dec 12, 2024 | Code
PickLLM: Context-Aware RL-Assisted Large Language Model Routing

Dimitrios Sikeridis, Dennis Ramdass, Pranay Pareek

Recently, the number of off-the-shelf Large Language Models (LLMs) has exploded, with many open-source options. This creates a diverse landscape in terms of both serving options (e.g., inference on local hardware vs. remote LLM APIs) and heterogeneous model expertise. However, it is hard for the user to optimize efficiently for operational cost (pricing structures, expensive LLMs-as-a-service for large querying volumes), efficiency, or even case-specific measures such as response accuracy, bias, or toxicity. Also, existing LLM routing solutions focus mainly on cost reduction, with response accuracy optimizations relying on non-generalizable supervised training, and ensemble approaches requiring output computation for every considered LLM candidate. In this work, we tackle the challenge of selecting the optimal LLM from a model pool for specific queries with customizable objectives. We propose PickLLM, a lightweight framework that relies on Reinforcement Learning (RL) to route on-the-fly queries to available models. We introduce a weighted reward function that considers per-query cost, inference latency, and model response accuracy as measured by a customizable scoring function. Regarding the learning algorithms, we explore two alternatives: the PickLLM router acting as a learning automaton that uses gradient ascent to select a specific LLM, or using stateless Q-learning to explore the set of LLMs and perform selection with an $\epsilon$-greedy approach. The algorithm converges to a single LLM for the remaining session queries. To evaluate, we use a pool of four LLMs and benchmark prompt-response datasets with different contexts. A separate scoring function assesses response accuracy during the experiment. We demonstrate the speed of convergence for different learning rates and the improvement in hard metrics such as cost per querying session and overall response latency.
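The stateless Q-learning variant with $\epsilon$-greedy selection mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear form of the reward, the weights `w_cost` and `w_latency`, the model names, and the per-model score, cost, and latency profiles are all hypothetical.

```python
import random


def reward(score, cost, latency, w_cost=0.3, w_latency=0.2):
    """Assumed weighted reward: accuracy score minus cost and latency penalties."""
    return score - w_cost * cost - w_latency * latency


class StatelessQRouter:
    """Keeps one Q-value per model (no state) and selects epsilon-greedily."""

    def __init__(self, models, alpha=0.1, epsilon=0.1, seed=0):
        self.models = list(models)
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration probability
        self.q = {m: 0.0 for m in self.models}
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best Q-value.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.models)
        return max(self.models, key=self.q.get)

    def update(self, model, r):
        # Stateless Q-learning update: move Q toward the observed reward.
        self.q[model] += self.alpha * (r - self.q[model])


def simulate(steps=3000):
    # Hypothetical (score, cost, latency) profiles, all normalized to [0, 1].
    profiles = {
        "model_a": (0.9, 1.0, 0.8),
        "model_b": (0.8, 0.2, 0.3),
        "model_c": (0.5, 0.1, 0.1),
        "model_d": (0.6, 0.5, 0.5),
    }
    router = StatelessQRouter(profiles)
    noise = random.Random(1)
    for _ in range(steps):
        model = router.select()
        score, cost, latency = profiles[model]
        # Small noise stands in for query-to-query variation in the score.
        noisy_score = score + noise.uniform(-0.05, 0.05)
        router.update(model, reward(noisy_score, cost, latency))
    return router


if __name__ == "__main__":
    router = simulate()
    print(max(router.q, key=router.q.get), router.q)
```

Under these assumed profiles, `model_b` has the highest expected reward, so the router's greedy choice should settle on it once exploration has sampled it often enough. Changing the weights shifts which model wins, which is the kind of customization the abstract describes.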

GT | Jan 27, 2022
Smart City Defense Game: Strategic Resource Management during Socio-Cyber-Physical Attacks

Dimitrios Sikeridis, Michael Devetsikiotis

Ensuring public safety in a Smart City (SC) environment is a critical and increasingly complicated task due to the involvement of multiple agencies and the city's expansion across cyber and social layers. In this paper, we propose an extensive-form, perfect-information game to model interactions and optimal city resource allocations when a Terrorist Organization (TO) performs attacks on multiple targets across two conceptual SC levels: a physical one and a cyber-social one. The Smart City Defense Game (SCDG) considers three players, each initially entitled to a specific finite budget. Two SC agencies, which must defend their physical and social territories respectively, fight against a common enemy, the TO. Each layer consists of multiple targets, and the attack outcome at a target depends on whether the resources allocated there by the associated agency exceed the TO's. Each player's utility is equal to the number of successfully defended targets. The two agencies are allowed to make budget transfers provided that the transfer is beneficial for both. We completely characterize the Subgame Perfect Nash Equilibrium (SPNE) of the SCDG, which consists of strategies for optimal resource exchanges between SC agencies and accounts for the TO's budget allocation across the physical and social targets. We also present numerical and comparative results demonstrating that when the SC players act according to the SPNE, they maximize the number of successfully defended targets. The SCDG is shown to be a promising solution for modeling critical resource allocations between SC parties in the face of simultaneous multi-layer terrorist attacks.
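The per-target outcome rule in the abstract can be sketched as a simple count. This is a hypothetical illustration only: the function names, the allocation vectors, and the tie-breaking rule (a tie counts as a successful attack, following the "exceed" wording) are assumptions, and the sketch does not compute the equilibrium derived in the paper.

```python
def defended_targets(defense, attack):
    """Count targets where the agency's allocation strictly exceeds the TO's.

    Assumption: a tie is treated as a successful attack.
    """
    if len(defense) != len(attack):
        raise ValueError("one allocation per target is required on both sides")
    return sum(1 for d, a in zip(defense, attack) if d > a)


def total_defended(physical_def, physical_att, social_def, social_att):
    """Sum defended targets over the physical and cyber-social layers."""
    return (defended_targets(physical_def, physical_att)
            + defended_targets(social_def, social_att))


if __name__ == "__main__":
    # Hypothetical allocations: three physical targets, two cyber-social targets.
    print(total_defended([2, 2, 2], [1, 3, 1], [1, 1], [2, 0]))
```

A budget transfer between the two agencies would change the defense vectors before this count is taken, which is how a transfer can raise the number of defended targets on one layer without lowering it on the other.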