NA May 23
Optimal Network Pricing for Oblivious Users under Projected Decision-Dependent Distributions
Yixuan Li, Andersen Ang, Sebastian Stein
Efficient large-scale network allocation requires pricing mechanisms that internalize the stochastic and non-linear dynamics of user behavior. Moving beyond classical models of strategic agents, we introduce an Optimal Network Pricing (ONP) problem for "oblivious" users. This shift introduces a Decision-Dependent (DD) environment where pricing decisions endogenously shift the flow demand distribution. A key novelty of our model is the incorporation of a projection operator, creating a nonsmooth optimization landscape. We demonstrate that Performative Stability (PS) fails in ONP, degenerating to a trivial solution. Instead, we prove that the expected objective admits a unique global optimum, termed the Projected Performative Optimum (ΠPO). To overcome the algorithmic challenges, we propose a rigorous framework combining Sample Average Approximation (SAA) with a Trust-Region Sequential Quadratic Programming (TR-SQP) solver. Our method targets ΠPO by explicitly modeling the nonsmooth Jacobian, effectively handling saturation constraints. We establish theoretical guarantees for probabilistic convexity and sample complexity, and exploit network sparsity to reduce per-iteration computational complexity to near-linear in the number of routes. Experimental validation on the classic Braess network and large-scale real-world topologies demonstrates that our ΠPO-targeting solver significantly outperforms PS-seeking heuristics and our proposed baseline. The results highlight that properly accounting for the "gating" effects of capacity unlocks substantial gains in social welfare, providing a robust foundation for network pricing.
MA Oct 5, 2022
From Intelligent Agents to Trustworthy Human-Centred Multiagent Systems
Mohammad Divband Soorati, Enrico H. Gerding, Enrico Marchioni et al.
The Agents, Interaction and Complexity research group at the University of Southampton has a long track record of research in multiagent systems (MAS). We have made substantial scientific contributions across learning in MAS, game-theoretic techniques for coordinating agent systems, and formal methods for representation and reasoning. We highlight key results achieved by the group and elaborate on recent work and open research challenges in developing trustworthy autonomous systems and deploying human-centred AI systems that aim to support societal good.
AI Feb 18, 2024
Combinatorial Client-Master Multiagent Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing
Tesfay Zemuy Gebrekidan, Sebastian Stein, Timothy J. Norman
Recently, there has been an explosion of mobile applications that perform computationally intensive tasks such as video streaming, data mining, virtual reality, augmented reality, image processing, video processing, face recognition, and online gaming. However, user devices (UDs), such as tablets and smartphones, have a limited ability to meet the computational needs of these tasks. Mobile edge computing (MEC) has emerged as a promising technology to meet the increasing computing demands of UDs. Task offloading in MEC is a strategy that meets the demands of UDs by distributing tasks between UDs and MEC servers. Deep reinforcement learning (DRL) is gaining attention in task-offloading problems because it can adapt to dynamic changes and minimize online computational complexity. However, the various types of continuous and discrete resource constraints on UDs and MEC servers pose challenges to the design of an efficient DRL-based task-offloading strategy. Existing DRL-based task-offloading algorithms focus on the constraints of the UDs, assuming the availability of enough storage resources on the server. Moreover, existing multiagent DRL (MADRL)-based task-offloading algorithms use homogeneous agents and consider homogeneous constraints as a penalty in their reward function. We propose a novel combinatorial client-master MADRL (CCM_MADRL) algorithm for task offloading in MEC (CCM_MADRL_MEC) that enables UDs to decide their resource requirements and the server to make a combinatorial decision based on the requirements of the UDs. CCM_MADRL_MEC is the first MADRL algorithm in task offloading to consider server storage capacity in addition to the constraints on the UDs. By taking advantage of the combinatorial action selection, CCM_MADRL_MEC shows superior convergence over existing MADDPG and heuristic algorithms.
HC Dec 19, 2024
Active Inference and Human-Computer Interaction
Roderick Murray-Smith, John H. Williamson, Sebastian Stein
Active Inference is a closed-loop computational theoretical basis for understanding behaviour, based on agents with internal probabilistic generative models that encode their beliefs about how hidden states in their environment cause their sensations. We review Active Inference and how it could be applied to model the human-computer interaction loop. Active Inference provides a coherent framework for managing generative models of humans, their environments, sensors and interface components. It informs off-line design and supports real-time, online adaptation. It provides model-based explanations for behaviours observed in HCI, and new tools to measure important concepts such as agency and engagement. We discuss how Active Inference offers a new basis for a theory of interaction in HCI, tools for design of modern, complex sensor-based systems, and integration of artificial intelligence technologies, enabling it to cope with diversity in human users and contexts. We discuss the practical challenges in implementing such Active Inference-based systems.
CY Nov 29, 2024
Responsible AI Governance: A Response to UN Interim Report on Governing AI for Humanity
Sarah Kiden, Bernd Stahl, Beverley Townsend et al.
This report presents a comprehensive response to the United Nations' Interim Report on Governing Artificial Intelligence (AI) for Humanity. It emphasizes the transformative potential of AI in achieving the Sustainable Development Goals (SDGs) while acknowledging the need for robust governance to mitigate associated risks. The response highlights opportunities for promoting equitable, secure, and inclusive AI ecosystems, which should be supported by investments in infrastructure and multi-stakeholder collaborations across jurisdictions. It also underscores challenges, including societal inequalities exacerbated by AI, ethical concerns, and environmental impacts. Recommendations advocate for legally binding norms, transparency, and multi-layered data governance models, alongside fostering AI literacy and capacity-building initiatives. Internationally, the report calls for harmonising AI governance frameworks with established laws, human rights standards, and regulatory approaches. The report concludes with actionable principles for fostering responsible AI governance through collaboration among governments, industry, academia, and civil society, ensuring the development of AI aligns with universal human values and the public good.
CY Apr 24, 2024
Integrating LSTM and BERT for Long-Sequence Data Analysis in Intelligent Tutoring Systems
Zhaoxing Li, Jujie Yang, Jindi Wang et al.
The field of Knowledge Tracing aims to understand how students learn and master knowledge over time by analyzing their historical behaviour data. To achieve this goal, many researchers have proposed Knowledge Tracing models that use data from Intelligent Tutoring Systems to predict students' subsequent actions. However, with the development of Intelligent Tutoring Systems, large-scale datasets containing long-sequence data began to emerge. Recent deep learning based Knowledge Tracing models face obstacles such as low efficiency, low accuracy, and low interpretability when dealing with large-scale datasets containing long-sequence data. To address these issues and promote the sustainable development of Intelligent Tutoring Systems, we propose an LSTM- and BERT-based Knowledge Tracing model for long-sequence data processing, namely LBKT, which uses a BERT-based architecture with a Rasch model-based embeddings block to handle information about different difficulty levels and an LSTM block to process the sequential characteristics of students' actions. LBKT achieves the best performance on most benchmark datasets on the metrics of ACC and AUC. Additionally, an ablation study is conducted to analyse the impact of each component on LBKT's overall performance. Moreover, we use t-SNE as the visualisation tool to demonstrate the model's embedding strategy. The results indicate that LBKT is faster, more interpretable, and has a lower memory cost than traditional deep learning based Knowledge Tracing methods.
LG Apr 14, 2025
Improving Controller Generalization with Dimensionless Markov Decision Processes
Valentin Charvet, Sebastian Stein, Roderick Murray-Smith
Controllers trained with Reinforcement Learning tend to be very specialized and thus generalize poorly when their testing environment differs from their training one. We propose a Model-Based approach to increase generalization where both world model and policy are trained in a dimensionless state-action space. To do so, we introduce the Dimensionless Markov Decision Process ($Π$-MDP): an extension of Contextual-MDPs in which state and action spaces are non-dimensionalized with the Buckingham-$Π$ theorem. This procedure induces policies that are equivariant with respect to changes in the context of the underlying dynamics. We provide a generic framework for this approach and apply it to a model-based policy search algorithm using Gaussian Process models. We demonstrate the applicability of our method on simulated actuated pendulum and cartpole systems, where policies trained on a single environment are robust to shifts in the distribution of the context.
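As an illustration of the non-dimensionalization idea described in this abstract, the following is a minimal sketch for an actuated pendulum. The choice of context variables (mass, length, gravity), the scaling by the natural time scale sqrt(l/g) and the torque scale m·g·l, and the hand-written PD policy are all assumptions made for this example; they are not taken from the paper, which uses Gaussian Process models and a policy search algorithm.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PendulumContext:
    """Physical context of one pendulum (hypothetical parameterisation)."""
    mass: float     # kg
    length: float   # m
    gravity: float  # m/s^2


def to_dimensionless(ctx, theta, omega):
    """Map (angle, angular velocity) to dimensionless coordinates.

    The angle is already dimensionless; the angular velocity is scaled
    by the natural time scale sqrt(length / gravity).
    """
    time_scale = math.sqrt(ctx.length / ctx.gravity)
    return theta, omega * time_scale


def torque_from_dimensionless(ctx, u_star):
    """Map a dimensionless action back to a physical torque (N*m)."""
    return u_star * ctx.mass * ctx.gravity * ctx.length


def dimensionless_policy(theta_star, omega_star):
    """Stand-in policy acting purely in dimensionless space (assumed PD law)."""
    return -2.0 * theta_star - 0.5 * omega_star


def act(ctx, theta, omega):
    """Apply the shared dimensionless policy in a given physical context."""
    theta_star, omega_star = to_dimensionless(ctx, theta, omega)
    return torque_from_dimensionless(ctx, dimensionless_policy(theta_star, omega_star))


if __name__ == "__main__":
    small = PendulumContext(mass=1.0, length=1.0, gravity=9.81)
    large = PendulumContext(mass=2.0, length=4.0, gravity=9.81)
    # These two physical states are dynamically similar: they map to the
    # same dimensionless state, so the policy picks the same scaled action.
    print(act(small, 0.1, 1.0), act(large, 0.1, 0.5))
```

Because the policy only ever sees dimensionless quantities, the same policy function is reused across contexts and only the scaling in and out changes, which is one way to read the equivariance property mentioned above.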
IR Jan 20, 2025
TutorLLM: Customizing Learning Recommendations with Knowledge Tracing and Retrieval-Augmented Generation
Zhaoxing Li, Vahid Yazdanpanah, Jindi Wang et al.
The integration of AI in education offers significant potential to enhance learning efficiency. Large Language Models (LLMs), such as ChatGPT, Gemini, and Llama, allow students to query a wide range of topics, providing unprecedented flexibility. However, LLMs face challenges, such as handling varying content relevance and lack of personalization. To address these challenges, we propose TutorLLM, a personalized learning recommender LLM system based on Knowledge Tracing (KT) and Retrieval-Augmented Generation (RAG). The novelty of TutorLLM lies in its unique combination of KT and RAG techniques with LLMs, which enables dynamic retrieval of context-specific knowledge and provides personalized learning recommendations based on the student's personal learning state. Specifically, this integration allows TutorLLM to tailor responses based on individual learning states predicted by the Multi-Features with Latent Relations BERT-based KT (MLFBK) model and to enhance response accuracy with a Scraper model. The evaluation includes user assessment questionnaires and performance metrics, demonstrating a 10% improvement in user satisfaction and a 5% increase in quiz scores compared to using general LLMs alone.
HC Oct 16, 2025
An Active Inference Model of Mouse Point-and-Click Behaviour
Markus Klar, Sebastian Stein, Fraser Paterson et al.
We explore the use of Active Inference (AIF) as a computational user model for spatial pointing, a key problem in Human-Computer Interaction (HCI). We present an AIF agent with continuous state, action, and observation spaces, performing one-dimensional mouse pointing and clicking. We use a simple underlying dynamic system to model the mouse cursor dynamics with realistic perceptual delay. In contrast to previous optimal feedback control-based models, the agent's actions are selected by minimizing Expected Free Energy, solely based on preference distributions over percepts, such as observing clicking a button correctly. Our results show that the agent creates plausible pointing movements and clicks when the cursor is over the target, with similar end-point variance to human users. In contrast to other models of pointing, we incorporate fully probabilistic, predictive delay compensation into the agent. The agent shows distinct behaviour for differing target difficulties without the need to retune system parameters, as done in other approaches. We discuss the simulation results and emphasize the challenges in identifying the correct configuration of an AIF agent interacting with continuous systems.
CL Aug 14, 2025
Reinforced Language Models for Sequential Decision Making
Jim Dilkes, Vahid Yazdanpanah, Sebastian Stein
Large Language Models (LLMs) show potential as sequential decision-making agents, but their application is often limited due to a reliance on large, computationally expensive models. This creates a need to improve smaller models, yet existing post-training methods are designed for single-turn interactions and cannot handle credit assignment in multi-step agentic tasks. To address this, we introduce Multi-Step Group-Relative Policy Optimization (MS-GRPO), a new algorithm for post-training LLM agents, grounded in formal Text-Mediated Stochastic Game (TMSG) and Language-Agent Policy (LAP) frameworks. For credit assignment, MS-GRPO attributes the entire cumulative episode reward to each individual episode step. We supplement this algorithm with a novel absolute-advantage-weighted episode sampling strategy that we show improves training performance. We evaluate our approach by post-training a 3-billion parameter model on Snake and Frozen Lake. Our experiments demonstrate that the method is effective in improving decision-making performance: our post-trained 3B parameter model outperforms a 72B parameter baseline by 50% on the Frozen Lake task. This work demonstrates that targeted post-training is a practical and efficient alternative to relying on model scale for creating sequential decision-making agents using LLMs.
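The credit-assignment rule described in this abstract (the whole cumulative episode reward is attributed to every step of that episode) can be sketched as follows. The group-relative normalisation by the mean and standard deviation of returns follows the usual GRPO recipe and is an assumption here, as are the function name and the `eps` parameter; the abstract does not specify these details.

```python
from statistics import mean, pstdev


def group_relative_step_advantages(episode_rewards, eps=1e-8):
    """Assign one group-relative advantage to every step of each episode.

    episode_rewards: list of episodes, each a list of per-step rewards.
    Returns a list with the same shape, where every step of an episode
    carries that episode's normalised cumulative reward.
    """
    # Cumulative (undiscounted) reward of each episode in the group.
    returns = [sum(rewards) for rewards in episode_rewards]
    mu = mean(returns)
    sigma = pstdev(returns)
    # Normalise relative to the group (assumed, as in standard GRPO).
    advantages = [(ret - mu) / (sigma + eps) for ret in returns]
    # Broadcast the episode-level advantage to each of its steps.
    return [[adv] * len(rewards) for adv, rewards in zip(advantages, episode_rewards)]


if __name__ == "__main__":
    group = [[0.0, 0.0, 1.0], [0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
    for steps in group_relative_step_advantages(group):
        print(steps)
```

Every step in an episode receives the same value, so episodes of different lengths contribute different numbers of identically weighted steps to the policy update.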
HC Mar 16, 2025
PTFA: An LLM-based Agent that Facilitates Online Consensus Building through Parallel Thinking
Wen Gu, Zhaoxing Li, Jan Buermann et al.
Consensus building is inherently challenging due to the diverse opinions held by stakeholders. Effective facilitation is crucial to support the consensus building process and enable efficient group decision making. However, the effectiveness of facilitation is often constrained by human factors such as limited experience and scalability. In this research, we propose a Parallel Thinking-based Facilitation Agent (PTFA) that facilitates online, text-based consensus building processes. The PTFA automatically collects real-time textual input and leverages large language models (LLMs) to perform all six distinct roles of the well-established Six Thinking Hats technique in parallel thinking. To illustrate the potential of the agent, a pilot study was conducted, demonstrating its capabilities in idea generation, emotional probing, and deeper analysis of idea quality. Additionally, open research challenges for future work, such as optimizing scheduling and managing behaviors in the divergent phase, are identified. Furthermore, a comprehensive dataset that contains not only the conversational content among the participants but also that between the participants and the agent is constructed for future study.
HC Jan 10, 2022
Does Interacting Help Users Better Understand the Structure of Probabilistic Models?
Evdoxia Taka, Sebastian Stein, John H. Williamson
Despite growing interest in probabilistic modeling approaches and the availability of learning tools, people with little or no statistical background feel hesitant to use them. There is a need for tools that communicate probabilistic models to less experienced users more intuitively, to help them build, validate, use effectively, or trust probabilistic models. Users' comprehension of probabilistic models is vital in these cases, and interactive visualizations could enhance it. Although there are various studies evaluating interactivity in Bayesian reasoning and available tools for visualizing sample-based distributions, we focus specifically on evaluating the effect of interaction on users' comprehension of probabilistic models' structure. We conducted a user study based on our Interactive Pair Plot for visualizing models' distribution and conditioning the sample space graphically. Our results suggest that improvements in the understanding of the interaction group, in comparison to the static group, are most pronounced for more exotic structures, such as hierarchical models or unfamiliar parameterizations. As the detail of the inferred information increases, interaction does not lead to considerably longer response times. Finally, interaction improves users' confidence.
HC Jan 10, 2022
Evaluating Bayesian Model Visualisations
Sebastian Stein, John H. Williamson
Probabilistic models inform an increasingly broad range of business and policy decisions ultimately made by people. Recent progress in algorithms, computation, and software frameworks facilitates the proliferation of Bayesian probabilistic models, which characterise unobserved parameters by their joint distribution instead of point estimates. While they can, in theory, empower decision makers to explore complex queries and to perform what-if-style conditioning, suitable visualisations and interactive tools are needed to maximise users' comprehension and rational decision making under uncertainty. In this paper, we propose a protocol for quantitative evaluation of Bayesian model visualisations and introduce a software framework implementing this protocol to support standardisation in evaluation practice and facilitate reproducibility. We illustrate the evaluation and analysis workflow on a user study that explores whether making Boxplots and Hypothetical Outcome Plots interactive can increase comprehension or rationality, and conclude with design guidelines for researchers looking to conduct similar studies in the future.
CR May 15, 2019
Selfish Mining in Proof-of-Work Blockchain with Multiple Miners: An Empirical Evaluation
Tin Leelavimolsilp, Long Tran-Thanh, Sebastian Stein et al.
Proof-of-Work blockchain, despite its numerous benefits, is still not an entirely secure technology due to the existence of Selfish Mining (SM) strategies that can disrupt the system and its mining economy. While the effect of SM has been studied mostly in a two-miner scenario, it has not been investigated in a more practical context where there are multiple malicious miners individually performing SM. To fill this gap, we carry out an empirical study that separately accounts for different numbers of SM miners (who always perform SM) and strategic miners (who choose either SM or Nakamoto's mining protocol depending on which maximises their individual mining reward). Our results show that SM is generally more effective as the number of SM miners increases; however, its effectiveness does not vary in the presence of a large number of strategic miners. Under specific mining power distributions, we also demonstrate that multiple miners can perform SM and simultaneously gain higher mining rewards than they should. Surprisingly, we also show that the more strategic miners there are, the more robust the system becomes. Since blockchain miners should naturally be seen as self-interested strategic miners, our findings encourage blockchain system developers and engineers to attract as many miners as possible to prevent SM and similar behaviour.
LG Mar 2, 2019
neuralRank: Searching and ranking ANN-based model repositories
Nirmit Desai, Linsong Chu, Raghu K. Ganti et al.
Widespread applications of deep learning have led to a plethora of pre-trained neural network models for common tasks. Such models are often adapted from other models via transfer learning. The models may have varying training sets, training algorithms, network architectures, and hyper-parameters. For a given application, what is the most suitable model in a model repository? This is a critical question for practical deployments, but it has not received much attention. This paper introduces the novel problem of searching and ranking models based on suitability relative to a target dataset and proposes a ranking algorithm called neuralRank. The key idea behind this algorithm is to base model suitability on the discriminating power of a model, using a novel metric to measure it. With experimental results on the MNIST, Fashion, and CIFAR10 datasets, we demonstrate that (1) neuralRank is independent of the domain, the training set, and the network architecture and (2) the models ranked highly by neuralRank tend to have higher model accuracy in practice.
MA Feb 6, 2018
On the Preliminary Investigation of Selfish Mining Strategy with Multiple Selfish Miners
Tin Leelavimolsilp, Long Tran-Thanh, Sebastian Stein
Eyal and Sirer's selfish mining strategy has demonstrated that the Bitcoin system is not secure even if 50% of total mining power is held by altruistic miners. Since then, researchers have been investigating either how to improve the efficiency of selfish mining or how to defend against it, typically in a single selfish miner setting. Yet there is no research on selfish mining strategies used concurrently by multiple miners in the system. The effectiveness of such selfish mining strategies and their required mining power under such a multiple selfish miner setting remain unknown. In this paper, a preliminary investigation of the selfish mining strategy used by multiple miners and our findings are reported. In addition, the conventional model of the Bitcoin system is slightly redesigned to tackle its shortcoming: namely, the concurrency of individual mining processes. Although a theoretical analysis of the selfish mining strategy under this setting is yet to be established, the current findings based on simulations are promising and of great interest. In particular, our work shows that a lower bound on the power threshold required for the selfish mining strategy decreases in proportion to the number of selfish miners. Moreover, there exist Nash equilibria where all selfish miners in the system do not change to an honest mining strategy and simultaneously earn their unfair amount of mining reward, given that they equally possess sufficiently large mining power. Lastly, our new model yields a power threshold for mounting the selfish mining strategy slightly greater than that from the conventional model.
HC Sep 5, 2016
Incentive Engineering Framework for Crowdsourcing Systems
Nhat V. Q. Truong, Sebastian Stein, Long Tran-Thanh et al.
Significant effort has been made to understand user motivation and to elicit user participation in crowdsourcing systems. However, incentive engineering, i.e., designing incentives that can purposefully motivate users, is still an open question and remains one of the key challenges of crowdsourcing initiatives. In this work in progress, we propose a general and systematic incentive engineering framework that system designers can use to implement appropriate incentives in order to effect desirable user behaviours.