LG · Jan 28
Adapting the Behavior of Reinforcement Learning Agents to Changing Action Spaces and Reward Functions
Raul de la Rosa, Ivana Dusparic, Nicolas Cardozo
Reinforcement Learning (RL) agents often struggle in real-world applications where environmental conditions are non-stationary, particularly when reward functions shift or the available action space expands. This paper introduces MORPHIN, a self-adaptive Q-learning framework that enables on-the-fly adaptation without full retraining. By integrating concept drift detection with dynamic adjustments to learning and exploration hyperparameters, MORPHIN adapts agents both to changes in the reward function and to on-the-fly expansions of the action space, while preserving prior policy knowledge to prevent catastrophic forgetting. We validate our approach using a Gridworld benchmark and a traffic signal control simulation. The results demonstrate that MORPHIN converges faster and adapts continuously compared to a standard Q-learning baseline, improving learning efficiency by up to 1.7x.
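The combination this abstract names (tabular Q-learning, drift detection on the reward stream, hyperparameter adjustment, and action-space expansion that keeps learned values) can be illustrated with a minimal Python sketch. This is not the authors' MORPHIN implementation: the choice of a Page-Hinkley test as the drift detector, the class names `PageHinkley` and `AdaptiveQAgent`, and all default parameter values are assumptions made here for illustration only.

```python
import random
from collections import defaultdict


class PageHinkley:
    """Page-Hinkley test that flags a sustained drop in a stream's mean.

    Assumption: the abstract does not say which drift detector is used.
    """

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated deviation from the running mean
        self.threshold = threshold  # alarm level
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_max = 0.0

    def update(self, x):
        """Feed one observation; return True if a drop is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean + self.delta
        self.cum_max = max(self.cum_max, self.cum)
        return self.cum_max - self.cum > self.threshold


class AdaptiveQAgent:
    """Hypothetical tabular Q-learner that re-explores when rewards drift."""

    def __init__(self, n_actions, alpha_min=0.1, alpha_boost=0.5,
                 eps_min=0.05, eps_boost=0.5, gamma=0.9, decay=0.99):
        self.n_actions = n_actions
        self.alpha_min, self.alpha_boost = alpha_min, alpha_boost
        self.eps_min, self.eps_boost = eps_min, eps_boost
        self.gamma, self.decay = gamma, decay
        self.alpha, self.eps = alpha_min, eps_boost
        self.detector = PageHinkley()
        # Rows are created lazily with the current number of actions.
        self.q = defaultdict(lambda: [0.0] * self.n_actions)

    def act(self, state):
        """Epsilon-greedy action selection."""
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, s, a, r, s_next):
        """One Q-learning step; returns True if reward drift was flagged."""
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        drifted = self.detector.update(r)
        if drifted:
            # Raise learning rate and exploration, keep the Q-table as is.
            self.alpha, self.eps = self.alpha_boost, self.eps_boost
            self.detector.reset()
        else:
            self.alpha = max(self.alpha_min, self.alpha * self.decay)
            self.eps = max(self.eps_min, self.eps * self.decay)
        return drifted

    def expand_actions(self, k):
        """Add k new actions without discarding existing Q-values."""
        for row in self.q.values():
            row.extend([0.0] * k)
        self.n_actions += k
        self.eps = self.eps_boost  # explore so the new actions get tried
```

In this sketch, prior knowledge survives both kinds of change because the Q-table is never reset: a detected drift only raises the learning rate and exploration rate temporarily, and `expand_actions` appends zero-initialized columns next to the existing values.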
5.1 · LO · Apr 21
Equational and Inductive Reasoning for Maude in Athena
Mateo Sanabria, Carlos Varela, Camilo Rocha et al.
In the rewriting logic framework, equational specifications are used to define deterministic functional behavior, abstract data types, and canonical representations of data. These specifications include a (possibly order-sorted) signature and equations interpreted modulo structural axioms, such as associativity, commutativity, and identity. While equational rewriting provides a powerful basis for execution and symbolic reasoning, it does not by itself offer native support for inductive or deductive reasoning. This paper presents maude2athena, a framework that systematically translates Maude's equational theories into Athena, a theorem-proving language designed to support natural deduction proofs over many-sorted first-order logic specifications, including inductive reasoning, equational chaining, case-based reasoning, and proofs by contradiction. The translation supports induction-based reasoning modulo structural axioms with parametric induction rules; it faithfully encodes membership equational logic in a many-sorted setting without exponential blowup under reasonable conditions. This approach preserves the semantics of the original specification, while ensuring that the translation remains compact and amenable to deductive reasoning. This work helps bridge the gap between model checking and theorem proving, enabling formal verification efforts that can benefit from both of these approaches.
AI · Mar 11, 2021
Adaptation to Unknown Situations as the Holy Grail of Learning-Based Self-Adaptive Systems: Research Directions
Ivana Dusparic, Nicolas Cardozo
Self-adaptive systems continuously adapt to changes in their execution environment. Capturing all possible changes to define suitable behaviour beforehand is infeasible, or even impossible in the case of unknown changes, hence human intervention may be required. We argue that adapting to unknown situations is the ultimate challenge for self-adaptive systems. Learning-based approaches are used to learn the suitable behaviour to exhibit in unknown situations, to minimize or fully remove human intervention. While such approaches can, to a certain extent, generalize existing adaptations to new situations, a number of breakthroughs are needed before systems can adapt to general unknown and unforeseen situations. We posit the research directions that need to be explored to achieve unanticipated adaptation from the perspective of learning-based self-adaptive systems. At a minimum, systems need to define internal representations of previously unseen situations on the fly, extrapolate the relationship to previously encountered situations to evolve existing adaptations, and reason about the feasibility of achieving their intrinsic goals under the new set of conditions. We close by discussing whether, even when we can, we should indeed build systems that define their own behaviour and adapt their goals without involving a human supervisor.