Matthias Scheutz

h-index: 52

31 papers

3,008 citations

Novelty: 40%

AI Score: 56

Ranked #6,883 of 194,257 authors (top 4%) · #198 in AI (top 2%)

31 Papers

16.7 · AI · Jun 24, 2022 · Code

RAPid-Learn: A Framework for Learning to Recover for Handling Novelties in Open-World Environments

Shivam Goel, Yash Shukla, Vasanth Sarathy et al.

We propose RAPid-Learn: Learning to Recover and Plan Again, a hybrid planning and learning method, to tackle the problem of adapting to sudden and unexpected changes in an agent's environment (i.e., novelties). RAPid-Learn is designed to formulate and solve modifications to a task's Markov Decision Process (MDP) on-the-fly and is capable of exploiting domain knowledge to learn any new dynamics caused by the environmental changes. It is capable of exploiting the domain knowledge to learn action executors which can be further used to resolve execution impasses, leading to a successful plan execution. This novelty information is reflected in its updated domain model. We demonstrate its efficacy by introducing a wide variety of novelties in a gridworld environment inspired by Minecraft, and compare our algorithm with transfer learning baselines from the literature. Our method is (1) effective even in the presence of multiple novelties, (2) more sample efficient than transfer learning RL baselines, and (3) robust to incomplete model information, as opposed to pure symbolic planning approaches.

2.1 · AI · Feb 28, 2023

Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

Tung Thai, Ming Shen, Mayank Garg et al. · amazon-science

Learning to detect, characterize and accommodate novelties is a challenge that agents operating in open-world domains need to address to be able to guarantee satisfactory task performance. Certain novelties (e.g., changes in environment dynamics) can interfere with the performance or prevent agents from accomplishing task goals altogether. In this paper, we introduce general methods and architectural mechanisms for detecting and characterizing different types of novelties, and for building an appropriate adaptive model to accommodate them utilizing logical representations and reasoning methods. We demonstrate the effectiveness of the proposed methods in evaluations performed by a third party in the adversarial multi-agent board game Monopoly. The results show high novelty detection and accommodation rates across a variety of novelty types, including changes to the rules of the game, as well as changes to the agent's action capabilities.

4.6 · LG · Aug 1, 2022 · Code

Joint covariate-alignment and concept-alignment: a framework for domain generalization

Thuan Nguyen, Boyang Lyu, Prakash Ishwar et al.

In this paper, we propose a novel domain generalization (DG) framework based on a new upper bound to the risk on the unseen domain. In particular, our framework proposes to jointly minimize both the covariate shift and the concept shift between the seen domains for better performance on the unseen domain. While the proposed approach can be implemented via an arbitrary combination of covariate-alignment and concept-alignment modules, in this work we use well-established approaches for distributional alignment, namely Maximum Mean Discrepancy (MMD) and Covariance Alignment (CORAL), and use an Invariant Risk Minimization (IRM)-based approach for concept alignment. Our numerical results show that the proposed methods perform as well as or better than the state-of-the-art for domain generalization on several data sets.
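To make the distributional-alignment idea above concrete, here is a minimal sketch of a biased (V-statistic) estimate of squared MMD with a Gaussian kernel, written in plain Python. It is only an illustration of the MMD quantity itself, not the paper's training objective; the function names, the `gamma` bandwidth, and the toy feature values are all assumptions introduced for this example.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of features."""
    return (mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))

# Toy 2-D features from two hypothetical "seen domains" (made-up numbers).
domain_a = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
domain_b = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]

same = mmd_squared(domain_a, domain_a)      # identical samples
shifted = mmd_squared(domain_a, domain_b)   # shifted samples

print(f"MMD^2(a, a) = {same:.4f}")
print(f"MMD^2(a, b) = {shifted:.4f}")
```

A covariate-alignment module would typically add a term like `mmd_squared` to the training loss so that features from different seen domains are pushed toward a common distribution; how that term is weighted is a design choice not specified in the abstract.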

5.8 · RO · Apr 4

Build on Priors: Vision-Language-Guided Neuro-Symbolic Imitation Learning for Data-Efficient Real-World Robot Manipulation

Pierrick Lorang, Johannes Huemer, Timothy Duggan et al.

Enabling robots to learn long-horizon manipulation tasks from a handful of demonstrations remains a central challenge in robotics. Existing neuro-symbolic approaches often rely on hand-crafted symbolic abstractions, semantically labeled trajectories or large demonstration datasets, limiting their scalability and real-world applicability. We present a scalable neuro-symbolic framework that autonomously constructs symbolic planning domains and data-efficient control policies from as few as one to thirty unannotated skill demonstrations, without requiring manual domain engineering. Our method segments demonstrations into skills and employs a Vision-Language Model (VLM) to classify skills and identify equivalent high-level states, enabling automatic construction of a state-transition graph. This graph is processed by an Answer Set Programming solver to synthesize a PDDL planning domain, which an oracle function exploits to isolate the minimal, task-relevant and target-relative observation and action spaces for each skill policy. Policies are learned at the control reference level rather than at the raw actuator signal level, yielding a smoother and less noisy learning target. Known controllers can be leveraged for real-world data augmentation by projecting a single demonstration onto other objects in the scene, simultaneously enriching the graph construction process and the dataset for imitation learning. We validate our framework primarily on a real industrial forklift across statistically rigorous manipulation trials, and demonstrate cross-platform generality on a Kinova Gen3 robotic arm across two standard benchmarks. Our results show that grounding control learning, VLM-driven abstraction, and automated planning synthesis into a unified pipeline constitutes a practical path toward scalable, data-efficient, expert-free and interpretable neuro-symbolic robotics.

3.8 · LG · Apr 2, 2023 · Code

A principled approach to model validation in domain generalization

Boyang Lyu, Thuan Nguyen, Matthias Scheutz et al.

Domain generalization aims to learn a model with good generalization ability, that is, the learned model should not only perform well on several seen domains but also on unseen domains with different data distributions. State-of-the-art domain generalization methods typically train a representation function followed by a classifier jointly to minimize both the classification risk and the domain discrepancy. However, when it comes to model selection, most of these methods rely on traditional validation routines that select models solely based on the lowest classification risk on the validation set. In this paper, we theoretically demonstrate a trade-off between minimizing classification risk and mitigating domain discrepancy, i.e., it is impossible to achieve the minimum of these two objectives simultaneously. Motivated by this theoretical result, we propose a novel model selection method suggesting that the validation process should account for both the classification risk and the domain discrepancy. We validate the effectiveness of the proposed method by numerical results on several domain generalization datasets.
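The selection idea described above can be sketched as follows. The abstract does not state how the two quantities are combined, so the weighted sum used here, the `weight` parameter, the `Checkpoint` type, and the numbers are all assumptions made purely for illustration, not the authors' method.

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    name: str
    val_risk: float            # classification risk on the validation set
    domain_discrepancy: float  # some estimate of discrepancy between seen domains

def selection_score(ckpt, weight=1.0):
    """Hypothetical criterion: validation risk plus weighted domain discrepancy."""
    return ckpt.val_risk + weight * ckpt.domain_discrepancy

def select(checkpoints, weight=1.0):
    """Return the checkpoint with the lowest combined score."""
    return min(checkpoints, key=lambda c: selection_score(c, weight))

# Made-up candidate models saved during training.
candidates = [
    Checkpoint("epoch_10", val_risk=0.20, domain_discrepancy=0.30),
    Checkpoint("epoch_20", val_risk=0.12, domain_discrepancy=0.55),
    Checkpoint("epoch_30", val_risk=0.15, domain_discrepancy=0.25),
]

risk_only = select(candidates, weight=0.0)  # traditional validation routine
combined = select(candidates, weight=1.0)   # accounts for both quantities

print("risk only:", risk_only.name)
print("combined: ", combined.name)
```

With `weight=0.0` the rule reduces to the traditional routine that picks the lowest validation risk; a positive weight can change which model is chosen when the lowest-risk model also has a large domain discrepancy.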

4.6 · LG · Oct 26, 2022 · Code

Trade-off between reconstruction loss and feature alignment for domain generalization

Thuan Nguyen, Boyang Lyu, Prakash Ishwar et al.

Domain generalization (DG) is a branch of transfer learning that aims to train the learning models on several seen domains and subsequently apply these pre-trained models to other unseen (unknown but related) domains. To deal with challenging settings in DG where both data and label of the unseen domain are not available at training time, the most common approach is to design the classifiers based on the domain-invariant representation features, i.e., the latent representations that are unchanged and transferable between domains. Contrary to popular belief, we show that designing classifiers based on invariant representation features alone is necessary but insufficient in DG. Our analysis indicates the necessity of imposing a constraint on the reconstruction loss induced by representation functions to preserve most of the relevant information about the label in the latent space. More importantly, we point out the trade-off between minimizing the reconstruction loss and achieving domain alignment in DG. Our theoretical results motivate a new DG framework that jointly optimizes the reconstruction loss and the domain discrepancy. Both theoretical and numerical results are provided to justify our approach.

1.2 · SY · Nov 18, 2017

Norm Conflict Resolution in Stochastic Domains

Daniel Kasenberg, Matthias Scheutz

Artificial agents will need to be aware of human moral and social norms, and able to use them in decision-making. In particular, artificial agents will need a principled approach to managing conflicting norms, which are common in human social interactions. Existing logic-based approaches suffer from normative explosion and are typically designed for deterministic environments; reward-based approaches lack principled ways of determining which normative alternatives exist in a given environment. We propose a hybrid approach, using Linear Temporal Logic (LTL) representations in Markov Decision Processes (MDPs), that manages norm conflicts in a systematic manner while accommodating domain stochasticity. We provide a proof-of-concept implementation in a simulated vacuum cleaning domain.

10.5 · RO · Mar 11

Novelty Adaptation Through Hybrid Large Language Model (LLM)-Symbolic Planning and LLM-guided Reinforcement Learning

Hong Lu, Pierrick Lorang, Timothy R. Duggan et al.

In dynamic open-world environments, autonomous agents often encounter novelties that hinder their ability to find plans to achieve their goals. Specifically, traditional symbolic planners fail to generate plans when the robot's planning domain lacks the operators that enable it to interact appropriately with novel objects in the environment. We propose a neuro-symbolic architecture that integrates symbolic planning, reinforcement learning, and a large language model (LLM) to learn how to handle novel objects. In particular, we leverage the common sense reasoning capability of the LLM to identify missing operators, generate plans with the symbolic AI planner, and write reward functions to guide the reinforcement learning agent in learning control policies for newly identified operators. Our method outperforms the state-of-the-art methods in operator discovery as well as operator learning in continuous robotic domains.

3.7 · CV · Jun 23, 2022 · Code

NovelCraft: A Dataset for Novelty Detection and Discovery in Open Worlds

Patrick Feeney, Sarah Schneider, Panagiotis Lymperopoulos et al.

In order for artificial agents to successfully perform tasks in changing environments, they must be able to both detect and adapt to novelty. However, visual novelty detection research often only evaluates on repurposed datasets such as CIFAR-10 originally intended for object classification, where images focus on one distinct, well-centered object. New benchmarks are needed to represent the challenges of navigating the complex scenes of an open world. Our new NovelCraft dataset contains multimodal episodic data of the images and symbolic world-states seen by an agent completing a pogo stick assembly task within a modified Minecraft environment. In some episodes, we insert novel objects of varying size within the complex 3D scene that may impact gameplay. Our visual novelty detection benchmark finds that methods that rank best on popular area-under-the-curve metrics may be outperformed by simpler alternatives when controlling false positives matters most. Further multimodal novelty detection experiments suggest that methods that fuse both visual and symbolic information can improve time until detection as well as overall discrimination. Finally, our evaluation of recent generalized category discovery methods suggests that adapting to new imbalanced categories in complex scenes remains an exciting open problem.

4.9 · CL · Nov 6, 2025

IntelliProof: An Argumentation Network-based Conversational Helper for Organized Reflection

Kaveh Eskandari Miandoab, Katharine Kowalyshyn, Kabir Pamnani et al.

We present IntelliProof, an interactive system for analyzing argumentative essays through LLMs. IntelliProof structures an essay as an argumentation graph, where claims are represented as nodes, supporting evidence is attached as node properties, and edges encode supporting or attacking relations. Unlike existing automated essay scoring systems, IntelliProof emphasizes the user experience: each relation is initially classified and scored by an LLM, then visualized for enhanced understanding. The system provides justifications for classifications and produces quantitative measures for essay coherence. It enables rapid exploration of argumentative quality while retaining human oversight. In addition, IntelliProof provides a set of tools for a better understanding of an argumentative essay and its corresponding graph in natural language, bridging the gap between the structural semantics of argumentative essays and the user's understanding of a given text.
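The graph structure described above (claims as nodes, evidence as node properties, scored support or attack edges) can be sketched as a small data structure. This is not IntelliProof's actual code or API; the class names, the score range, and the `net_support` measure are hypothetical and exist only to illustrate the representation.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    evidence: list = field(default_factory=list)  # evidence stored on the node

@dataclass
class Relation:
    source: str   # id of the claim making the point
    target: str   # id of the claim being supported or attacked
    kind: str     # "support" or "attack"
    score: float  # strength in [0, 1], e.g. as assigned by an LLM

class ArgumentGraph:
    def __init__(self):
        self.claims = {}
        self.relations = []

    def add_claim(self, claim_id, text, evidence=()):
        self.claims[claim_id] = Claim(text, list(evidence))

    def add_relation(self, source, target, kind, score):
        if kind not in ("support", "attack"):
            raise ValueError(f"unknown relation kind: {kind}")
        if source not in self.claims or target not in self.claims:
            raise KeyError("both endpoints must be existing claims")
        self.relations.append(Relation(source, target, kind, score))

    def net_support(self, claim_id):
        """Hypothetical measure: incoming support scores minus attack scores."""
        total = 0.0
        for rel in self.relations:
            if rel.target == claim_id:
                total += rel.score if rel.kind == "support" else -rel.score
        return total

graph = ArgumentGraph()
graph.add_claim("thesis", "Cities should expand bike lanes.")
graph.add_claim("c1", "Bike lanes reduce traffic injuries.", ["(placeholder study)"])
graph.add_claim("c2", "Bike lanes remove parking spaces.")
graph.add_relation("c1", "thesis", "support", 0.75)
graph.add_relation("c2", "thesis", "attack", 0.25)

print(graph.net_support("thesis"))
```

Keeping evidence on the node and polarity on the edge mirrors the description in the abstract; any quantitative coherence measure computed over such a graph would be a separate design decision.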

13.5 · RO · Feb 6, 2025

Probing a Vision-Language-Action Model for Symbolic States and Integration into a Cognitive Architecture

Hong Lu, Hengxu Li, Prithviraj Singh Shahani et al.

Vision-language-action (VLA) models hold promise as generalist robotics solutions by translating visual and linguistic inputs into robot actions, yet they lack reliability due to their black-box nature and sensitivity to environmental changes. In contrast, cognitive architectures (CA) excel in symbolic reasoning and state monitoring but are constrained by rigid predefined execution. This work bridges these approaches by probing OpenVLA's hidden layers to uncover symbolic representations of object properties, relations, and action states, enabling integration with a CA for enhanced interpretability and robustness. Through experiments on LIBERO-spatial pick-and-place tasks, we analyze the encoding of symbolic states across different layers of OpenVLA's Llama backbone. Our probing results show consistently high accuracies (> 0.90) for both object and action states across most layers, though contrary to our hypotheses, we did not observe the expected pattern of object states being encoded earlier than action states. We demonstrate an integrated DIARC-OpenVLA system that leverages these symbolic representations for real-time state monitoring, laying the foundation for more interpretable and reliable robotic manipulation.

7.3 · AI · Jan 7, 2024

NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds

Shivam Goel, Yichen Wei, Panagiotis Lymperopoulos et al.

As AI agents leave the lab and venture into the real world as autonomous vehicles, delivery robots, and cooking robots, it is increasingly necessary to design and comprehensively evaluate algorithms that tackle the "open-world". To this end, we introduce NovelGym, a flexible and adaptable ecosystem designed to simulate gridworld environments, serving as a robust platform for benchmarking reinforcement learning (RL) and hybrid planning and learning agents in open-world contexts. The modular architecture of NovelGym facilitates rapid creation and modification of task environments, including multi-agent scenarios, with multiple environment transformations, thus providing a dynamic testbed for researchers to develop open-world AI agents.

6.7 · CL · Sep 2, 2025

LLMs and their Limited Theory of Mind: Evaluating Mental State Annotations in Situated Dialogue

Katharine Kowalyshyn, Matthias Scheutz

What if large language models could not only infer human mindsets but also expose every blind spot in team dialogue such as discrepancies in the team members' joint understanding? We present a novel, two-step framework that leverages large language models (LLMs) both as human-style annotators of team dialogues to track the team's shared mental models (SMMs) and as automated discrepancy detectors among individuals' mental states. In the first step, an LLM generates annotations by identifying SMM elements within task-oriented dialogues from the Cooperative Remote Search Task (CReST) corpus. Then, a secondary LLM compares these LLM-derived annotations and human annotations against gold-standard labels to detect and characterize divergences. We define an SMM coherence evaluation framework for this use case and apply it to six CReST dialogues, ultimately producing: (1) a dataset of human and LLM annotations; (2) a reproducible evaluation framework for SMM coherence; and (3) an empirical assessment of LLM-based discrepancy detection. Our results reveal that, although LLMs exhibit apparent coherence on straightforward natural-language annotation tasks, they systematically err in scenarios requiring spatial reasoning or disambiguation of prosodic cues.

1.2 · CY · Aug 14, 2025

Are AI Machines Making Humans Obsolete?

Matthias Scheutz

This chapter starts with a sketch of how we got to "generative AI" (GenAI) and a brief summary of the various impacts it has had so far. It then discusses some of the opportunities of GenAI, followed by the challenges and dangers, including dystopian outcomes resulting from using uncontrolled machine learning and our failures to understand the results. It concludes with some suggestions for how to control GenAI and address its dangers.

6.9 · LG · Jan 25, 2022 · Code

Conditional entropy minimization principle for learning domain invariant representation features

Thuan Nguyen, Boyang Lyu, Prakash Ishwar et al.

Invariance-principle-based methods such as Invariant Risk Minimization (IRM) have recently emerged as promising approaches for Domain Generalization (DG). Despite promising theory, such approaches fail in common classification tasks due to the mixing of true invariant features and spurious invariant features. To address this, we propose a framework based on the conditional entropy minimization (CEM) principle to filter out the spurious invariant features, leading to a new algorithm with a better generalization capability. We show that our proposed approach is closely related to the well-known Information Bottleneck (IB) framework and prove that under certain assumptions, entropy minimization can exactly recover the true invariant features. Our approach provides competitive classification accuracy compared to recent theoretically-principled state-of-the-art alternatives across several DG datasets.

1.0 · CL · Oct 12, 2021

Decision-Theoretic Question Generation for Situated Reference Resolution: An Empirical Study and Computational Model

Felix Gervits, Gordon Briggs, Antonio Roque et al.

Dialogue agents that interact with humans in situated environments need to manage referential ambiguity across multiple modalities and ask for help as needed. However, it is not clear what kinds of questions such agents should ask nor how the answers to such questions can be used to resolve ambiguity. To address this, we analyzed dialogue data from an interactive study in which participants controlled a virtual robot tasked with organizing a set of tools while engaging in dialogue with a live, remote experimenter. We discovered a number of novel results, including the distribution of question types used to resolve ambiguity and the influence of dialogue-level factors on the reference resolution process. Based on these empirical findings we: (1) developed a computational model for clarification requests using a decision network with an entropy-based utility assignment method that operates across modalities, (2) evaluated the model, showing that it outperforms a slot-filling baseline in environments of varying ambiguity, and (3) interpreted the results to offer insight into the ways that agents can ask questions to facilitate situated reference resolution.
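As a rough illustration of entropy-driven clarification questions, the sketch below picks the attribute whose answer is expected to reduce uncertainty over candidate referents the most. It is a simplified information-gain heuristic, not the paper's decision network or its utility assignment; the function names, the assumption of truthful deterministic answers, and the toy tool set are all introduced here for the example.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_entropy_after(belief, objects, attribute):
    """Expected entropy over referents after learning the target's attribute value.

    belief:  dict mapping object id -> probability of being the referent
    objects: dict mapping object id -> dict of attribute -> value
    Assumes the answer is truthful and fully determines the attribute value.
    """
    groups = defaultdict(dict)
    for obj, p in belief.items():
        groups[objects[obj][attribute]][obj] = p
    expected = 0.0
    for members in groups.values():
        mass = sum(members.values())  # probability of hearing this answer
        if mass == 0:
            continue
        posterior = [p / mass for p in members.values()]
        expected += mass * entropy(posterior)
    return expected

def best_question(belief, objects, attributes):
    """Return the attribute with the largest expected entropy reduction."""
    prior = entropy(belief.values())
    gains = {a: prior - expected_entropy_after(belief, objects, a)
             for a in attributes}
    return max(gains, key=gains.get), gains

# Four candidate tools with a uniform belief (made-up scene).
tools = {
    "t1": {"color": "red", "size": "small"},
    "t2": {"color": "red", "size": "small"},
    "t3": {"color": "red", "size": "large"},
    "t4": {"color": "blue", "size": "large"},
}
belief = {t: 0.25 for t in tools}

question, gains = best_question(belief, tools, ["color", "size"])
print("ask about:", question)
```

In this toy scene, asking about size splits the candidates evenly, while asking about color usually leaves three candidates, so the size question has the higher expected gain.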

8.4 · LG · Sep 4, 2021 · Code

Barycentric-alignment and reconstruction loss minimization for domain generalization

Boyang Lyu, Thuan Nguyen, Prakash Ishwar et al.

This paper advances the theory and practice of Domain Generalization (DG) in machine learning. We consider the typical DG setting where the hypothesis is composed of a representation mapping followed by a labeling function. Within this setting, the majority of popular DG methods aim to jointly learn the representation and the labeling functions by minimizing a well-known upper bound for the classification risk in the unseen domain. In practice, however, methods based on this theoretical upper bound ignore a term that cannot be directly optimized due to its dual dependence on both the representation mapping and the unknown optimal labeling function in the unseen domain. To bridge this gap between theory and practice, we introduce a new upper bound that is free of terms having such dual dependence, resulting in a fully optimizable risk upper bound for the unseen domain. Our derivation leverages classical and recent transport inequalities that link optimal transport metrics with information-theoretic measures. Compared to previous bounds, our bound introduces two new terms: (i) the Wasserstein-2 barycenter term that aligns distributions between domains, and (ii) the reconstruction loss term that assesses the quality of representation in reconstructing the original data. Based on this new upper bound, we propose a novel DG algorithm named Wasserstein Barycenter Auto-Encoder (WBAE) that simultaneously minimizes the classification loss, the barycenter loss, and the reconstruction loss. Numerical results demonstrate that the proposed method outperforms current state-of-the-art DG algorithms on several datasets.

2.4 · AI · Jul 9, 2021

Integrating Planning, Execution and Monitoring in the presence of Open World Novelties: Case Study of an Open World Monopoly Solver

Sriram Gopalakrishnan, Utkarsh Soni, Tung Thai et al.

The game of Monopoly is an adversarial multi-agent domain where there is no fixed goal other than to be the last player solvent. There are useful subgoals like monopolizing sets of properties, and developing them. There is also a lot of randomness from dice rolls, card-draws, and adversaries' strategies. This unpredictability is made worse when unknown novelties are added during gameplay. Given these challenges, Monopoly was one of the test beds chosen for the DARPA-SAILON program, which aims to create agents that can detect and accommodate novelties. To handle the game complexities, we developed an agent that eschews complete plans, and adapts its policy online as the game evolves. In the most recent independent evaluation in the SAILON program, our agent was the best performing agent on most measures. We herein present our approach and results.

31.3 · CL · Jun 11, 2021 · Code

How Should Agents Ask Questions For Situated Learning? An Annotated Dialogue Corpus

Felix Gervits, Antonio Roque, Gordon Briggs et al.

Intelligent agents that are confronted with novel concepts in situated environments will need to ask their human teammates questions to learn about the physical world. To better understand this problem, we need data about asking questions in situated task-based interactions. To this end, we present the Human-Robot Dialogue Learning (HuRDL) Corpus - a novel dialogue corpus collected in an online interactive virtual environment in which human participants play the role of a robot performing a collaborative tool-organization task. We describe the corpus data and a corresponding annotation scheme to offer insight into the form and content of questions that humans ask to facilitate learning in a situated environment. We provide the corpus as an empirically-grounded resource for improving question generation in situated intelligent agents.

3.0 · RO · Apr 7, 2021 · Code

Robot Development and Path Planning for Indoor Ultraviolet Light Disinfection

Jonathan Conroy, Christopher Thierauf, Parker Rule et al.

Regular irradiation of indoor environments with ultraviolet C (UVC) light has become a regular task for many indoor settings as a result of COVID-19, but current robotic systems attempting to automate it suffer from high costs and inefficient irradiation. In this paper, we propose a purpose-made inexpensive robotic platform with off-the-shelf components and standard navigation software that, with a novel algorithm for finding optimal irradiation locations, addresses both shortcomings to offer affordable and efficient solutions for UVC irradiation. We demonstrate in simulations the efficacy of the algorithm and show a prototypical run of the autonomous integrated robotic system in an indoor environment. In our sample instances, our proposed algorithm reduces the time needed by roughly 30% while it increases the coverage by 35% (when compared to the best possible placement of a static light).

18.0 · AI · Dec 24, 2020

SPOTTER: Extending Symbolic Planning Operators through Targeted Reinforcement Learning

Vasanth Sarathy, Daniel Kasenberg, Shivam Goel et al.

Symbolic planning models allow decision-making agents to sequence actions in arbitrary ways to achieve a variety of goals in dynamic domains. However, they are typically handcrafted and tend to require precise formulations that are not robust to human error. Reinforcement learning (RL) approaches do not require such models, and instead learn domain dynamics by exploring the environment and collecting rewards. However, RL approaches tend to require millions of episodes of experience and often learn policies that are not easily transferable to other tasks. In this paper, we address one aspect of the open problem of integrating these approaches: how can decision-making agents resolve discrepancies in their symbolic planning models while attempting to accomplish goals? We propose an integrated framework named SPOTTER that uses RL to augment and support ("spot") a planning agent by discovering new operators needed by the agent to accomplish goals that are initially unreachable for the agent. SPOTTER outperforms pure-RL approaches while also discovering transferable symbolic knowledge and does not require supervision, successful plan traces or any a priori knowledge about the missing planning operator.

4.1 · RO · May 4, 2020

"Can you do this?" Self-Assessment Dialogues with Autonomous Robots Before, During, and After a Mission

Tyler Frasca, Evan Krause, Ravenna Thielstrom et al.

Autonomous robots with sophisticated capabilities can make it difficult for human instructors to assess their capabilities and proficiencies. Therefore, it is important that future robots have the ability to introspect on their capabilities and assess their task performance. Introspection allows the robot to determine what it can accomplish, and self-assessment allows the robot to estimate the likelihood that it will accomplish a given task. We introduce a general framework for introspection and self-assessment that enables robots to have task and performance-based dialogues before, during, and after a mission. We then realize aspects of the framework in the cognitive robotic DIARC architecture, and finally show a proof-of-concept demonstration on a Nao robot showing its self-assessment capabilities before, during, and after an instructed task.

30.0 · CL · Nov 1, 2019

Engaging in Dialogue about an Agent's Norms and Behaviors

Daniel Kasenberg, Antonio Roque, Ravenna Thielstrom et al.

We present a set of capabilities allowing an agent planning with moral and social norms represented in temporal logic to respond to queries about its norms and behaviors in natural language, and for the human user to add and remove norms directly in natural language. The user may also pose hypothetical modifications to the agent's norms and inquire about their effects.

30.1 · CL · Nov 1, 2019

Generating Justifications for Norm-Related Agent Decisions

Daniel Kasenberg, Antonio Roque, Ravenna Thielstrom et al.

We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent to compare actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model and perceived trust.

4.9 · RO · Feb 4, 2019

When Exceptions are the Norm: Exploring the Role of Consent in HRI

Vasanth Sarathy, Thomas Arnold, Matthias Scheutz

HRI researchers have made major strides in developing robotic architectures that are capable of reading a limited set of social cues and producing behaviors that enhance their likeability and feeling of comfort amongst humans. However, the cues in these models are fairly direct and the interactions largely dyadic. To capture the normative qualities of interaction more robustly, we propose consent as a distinct, critical area for HRI research. Convening important insights in existing HRI work around topics like touch, proxemics, gaze, and moral norms, the notion of consent reveals key expectations that can shape how a robot acts in social space. By sorting various kinds of consent through social and legal doctrine, we delineate empirical and technical questions to meet consent challenges faced in major application domains and robotic roles. Attention to consent could show, for example, how extraordinary, norm-violating actions can be justified by agents and accepted by those around them. We argue that operationalizing ideas from legal scholarship can better guide how robotic systems might cultivate and sustain proper forms of consent.

2.9 · RO · Nov 26, 2018

Augmenting Robot Knowledge Consultants with Distributed Short Term Memory

Tom Williams, Ravenna Thielstrom, Evan Krause et al.

Human-robot communication in situated environments involves a complex interplay between knowledge representations across a wide variety of modalities. Crucially, linguistic information must be associated with representations of objects, locations, people, and goals, which may be represented in very different ways. In previous work, we developed a Consultant Framework that facilitates modality-agnostic access to information distributed across a set of heterogeneously represented knowledge sources. In this work, we draw inspiration from cognitive science to augment these distributed knowledge sources with Short Term Memory Buffers to create an STM-augmented algorithm for referring expression generation. We then discuss the potential performance benefits of this approach and insights from cognitive science that may inform future refinements in the design of our approach.

1.7 · AI · Jul 6, 2018

Quasi-Dilemmas for Artificial Moral Agents

Daniel Kasenberg, Vasanth Sarathy, Thomas Arnold et al.

In this paper we describe moral quasi-dilemmas (MQDs): situations similar to moral dilemmas, but in which an agent is unsure whether exploring the plan space or the world may reveal a course of action that satisfies all moral requirements. We argue that artificial moral agents (AMAs) should be built to handle MQDs (in particular, by exploring the plan space rather than immediately accepting the inevitability of the moral dilemma), and that MQDs may be useful for evaluating AMA architectures.

15.2 · SY · Oct 28, 2017

Interpretable Apprenticeship Learning with Temporal Logic Specifications

Daniel Kasenberg, Matthias Scheutz

Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.

19.0 · AI · Jul 15, 2017

AI Challenges in Human-Robot Cognitive Teaming

Tathagata Chakraborti, Subbarao Kambhampati, Matthias Scheutz et al.

Among the many anticipated roles for robots in the future is that of being a human teammate. Aside from all the technological hurdles that have to be overcome with respect to hardware and control to make robots fit to work with humans, the added complication here is that humans have many conscious and subconscious expectations of their teammates - indeed, we argue that teaming is mostly a cognitive rather than physical coordination activity. This introduces new challenges for the AI and robotics community and requires fundamental changes to the traditional approach to the design of autonomy. With this in mind, we propose an update to the classical view of the intelligent agent architecture, highlighting the requirements for mental modeling of the human in the deliberative process of the autonomous agent. In this article, we briefly outline our recent efforts, and those of others in the community, towards developing cognitive teammates along these guidelines.

5.6 · AI · Apr 26, 2017

The MacGyver Test - A Framework for Evaluating Machine Resourcefulness and Creative Problem Solving

Vasanth Sarathy, Matthias Scheutz

Current measures of machine intelligence are either difficult to evaluate or lack the ability to test a robot's problem-solving capacity in open worlds. We propose a novel evaluation framework based on the formal notion of a MacGyver Test, which provides a practical way of assessing the resilience and resourcefulness of artificial agents.

5.4 · RO · Feb 11, 2016

Enabling Basic Normative HRI in a Cognitive Robotic Architecture

Vasanth Sarathy, Jason R. Wilson, Thomas Arnold et al.

Collaborative human activities are grounded in social and moral norms, which humans consciously and subconsciously use to guide and constrain their decision-making and behavior, thereby strengthening their interactions and preventing emotional and physical harm. This type of norm-based processing is also critical for robots in many human-robot interaction scenarios (e.g., when helping elderly and disabled persons in assisted living facilities, or assisting humans in assembly tasks in factories or even the space station). In this position paper, we will briefly describe how several components in an integrated cognitive architecture can be used to implement processes that are required for normative human-robot interactions, especially in collaborative tasks where actions and situations could potentially be perceived as threatening and thus need a change in course of action to mitigate the perceived threats.