HCNov 12, 2022
Seamful XAI: Operationalizing Seamful Design in Explainable AIUpol Ehsan, Q. Vera Liao, Samir Passi et al. · gatech
Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps. While black-boxing AI systems can make the user experience seamless, hiding the seams risks disempowering users to mitigate fallouts from AI mistakes. Instead of hiding these AI imperfections, can we leverage them to help the user? While Explainable AI (XAI) has predominantly tackled algorithmic opaqueness, we propose that seamful design can foster AI explainability by revealing and leveraging sociotechnical and infrastructural mismatches. We introduce the concept of Seamful XAI by (1) conceptually transferring "seams" to the AI context and (2) developing a design process that helps stakeholders anticipate and design with seams. We explore this process with 43 AI practitioners and real end-users, using a scenario-based co-design activity informed by real-world use cases. We found that the Seamful XAI design process helped users foresee AI harms, identify underlying reasons (seams), locate them in the AI's lifecycle, learn how to leverage seamful information to improve XAI and user agency. We share empirical insights, implications, and reflections on how this process can help practitioners anticipate and craft seams in AI, how seamfulness can improve explainability, empower end-users, and facilitate Responsible AI.
HCFeb 1, 2023
Charting the Sociotechnical Gap in Explainable AI: A Framework to Address the Gap in XAIUpol Ehsan, Koustuv Saha, Munmun De Choudhury et al. · gatech
Explainable AI (XAI) systems are sociotechnical in nature; thus, they are subject to the sociotechnical gap--divide between the technical affordances and the social needs. However, charting this gap is challenging. In the context of XAI, we argue that charting the gap improves our problem understanding, which can reflexively provide actionable insights to improve explainability. Utilizing two case studies in distinct domains, we empirically derive a framework that facilitates systematic charting of the sociotechnical gap by connecting AI guidelines in the context of XAI and elucidating how to use them to address the gap. We apply the framework to a third case in a new domain, showcasing its affordances. Finally, we discuss conceptual implications of the framework, share practical considerations in its operationalization, and offer guidance on transferring it to new contexts. By making conceptual and practical contributions to understanding the sociotechnical gap in XAI, the framework expands the XAI design space.
CYJun 3, 2022
The Algorithmic ImprintUpol Ehsan, Ranjit Singh, Jacob Metcalf et al. · gatech
When algorithmic harms emerge, a reasonable response is to stop using the algorithm to resolve concerns related to fairness, accountability, transparency, and ethics (FATE). However, just because an algorithm is removed does not imply its FATE-related issues cease to exist. In this paper, we introduce the notion of the "algorithmic imprint" to illustrate how merely removing an algorithm does not necessarily undo or mitigate its consequences. We operationalize this concept and its implications through the 2020 events surrounding the algorithmic grading of the General Certificate of Education (GCE) Advanced (A) Level exams, an internationally recognized UK-based high school diploma exam administered in over 160 countries. While the algorithmic standardization was ultimately removed due to global protests, we show how the removal failed to undo the algorithmic imprint on the sociotechnical infrastructures that shape students', teachers', and parents' lives. These events provide a rare chance to analyze the state of the world both with and without algorithmic mediation. We situate our case study in Bangladesh to illustrate how algorithms made in the Global North disproportionately impact stakeholders in the Global South. Chronicling more than a year-long community engagement consisting of 47 inter-views, we present the first coherent timeline of "what" happened in Bangladesh, contextualizing "why" and "how" they happened through the lenses of the algorithmic imprint and situated algorithmic fairness. Analyzing these events, we highlight how the contours of the algorithmic imprints can be inferred at the infrastructural, social, and individual levels. We share conceptual and practical implications around how imprint-awareness can (a) broaden the boundaries of how we think about algorithmic impact, (b) inform how we design algorithms, and (c) guide us in AI governance.
HCNov 11, 2022
Social Construction of XAI: Do We Need One Definition to Rule Them All?Upol Ehsan, Mark O. Riedl · gatech
There is a growing frustration amongst researchers and developers in Explainable AI (XAI) around the lack of consensus around what is meant by 'explainability'. Do we need one definition of explainability to rule them all? In this paper, we argue why a singular definition of XAI is neither feasible nor desirable at this stage of XAI's development. We view XAI through the lenses of Social Construction of Technology (SCOT) to explicate how diverse stakeholders (relevant social groups) have different interpretations (interpretative flexibility) that shape the meaning of XAI. Forcing a standardization (closure) on the pluralistic interpretations too early can stifle innovation and lead to premature conclusions. We share how we can leverage the pluralism to make progress in XAI without having to wait for a definitional consensus.
HCAug 9, 2024
Explainable AI Reloaded: Challenging the XAI Status Quo in the Era of Large Language ModelsUpol Ehsan, Mark O. Riedl · gatech
When the initial vision of Explainable (XAI) was articulated, the most popular framing was to open the (proverbial) "black-box" of AI so that we could understand the inner workings. With the advent of Large Language Models (LLMs), the very ability to open the black-box is increasingly limited especially when it comes to non-AI expert end-users. In this paper, we challenge the assumption of "opening" the black-box in the LLM era and argue for a shift in our XAI expectations. Highlighting the epistemic blind spots of an algorithm-centered XAI view, we argue that a human-centered perspective can be a path forward. We operationalize the argument by synthesizing XAI research along three dimensions: explainability outside the black-box, explainability around the edges of the black box, and explainability that leverages infrastructural seams. We conclude with takeaways that reflexively inform XAI as a domain.
AIJan 16, 2023
Neuro-Symbolic World Models for Adapting to Open World NoveltyJonathan Balloch, Zhiyu Lin, Robert Wright et al. · gatech
Open-world novelty--a sudden change in the mechanics or properties of an environment--is a common occurrence in the real world. Novelty adaptation is an agent's ability to improve its policy performance post-novelty. Most reinforcement learning (RL) methods assume that the world is a closed, fixed process. Consequentially, RL policies adapt inefficiently to novelties. To address this, we introduce WorldCloner, an end-to-end trainable neuro-symbolic world model for rapid novelty adaptation. WorldCloner learns an efficient symbolic representation of the pre-novelty environment transitions, and uses this transition model to detect novelty and efficiently adapt to novelty in a single-shot fashion. Additionally, WorldCloner augments the policy learning process using imagination-based adaptation, where the world model simulates transitions of the post-novelty environment to help the policy adapt. By blending ''imagined'' transitions with interactions in the post-novelty environment, performance can be recovered with fewer total environment interactions. Using environments designed for studying novelty in sequential decision-making problems, we show that the symbolic world model helps its neural policy adapt more efficiently than model-based and model-based neural-only reinforcement learning methods.
AIOct 10, 2022
Experiential Explanations for Reinforcement LearningAmal Alabdulkarim, Madhuri Singh, Gennie Mansi et al. · gatech
Reinforcement learning (RL) systems can be complex and non-interpretable, making it challenging for non-AI experts to understand or intervene in their decisions. This is due in part to the sequential nature of RL in which actions are chosen because of their likelihood of obtaining future rewards. However, RL agents discard the qualitative features of their training, making it difficult to recover user-understandable information for "why" an action is chosen. We propose a technique Experiential Explanations to generate counterfactual explanations by training influence predictors along with the RL policy. Influence predictors are models that learn how different sources of reward affect the agent in different states, thus restoring information about how the policy reflects the environment. Two human evaluation studies revealed that participants presented with Experiential Explanations were better able to correctly guess what an agent would do than those presented with other standard types of explanation. Participants also found that Experiential Explanations are more understandable, satisfying, complete, useful, and accurate. Qualitative analysis provides information on the factors of Experiential Explanations that are most useful and the desired characteristics that participants seek from the explanations.
CLDec 16, 2022
Neural Story PlanningAnbang Ye, Christopher Cui, Taiwei Shi et al.
Automated plot generation is the challenge of generating a sequence of events that will be perceived by readers as the plot of a coherent story. Traditional symbolic planners plan a story from a goal state and guarantee logical causal plot coherence but rely on a library of hand-crafted actions with their preconditions and effects. This closed world setting limits the length and diversity of what symbolic planners can generate. On the other hand, pre-trained neural language models can generate stories with great diversity, while being generally incapable of ending a story in a specified manner and can have trouble maintaining coherence. In this paper, we present an approach to story plot generation that unifies causal planning with neural language models. We propose to use commonsense knowledge extracted from large language models to recursively expand a story plot in a backward chaining fashion. Specifically, our system infers the preconditions for events in the story and then events that will cause those conditions to become true. We performed automatic evaluation to measure narrative coherence as indicated by the ability to answer questions about whether different events in the story are causally related to other events. Results indicate that our proposed method produces more coherent plotlines than several strong baselines.
HCJan 29
From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms for Dignified Human-AI InteractionUpol Ehsan, Samir Passi, Koustuv Saha et al.
In the future of work discourse, AI is touted as the ultimate productivity amplifier. Yet, beneath the efficiency gains lie subtle erosions of human expertise and agency. This paper shifts focus from the future of work to the future of workers by navigating the AI-as-Amplifier Paradox: AI's dual role as enhancer and eroder, simultaneously strengthening performance while eroding underlying expertise. We present a year-long study on the longitudinal use of AI in a high-stakes workplace among cancer specialists. Initial operational gains hid ``intuition rust'': the gradual dulling of expert judgment. These asymptomatic effects evolved into chronic harms, such as skill atrophy and identity commoditization. Building on these findings, we offer a framework for dignified Human-AI interaction co-constructed with professional knowledge workers facing AI-induced skill erosion without traditional labor protections. The framework operationalizes sociotechnical immunity through dual-purpose mechanisms that serve institutional quality goals while building worker power to detect, contain, and recover from skill erosion, and preserve human identity. Evaluated across healthcare and software engineering, our work takes a foundational step toward dignified human-AI interaction futures by balancing productivity with the preservation of human expertise.
AIOct 12, 2023
Novelty Detection in Reinforcement Learning with World ModelsGeigh Zollicoffer, Kenneth Eaton, Jonathan Balloch et al.
Reinforcement learning (RL) using world models has found significant recent successes. However, when a sudden change to world mechanics or properties occurs then agent performance and reliability can dramatically decline. We refer to the sudden change in visual properties or state transitions as novelties. Implementing novelty detection within generated world model frameworks is a crucial task for protecting the agent when deployed. In this paper, we propose straightforward bounding approaches to incorporate novelty detection into world model RL agents, by utilizing the misalignment of the world model's hallucinated states and the true observed states as an anomaly score. We provide effective approaches to detecting novelties in a distribution of transitions learned by an agent in a world model. Finally, we show the advantage of our work in a novel environment compared to traditional machine learning novelty detection methods as well as currently accepted RL focused novelty detection algorithms.
AIJan 28, 2020Code
Bringing Stories Alive: Generating Interactive Fiction WorldsPrithviraj Ammanabrolu, Wesley Cheung, Dan Tu et al.
World building forms the foundation of any task that requires narrative intelligence. In this work, we focus on procedurally generating interactive fiction worlds---text-based worlds that players "see" and "talk to" using natural language. Generating these worlds requires referencing everyday and thematic commonsense priors in addition to being semantically consistent, interesting, and coherent throughout. Using existing story plots as inspiration, we present a method that first extracts a partial knowledge graph encoding basic information regarding world structure such as locations and objects. This knowledge graph is then automatically completed utilizing thematic knowledge and used to guide a neural language generation model that fleshes out the rest of the world. We perform human participant-based evaluations, testing our neural model's ability to extract and fill-in a knowledge graph and to generate language conditioned on it against rule-based and human-made baselines. Our code is available at https://github.com/rajammanabrolu/WorldGeneration.
CLDec 4, 2018Code
Playing Text-Adventure Games with Graph-Based Deep Reinforcement LearningPrithviraj Ammanabrolu, Mark O. Riedl
Text-based adventure games provide a platform on which to explore reinforcement learning in the context of a combinatorial action space, such as natural language. We present a deep reinforcement learning architecture that represents the game state as a knowledge graph which is learned during exploration. This graph is used to prune the action space, enabling more efficient exploration. The question of which action to take can be reduced to a question-answering task, a form of transfer learning that pre-trains certain parts of our architecture. In experiments using the TextWorld framework, we show that our proposed technique can learn a control policy faster than baseline alternatives. We have also open-sourced our code at https://github.com/rajammanabrolu/KG-DQN.
AIMay 12, 2025
Explainable Reinforcement Learning Agents Using World ModelsMadhuri Singh, Amal Alabdulkarim, Gennie Mansi et al. · gatech
Explainable AI (XAI) systems have been proposed to help people understand how AI systems produce outputs and behaviors. Explainable Reinforcement Learning (XRL) has an added complexity due to the temporal nature of sequential decision-making. Further, non-AI experts do not necessarily have the ability to alter an agent or its policy. We introduce a technique for using World Models to generate explanations for Model-Based Deep RL agents. World Models predict how the world will change when actions are performed, allowing for the generation of counterfactual trajectories. However, identifying what a user wanted the agent to do is not enough to understand why the agent did something else. We augment Model-Based RL agents with a Reverse World Model, which predicts what the state of the world should have been for the agent to prefer a given counterfactual action. We show that explanations that show users what the world should have been like significantly increase their understanding of the agent policy. We hypothesize that our explanations can help users learn how to control the agents execution through by manipulating the environment.
AIMay 6, 2025
STORY2GAME: Generating (Almost) Everything in an Interactive Fiction GameEric Zhou, Shreyas Basavatia, Moontashir Siam et al.
We introduce STORY2GAME, a novel approach to using Large Language Models to generate text-based interactive fiction games that starts by generating a story, populates the world, and builds the code for actions in a game engine that enables the story to play out interactively. Whereas a given set of hard-coded actions can artificially constrain story generation, the ability to generate actions means the story generation process can be more open-ended but still allow for experiences that are grounded in a game state. The key to successful action generation is to use LLM-generated preconditions and effects of actions in the stories as guides for what aspects of the game state must be tracked and changed by the game engine when a player performs an action. We also introduce a technique for dynamically generating new actions to accommodate the player's desire to perform actions that they think of that are not part of the story. Dynamic action generation may require on-the-fly updates to the game engine's state representation and revision of previously generated actions. We evaluate the success rate of action code generation with respect to whether a player can interactively play through the entire generated story.
LGApr 2, 2024
Is Exploration All You Need? Effective Exploration Characteristics for Transfer in Reinforcement LearningJonathan C. Balloch, Rishav Bhagat, Geigh Zollicoffer et al.
In deep reinforcement learning (RL) research, there has been a concerted effort to design more efficient and productive exploration methods while solving sparse-reward problems. These exploration methods often share common principles (e.g., improving diversity) and implementation details (e.g., intrinsic reward). Prior work found that non-stationary Markov decision processes (MDPs) require exploration to efficiently adapt to changes in the environment with online transfer learning. However, the relationship between specific exploration characteristics and effective transfer learning in deep RL has not been characterized. In this work, we seek to understand the relationships between salient exploration characteristics and improved performance and efficiency in transfer learning. We test eleven popular exploration algorithms on a variety of transfer types -- or ``novelties'' -- to identify the characteristics that positively affect online transfer learning. Our analysis shows that some characteristics correlate with improved performance and efficiency across a wide range of transfer tasks, while others only improve transfer performance with respect to specific environment changes. From our analysis, make recommendations about which exploration algorithm characteristics are best suited to specific transfer situations.
CYAug 12, 2025
AI Agents and the LawMark O. Riedl, Deven R. Desai
As AI becomes more "agentic," it faces technical and socio-legal issues it must address if it is to fulfill its promise of increased economic productivity and efficiency. This paper uses technical and legal perspectives to explain how things change when AI systems start being able to directly execute tasks on behalf of a user. We show how technical conceptions of agents track some, but not all, socio-legal conceptions of agency. That is, both computer science and the law recognize the problems of under-specification for an agent, and both disciplines have robust conceptions of how to address ensuring an agent does what the programmer, or in the law, the principal desires and no more. However, to date, computer science has under-theorized issues related to questions of loyalty and to third parties that interact with an agent, both of which are central parts of the law of agency. First, we examine the correlations between implied authority in agency law and the principle of value-alignment in AI, wherein AI systems must operate under imperfect objective specification. Second, we reveal gaps in the current computer science view of agents pertaining to the legal concepts of disclosure and loyalty, and how failure to account for them can result in unintended effects in AI ecommerce agents. In surfacing these gaps, we show a path forward for responsible AI agent development and deployment.
CLMay 9, 2024
A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text WorldsChristopher Z. Cui, Xiangyu Peng, Mark O. Riedl
Open-ended worlds are those in which there are no pre-specified goals or environmental reward signal. As a consequence, an agent must know how to perform a multitude of tasks. However, when a new task is presented to an agent, we expect it to be able to reuse some of what it knows from previous tasks to rapidly learn that new task. We introduce a novel technique whereby policies for different a priori known tasks are combined into a Mixture-of-Experts model with an attention mechanism across a mix of frozen and unfrozen experts. The model learns when to attend to frozen task-specific experts when appropriate and learns new experts to handle novel situations. We work in an open-ended text-based environment in which the agent is tasked with behaving like different types of character roles and must rapidly learn behaviors associated with new character role types. We show that our agent both obtains more rewards in the zero-shot setting, and discovers these rewards with greater sample efficiency in the few-shot learning settings.
HCDec 16, 2021
Inherently Explainable Reinforcement Learning in Natural LanguageXiangyu Peng, Mark O. Riedl, Prithviraj Ammanabrolu
We focus on the task of creating a reinforcement learning agent that is inherently explainable -- with the ability to produce immediate local explanations by thinking out loud while performing a task and analyzing entire trajectories post-hoc to produce causal explanations. This Hierarchically Explainable Reinforcement Learning agent (HEX-RL), operates in Interactive Fictions, text-based game environments in which an agent perceives and acts upon the world using textual natural language. These games are usually structured as puzzles or quests with long-term dependencies in which an agent must complete a sequence of actions to succeed -- providing ideal environments in which to test an agent's ability to explain its actions. Our agent is designed to treat explainability as a first-class citizen, using an extracted symbolic knowledge graph-based state representation coupled with a Hierarchical Graph Attention mechanism that points to the facts in the internal graph representation that most influenced the choice of actions. Experiments show that this agent provides significantly improved explanations over strong baselines, as rated by human participants generally unfamiliar with the environment, while also matching state-of-the-art task performance.
CLDec 16, 2021
Guiding Neural Story Generation with Reader ModelsXiangyu Peng, Kaige Xie, Amal Alabdulkarim et al.
Automated storytelling has long captured the attention of researchers for the ubiquity of narratives in everyday life. However, it is challenging to maintain coherence and stay on-topic toward a specific ending when generating narratives with neural language models. In this paper, we introduce Story generation with Reader Models (StoRM), a framework in which a reader model is used to reason about the story should progress. A reader model infers what a human reader believes about the concepts, entities, and relations about the fictional story world. We show how an explicit reader model represented as a knowledge graph affords story coherence and provides controllability in the form of achieving a given story world state goal. Experiments show that our model produces significantly more coherent and on-topic stories, outperforming baselines in dimensions including plot plausibility and staying on topic.
CLDec 16, 2021
Goal-Directed Story Generation: Augmenting Generative Language Models with Reinforcement LearningAmal Alabdulkarim, Winston Li, Lara J. Martin et al.
The advent of large pre-trained generative language models has provided a common framework for AI story generation via sampling the model to create sequences that continue the story. However, sampling alone is insufficient for story generation. In particular, it is hard to direct a language model to create stories to reach a specific goal event. We present two automated techniques grounded in deep reinforcement learning and reward shaping to control the plot of computer-generated stories. The first utilizes proximal policy optimization to fine-tune an existing transformer-based language model to generate text continuations but also be goal-seeking. The second extracts a knowledge graph from the unfolding story, which is used by a policy network with graph attention to select a candidate continuation generated by a language model. We report on automated metrics pertaining to how often stories achieve a given goal event as well as human participant rankings of coherence and overall story quality compared to baselines and ablations.
CLOct 7, 2021
Situated Dialogue Learning through Procedural Environment GenerationPrithviraj Ammanabrolu, Renee Jia, Mark O. Riedl
We teach goal-driven agents to interactively act and speak in situated environments by training on generated curriculums. Our agents operate in LIGHT (Urbanek et al. 2019) -- a large-scale crowd-sourced fantasy text adventure game wherein an agent perceives and interacts with the world through textual natural language. Goals in this environment take the form of character-based quests, consisting of personas and motivations. We augment LIGHT by learning to procedurally generate additional novel textual worlds and quests to create a curriculum of steadily increasing difficulty for training agents to achieve such goals. In particular, we measure curriculum difficulty in terms of the rarity of the quest in the original training distribution -- an easier environment is one that is more likely to have been found in the unaugmented dataset. An ablation study shows that this method of learning from the tail of a distribution results in significantly higher generalization abilities as measured by zero-shot performance on never-before-seen quests.
HCSep 26, 2021
Explainability Pitfalls: Beyond Dark Patterns in Explainable AIUpol Ehsan, Mark O. Riedl
To make Explainable AI (XAI) systems trustworthy, understanding harmful effects is just as important as producing well-designed explanations. In this paper, we address an important yet unarticulated type of negative effect in XAI. We introduce explainability pitfalls(EPs), unanticipated negative downstream effects from AI explanations manifesting even when there is no intention to manipulate users. EPs are different from, yet related to, dark patterns, which are intentionally deceptive practices. We articulate the concept of EPs by demarcating it from dark patterns and highlighting the challenges arising from uncertainties around pitfalls. We situate and operationalize the concept using a case study that showcases how, despite best intentions, unsuspecting negative effects such as unwarranted trust in numerical explanations can emerge. We propose proactive and preventative strategies to address EPs at three interconnected levels: research, design, and organizational.
HCJul 28, 2021
The Who in XAI: How AI Background Shapes Perceptions of AI ExplanationsUpol Ehsan, Samir Passi, Q. Vera Liao et al.
Explainability of AI systems is critical for users to take informed actions. Understanding "who" opens the black-box of AI is just as important as opening it. We conduct a mixed-methods study of how two different groups--people with and without AI background--perceive different types of AI explanations. Quantitatively, we share user perceptions along five dimensions. Qualitatively, we describe how AI background can influence interpretations, elucidating the differences through lenses of appropriation and cognitive heuristics. We find that (1) both groups showed unwarranted faith in numbers for different reasons and (2) each group found value in different explanations beyond their intended design. Carrying critical implications for the field of XAI, our findings showcase how AI generated explanations can have negative consequences despite best intentions and how that could lead to harmful manipulation of trust. We propose design interventions to mitigate them.
LGJun 17, 2021
Learning Knowledge Graph-based World Models of Textual EnvironmentsPrithviraj Ammanabrolu, Mark O. Riedl
World models improve a learning agent's ability to efficiently operate in interactive and situated environments. This work focuses on the task of building world models of text-based game environments. Text-based games, or interactive narratives, are reinforcement learning environments in which agents perceive and interact with the world using textual natural language. These environments contain long, multi-step puzzles or quests woven through a world that is filled with hundreds of characters, locations, and objects. Our world model learns to simultaneously: (1) predict changes in the world caused by an agent's actions when representing the world as a knowledge graph; and (2) generate the set of contextually relevant natural language actions required to operate in the world. We frame this task as a Set of Sequences generation problem by exploiting the inherent structure of knowledge graphs and actions and introduce both a transformer-based multi-task architecture and a loss function to train it. A zero-shot ablation study on never-before-seen textual worlds shows that our methodology significantly outperforms existing textual world modeling techniques as well as the importance of each of our contributions.
CLJun 17, 2021
Modeling Worlds in TextPrithviraj Ammanabrolu, Mark O. Riedl
We provide a dataset that enables the creation of learning agents that can build knowledge graph-based world models of interactive narratives. Interactive narratives -- or text-adventure games -- are partially observable environments structured as long puzzles or quests in which an agent perceives and interacts with the world purely through textual natural language. Each individual game typically contains hundreds of locations, characters, and objects -- each with their own unique descriptions -- providing an opportunity to study the problem of giving language-based agents the structured memory necessary to operate in such worlds. Our dataset provides 24198 mappings between rich natural language observations and: (1) knowledge graphs that reflect the world state in the form of a map; (2) natural language actions that are guaranteed to cause a change in that particular world state. The training data is collected across 27 games in multiple genres and contains a further 7836 heldout instances over 9 additional games in the test set. We further provide baseline models using rules-based, question-answering, and sequence learning approaches in addition to an analysis of the data and corresponding learning tasks.
AIJun 4, 2021
Detecting and Adapting to Novelty in GamesXiangyu Peng, Jonathan C. Balloch, Mark O. Riedl
Open-world novelty occurs when the rules of an environment can change abruptly, such as when a game player encounters "house rules". To address open-world novelty, game playing agents must be able to detect when novelty is injected, and to quickly adapt to the new rules. We propose a model-based reinforcement learning approach where game state and rules are represented as knowledge graphs. The knowledge graph representation of the state and rules allows novelty to be detected as changes in the knowledge graph, assists with the training of deep reinforcement learners, and enables imagination-based re-training where the agent uses the knowledge graph to perform look-ahead.
CLMay 31, 2021
Telling Stories through Multi-User Dialogue by Modeling Character RelationsWai Man Si, Prithviraj Ammanabrolu, Mark O. Riedl
This paper explores character-driven story continuation, in which the story emerges through characters' first- and second-person narration as well as dialogue -- requiring models to select language that is consistent with a character's persona and their relationships with other characters while following and advancing the story. We hypothesize that a multi-task model that trains on character dialogue plus character relationship information improves transformer-based story continuation. To this end, we extend the Critical Role Dungeons and Dragons Dataset (Rameshkumar and Bailey, 2020) -- consisting of dialogue transcripts of people collaboratively telling a story while playing the role-playing game Dungeons and Dragons -- with automatically extracted relationships between each pair of interacting characters as well as their personas. A series of ablations lend evidence to our hypothesis, showing that our multi-task model using character relationships improves story continuation accuracy over strong baselines.
LGApr 15, 2021
LEx: A Framework for Operationalising Layers of Machine Learning ExplanationsRonal Singh, Upol Ehsan, Marc Cheong et al.
Several social factors impact how people respond to AI explanations used to justify AI decisions affecting them personally. In this position paper, we define a framework called the \textit{layers of explanation} (LEx), a lens through which we can assess the appropriateness of different types of explanations. The framework uses the notions of \textit{sensitivity} (emotional responsiveness) of features and the level of \textit{stakes} (decision's consequence) in a domain to determine whether different types of explanations are \textit{appropriate} in a given context. We demonstrate how to use the framework to assess the appropriateness of different types of explanations in different domains.
AIMar 18, 2021
Situated Language Learning via Interactive NarrativesPrithviraj Ammanabrolu, Mark O. Riedl
This paper provides a roadmap that explores the question of how to imbue learning agents with the ability to understand and generate contextually relevant natural language in service of achieving a goal. We hypothesize that two key components in creating such agents are interactivity and environment grounding, shown to be vital parts of language learning in humans, and posit that interactive narratives should be the environments of choice for such training these agents. These games are simulations in which an agent interacts with the world through natural language -- "perceiving", "acting upon", and "talking to" the world using textual descriptions, commands, and dialogue -- and as such exist at the intersection of natural language processing, storytelling, and sequential decision making. We discuss the unique challenges a text games' puzzle-like structure combined with natural language state-and-action spaces provides: knowledge representation, commonsense reasoning, and exploration. Beyond the challenges described so far, progress in the realm of interactive narratives can be applied in adjacent problem domains. These applications provide interesting challenges of their own as well as extensions to those discussed so far. We describe three of them in detail: (1) evaluating AI system's commonsense understanding by automatically creating interactive narratives; (2) adapting abstract text-based policies to include other modalities such as vision; and (3) enabling multi-agent and human-AI collaboration in shared, situated worlds.
HCJan 12, 2021
Expanding Explainability: Towards Social Transparency in AI systemsUpol Ehsan, Q. Vera Liao, Michael Muller et al.
As AI-powered systems increasingly mediate consequential decision-making, their explainability is critical for end-users to take informed and accountable actions. Explanations in human-human interactions are socially-situated. AI systems are often socio-organizationally embedded. However, Explainable AI (XAI) approaches have been predominantly algorithm-centered. We take a developmental step towards socially-situated XAI by introducing and exploring Social Transparency (ST), a sociotechnically informed perspective that incorporates the socio-organizational context into explaining AI-mediated decision-making. To explore ST conceptually, we conducted interviews with 29 AI users and practitioners grounded in a speculative design scenario. We suggested constitutive design elements of ST and developed a conceptual framework to unpack ST's effect and implications at the technical, decision-making, and organizational level. The framework showcases how ST can potentially calibrate trust in AI, improve decision-making, facilitate organizational collective actions, and cultivate holistic explainability. Our work contributes to the discourse of Human-Centered XAI by expanding the design space of XAI.
AIDec 4, 2020
Playing Text-Based Games with Common SenseSahith Dambekodi, Spencer Frazier, Prithviraj Ammanabrolu et al.
Text based games are simulations in which an agent interacts with the world purely through natural language. They typically consist of a number of puzzles interspersed with interactions with common everyday objects and locations. Deep reinforcement learning agents can learn to solve these puzzles. However, the everyday interactions with the environment, while trivial for human players, present as additional puzzles to agents. We explore two techniques for incorporating commonsense knowledge into agents. Inferring possibly hidden aspects of the world state with either a commonsense inference model COMET, or a language model BERT. Biasing an agents exploration according to common patterns recognized by a language model. We test our technique in the 9to05 game, which is an extreme version of a text based game that requires numerous interactions with common, everyday objects in common, everyday scenarios. We conclude that agents that augment their beliefs about the world state with commonsense inferences are more robust to observational errors and omissions of common elements from text descriptions.
CLSep 2, 2020
Automated Storytelling via Causal, Commonsense Plot OrderingPrithviraj Ammanabrolu, Wesley Cheung, William Broniec et al.
Automated story plot generation is the task of generating a coherent sequence of plot events. Causal relations between plot events are believed to increase the perception of story and plot coherence. In this work, we introduce the concept of soft causal relations as causal relations inferred from commonsense reasoning. We demonstrate C2PO, an approach to narrative generation that operationalizes this concept through Causal, Commonsense Plot Ordering. Using human-participant protocols, we evaluate our system against baseline systems with different commonsense reasoning reasoning and inductive biases to determine the role of soft causal relations in perceived story quality. Through these studies we also probe the interplay of how changes in commonsense norms across storytelling genres affect perceptions of story quality.
AIJun 12, 2020
How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual WorldsPrithviraj Ammanabrolu, Ethan Tien, Matthew Hausknecht et al.
Text-based games are long puzzles or quests, characterized by a sequence of sparse and potentially deceptive rewards. They provide an ideal platform to develop agents that perceive and act upon the world using a combinatorially sized natural language state-action space. Standard Reinforcement Learning agents are poorly equipped to effectively explore such spaces and often struggle to overcome bottlenecks---states that agents are unable to pass through simply because they do not see the right action sequence enough times to be sufficiently reinforced. We introduce Q*BERT, an agent that learns to build a knowledge graph of the world by answering questions, which leads to greater sample efficiency. To overcome bottlenecks, we further introduce MC!Q*BERT an agent that uses an knowledge-graph-based intrinsic motivation to detect bottlenecks and a novel exploration strategy to efficiently learn a chain of policy modules to overcome them. We present an ablation study and results demonstrating how our method outperforms the current state-of-the-art on nine text games, including the popular game, Zork, where, for the first time, a learning agent gets past the bottleneck where the player is eaten by a Grue.
LGFeb 19, 2020
How To Avoid Being Eaten By a Grue: Exploration Strategies for Text-Adventure AgentsPrithviraj Ammanabrolu, Ethan Tien, Zhaochen Luo et al.
Text-based games -- in which an agent interacts with the world through textual natural language -- present us with the problem of combinatorially-sized action-spaces. Most current reinforcement learning algorithms are not capable of effectively handling such a large number of possible actions per turn. Poor sample efficiency, consequently, results in agents that are unable to pass bottleneck states, where they are unable to proceed because they do not see the right action sequence to pass the bottleneck enough times to be sufficiently reinforced. Building on prior work using knowledge graphs in reinforcement learning, we introduce two new game state exploration strategies. We compare our exploration strategies against strong baselines on the classic text-adventure game, Zork1, where prior agent have been unable to get past a bottleneck where the agent is eaten by a Grue.
HCFeb 4, 2020
Human-centered Explainable AI: Towards a Reflective Sociotechnical ApproachUpol Ehsan, Mark O. Riedl
Explanations--a form of post-hoc interpretability--play an instrumental role in making systems accessible as AI continues to proliferate complex and sensitive sociotechnical systems. In this paper, we introduce Human-centered Explainable AI (HCXAI) as an approach that puts the human at the center of technology design. It develops a holistic understanding of "who" the human is by considering the interplay of values, interpersonal dynamics, and the socially situated nature of AI systems. In particular, we advocate for a reflective sociotechnical approach. We illustrate HCXAI through a case study of an explanation system for non-technical end-users that shows how technical advancements and the understanding of human factors co-evolve. Building on the case study, we lay out open research questions pertaining to further refining our understanding of "who" the human is and extending beyond 1-to-1 human-computer interactions. Finally, we propose that a reflective HCXAI paradigm-mediated through the perspective of Critical Technical Practice and supplemented with strategies from HCI, such as value-sensitive design and participatory design--not only helps us understand our intellectual blind spots, but it can also open up new design and research spaces.
AINov 20, 2019
Integrating Automated Play in Level Co-CreationAndrew Hoyt, Matthew Guzdial, Yalini Kumar et al.
In level co-creation an AI and human work together to create a video game level. One open challenge in level co-creation is how to empower human users to ensure particular qualities of the final level, such as challenge. There has been significant prior research into automated pathing and automated playtesting for video game levels, but not in how to incorporate these into tools. In this demonstration we present an improvement of the Morai Maker mixed-initiative level editor for Super Mario Bros. that includes automated pathing and challenge approximation features.
CLSep 13, 2019
Toward Automated Quest Generation in Text-Adventure GamesPrithviraj Ammanabrolu, William Broniec, Alex Mueller et al.
Interactive fictions, or text-adventures, are games in which a player interacts with a world entirely through textual descriptions and text actions. Text-adventure games are typically structured as puzzles or quests wherein the player must execute certain actions in a certain order to succeed. In this paper, we consider the problem of procedurally generating a quest, defined as a series of actions required to progress towards a goal, in a text-adventure game. Quest generation in text environments is challenging because they must be semantically coherent. We present and evaluate two quest generation techniques: (1) a Markov model, and (2) a neural generative model. We specifically look at generating quests about cooking and train our models on recipe data. We evaluate our techniques with human participant studies looking at perceived creativity and coherence.
CLSep 8, 2019
Story Realization: Expanding Plot Events into SentencesPrithviraj Ammanabrolu, Ethan Tien, Wesley Cheung et al.
Neural network based approaches to automated story plot generation attempt to learn how to generate novel plots from a corpus of natural language plot summaries. Prior work has shown that a semantic abstraction of sentences called events improves neural plot generation and and allows one to decompose the problem into: (1) the generation of a sequence of events (event-to-event) and (2) the transformation of these events into natural language sentences (event-to-sentence). However, typical neural language generation approaches to event-to-sentence can ignore the event details and produce grammatically-correct but semantically-unrelated sentences. We present an ensemble-based model that generates natural language guided by events.We provide results---including a human subjects study---for a full end-to-end automated story generation system showing that our method generates more coherent and plausible stories than baseline approaches.
CLSep 5, 2019
Automated Let's Play CommentaryShukan Shah, Matthew Guzdial, Mark O. Riedl
Let's Plays of video games represent a relatively unexplored area for experimental AI in games. In this short paper, we discuss an approach to generate automated commentary for Let's Play videos, drawing on convolutional deep neural networks. We focus on Let's Plays of the popular game Minecraft. We compare our approach and a prior approach and demonstrate the generation of automated, artificial commentary.
CLAug 19, 2019
Transfer in Deep Reinforcement Learning using Knowledge GraphsPrithviraj Ammanabrolu, Mark O. Riedl
Text adventure games, in which players must make sense of the world through text descriptions and declare actions through text descriptions, provide a stepping stone toward grounding action in language. Prior work has demonstrated that using a knowledge graph as a state representation and question-answering to pre-train a deep Q-network facilitates faster control policy transfer. In this paper, we explore the use of knowledge graphs as a representation for domain knowledge transfer for training text-adventure playing reinforcement learning agents. Our methods are tested across multiple computer generated and human authored games, varying in domain and complexity, and demonstrate that our transfer learning methods let us learn a higher-quality control policy faster.
AIAug 4, 2019
Monte-Carlo Tree Search for Simulation-based Strategy AnalysisAlexander Zook, Brent Harrison, Mark O. Riedl
Games are often designed to shape player behavior in a desired way; however, it can be unclear how design decisions affect the space of behaviors in a game. Designers usually explore this space through human playtesting, which can be time-consuming and of limited effectiveness in exhausting the space of possible behaviors. In this paper, we propose the use of automated planning agents to simulate humans of varying skill levels to generate game playthroughs. Metrics can then be gathered from these playthroughs to evaluate the current game design and identify its potential flaws. We demonstrate this technique in two games: the popular word game Scrabble and a collectible card game of our own design named Cardonomicon. Using these case studies, we show how using simulated agents to model humans of varying skill levels allows us to extract metrics to describe game balance (in the case of Scrabble) and highlight potential design flaws (in the case of Cardonomicon).
AIAug 4, 2019
Automatic Game Design via Mechanic GenerationAlexander Zook, Mark O. Riedl
Game designs often center on the game mechanics---rules governing the logical evolution of the game. We seek to develop an intelligent system that generates computer games. As first steps towards this goal we present a composable and cross-domain representation for game mechanics that draws from AI planning action representations. We use a constraint solver to generate mechanics subject to design requirements on the form of those mechanics---what they do in the game. A planner takes a set of generated mechanics and tests whether those mechanics meet playability requirements---controlling how mechanics function in a game to affect player behavior. We demonstrate our system by modeling and generating mechanics in a role-playing game, platformer game, and combined role-playing-platformer game.
AIAug 4, 2019
Automatic Playtesting for Game Parameter Tuning via Active LearningAlexander Zook, Eric Fruchter, Mark O. Riedl
Game designers use human playtesting to gather feedback about game design elements when iteratively improving a game. Playtesting, however, is expensive: human testers must be recruited, playtest results must be aggregated and interpreted, and changes to game designs must be extrapolated from these results. Can automated methods reduce this expense? We show how active learning techniques can formalize and automate a subset of playtesting goals. Specifically, we focus on the low-level parameter tuning required to balance a game once the mechanics have been chosen. Through a case study on a shoot-`em-up game we demonstrate the efficacy of active learning to reduce the amount of playtesting needed to choose the optimal set of game parameters for two classes of (formal) design objectives. This work opens the potential for additional methods to reduce the human burden of performing playtesting for a variety of relevant design concerns.
AIJan 31, 2019
Human-Centered Artificial Intelligence and Machine LearningMark O. Riedl
Humans are increasingly coming into contact with artificial intelligence and machine learning systems. Human-centered artificial intelligence is a perspective on AI and ML that algorithms must be designed with awareness that they are part of a larger system consisting of humans. We lay forth an argument that human-centered artificial intelligence can be broken down into two aspects: (1) AI systems that understand humans from a sociocultural perspective, and (2) AI systems that help humans understand them. We further argue that issues of social responsibility such as fairness, accountability, interpretability, and transparency.
CLSep 27, 2018
Controllable Neural Story Plot Generation via Reward ShapingPradyumna Tambwekar, Murtaza Dhuliawala, Lara J. Martin et al.
Language-modeling--based approaches to story plot generation attempt to construct a plot by sampling from a language model (LM) to predict the next character, word, or sentence to add to the story. LM techniques lack the ability to receive guidance from the user to achieve a specific goal, resulting in stories that don't have a clear sense of progression and lack coherence. We present a reward-shaping technique that analyzes a story corpus and produces intermediate rewards that are backpropagated into a pre-trained LM in order to guide the model towards a given goal. Automated evaluations show our technique can create a model that generates story plots which consistently achieve a specified goal. Human-subject studies show that the generated stories have more plausible event ordering than baseline plot generation techniques.
AIMay 9, 2018
Creative Invention BenchmarkMatthew Guzdial, Nicholas Liao, Vishwa Shah et al.
In this paper we present the Creative Invention Benchmark (CrIB), a 2000-problem benchmark for evaluating a particular facet of computational creativity. Specifically, we address combinational p-creativity, the creativity at play when someone combines existing knowledge to achieve a solution novel to that individual. We present generation strategies for the five problem categories of the benchmark and a set of initial baselines.
LGFeb 10, 2018
Combinets: Creativity via Recombination of Neural NetworksMatthew Guzdial, Mark O. Riedl
One of the defining characteristics of human creativity is the ability to make conceptual leaps, creating something surprising from typical knowledge. In comparison, deep neural networks often struggle to handle cases outside of their training data, which is especially problematic for problems with limited training data. Approaches exist to transfer knowledge from problems with sufficient data to those with insufficient data, but they tend to require additional training or a domain-specific method of transfer. We present a new approach, conceptual expansion, that serves as a general representation for reusing existing trained models to derive new models without backpropagation. We evaluate our approach on few-shot variations of two tasks: image classification and image generation, and outperform standard transfer learning approaches.
AISep 12, 2017
Explore, Exploit or Listen: Combining Human Feedback and Policy Model to Speed up Deep Reinforcement Learning in 3D WorldsZhiyu Lin, Brent Harrison, Aaron Keech et al.
We describe a method to use discrete human feedback to enhance the performance of deep learning agents in virtual three-dimensional environments by extending deep-reinforcement learning to model the confidence and consistency of human feedback. This enables deep reinforcement learning algorithms to determine the most appropriate time to listen to the human feedback, exploit the current policy model, or explore the agent's environment. Managing the trade-off between these three strategies allows DRL agents to be robust to inconsistent or intermittent human feedback. Through experimentation using a synthetic oracle, we show that our technique improves the training speed and overall performance of deep reinforcement learning in navigating three-dimensional environments using Minecraft. We further show that our technique is robust to highly innacurate human feedback and can also operate when no human feedback is given.
AIJul 26, 2017
Guiding Reinforcement Learning Exploration Using Natural LanguageBrent Harrison, Upol Ehsan, Mark O. Riedl
In this work we present a technique to use natural language to help reinforcement learning generalize to unseen environments. This technique uses neural machine translation, specifically the use of encoder-decoder networks, to learn associations between natural language behavior descriptions and state-action information. We then use this learned model to guide agent exploration using a modified version of policy shaping to make it more effective at learning in unseen environments. We evaluate this technique using the popular arcade game, Frogger, under ideal and non-ideal conditions. This evaluation shows that our modified policy shaping algorithm improves over a Q-learning agent as well as a baseline version of policy shaping.
HCJun 11, 2017
A Framework for Exploring and Evaluating Mechanics in Human Computation GamesKristin Siu, Alexander Zook, Mark O. Riedl
Human computation games (HCGs) are a crowdsourcing approach to solving computationally-intractable tasks using games. In this paper, we describe the need for generalizable HCG design knowledge that accommodates the needs of both players and tasks. We propose a formal representation of the mechanics in HCGs, providing a structural breakdown to visualize, compare, and explore the space of HCG mechanics. We present a methodology based on small-scale design experiments using fixed tasks while varying game elements to observe effects on both the player experience and the human computation task completion. Finally we discuss applications of our framework using comparisons of prior HCGs and recent design experiments. Ultimately, we wish to enable easier exploration and development of HCGs, helping these games provide meaningful player experiences while solving difficult problems.