CRNov 16, 2023Code
Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment FrameworkMatthew Pisano, Peter Ly, Abraham Sanders et al.
Research into AI alignment has grown considerably since the recent introduction of increasingly capable Large Language Models (LLMs). Unfortunately, modern methods of alignment still fail to fully prevent harmful responses when models are deliberately attacked. Such vulnerabilities can lead to LLMs being manipulated into generating hazardous content: from instructions for creating dangerous materials to inciting violence or endorsing unethical behaviors. To help mitigate this issue, we introduce Bergeron: a framework designed to improve the robustness of LLMs against attacks without any additional parameter fine-tuning. Bergeron is organized into two tiers; with a secondary LLM acting as a guardian to the primary LLM. This framework better safeguards the primary model against incoming attacks while monitoring its output for any harmful content. Empirical analysis reviews that by using Bergeron to complement models with existing alignment training, we can significantly improve the robustness and safety of multiple, commonly used commercial and open-source LLMs. Specifically, we found that models integrated with Bergeron are, on average, nearly seven times more resistant to attacks compared to models without such support.
CLMay 7, 2022
Towards a Progression-Aware Autonomous Dialogue AgentAbraham Sanders, Tomek Strzalkowski, Mei Si et al.
Recent advances in large-scale language modeling and generation have enabled the creation of dialogue agents that exhibit human-like responses in a wide range of conversational scenarios spanning a diverse set of tasks, from general chit-chat to focused goal-oriented discourse. While these agents excel at generating high-quality responses that are relevant to prior context, they suffer from a lack of awareness of the overall direction in which the conversation is headed, and the likelihood of task success inherent therein. Thus, we propose a framework in which dialogue agents can evaluate the progression of a conversation toward or away from desired outcomes, and use this signal to inform planning for subsequent responses. Our framework is composed of three key elements: (1) the notion of a "global" dialogue state (GDS) space, (2) a task-specific progression function (PF) computed in terms of a conversation's trajectory through this space, and (3) a planning mechanism based on dialogue rollouts by which an agent may use progression signals to select its next response.
CLJun 22, 2023
Prompt to GPT-3: Step-by-Step Thinking Instructions for Humor GenerationYuetian Chen, Bowen Shi, Mei Si
Artificial intelligence has made significant progress in natural language processing, with models like GPT-3 demonstrating impressive capabilities. However, these models still have limitations when it comes to complex tasks that require an understanding of the user, such as mastering human comedy writing strategies. This paper explores humor generation using GPT-3 by modeling human comedy writing theory and leveraging step-by-step thinking instructions. In addition, we explore the role of cognitive distance in creating humor.
LGAug 18, 2022
A Review of Uncertainty for Deep Reinforcement LearningOwen Lockwood, Mei Si
Uncertainty is ubiquitous in games, both in the agents playing games and often in the games themselves. Working with uncertainty is therefore an important component of successful deep reinforcement learning agents. While there has been substantial effort and progress in understanding and working with uncertainty for supervised learning, the body of literature for uncertainty aware deep reinforcement learning is less developed. While many of the same problems regarding uncertainty in neural networks for supervised learning remain for reinforcement learning, there are additional sources of uncertainty due to the nature of an interactable environment. In this work, we provide an overview motivating and presenting existing techniques in uncertainty aware deep reinforcement learning. These works show empirical benefits on a variety of reinforcement learning tasks. This work serves to help to centralize the disparate results and promote future research in this area.
AIJan 7, 2023
Visual Story Generation Based on Emotion and KeywordsYuetian Chen, Ruohua Li, Bowen Shi et al.
Automated visual story generation aims to produce stories with corresponding illustrations that exhibit coherence, progression, and adherence to characters' emotional development. This work proposes a story generation pipeline to co-create visual stories with the users. The pipeline allows the user to control events and emotions on the generated content. The pipeline includes two parts: narrative and image generation. For narrative generation, the system generates the next sentence using user-specified keywords and emotion labels. For image generation, diffusion models are used to create a visually appealing image corresponding to each generated sentence. Further, object recognition is applied to the generated images to allow objects in these images to be mentioned in future story development.
LGNov 25, 2023
Enhancing Sentiment Analysis Results through Outlier Detection OptimizationYuetian Chen, Mei Si
When dealing with text data containing subjective labels like speaker emotions, inaccuracies or discrepancies among labelers are not uncommon. Such discrepancies can significantly affect the performance of machine learning algorithms. This study investigates the potential of identifying and addressing outliers in text data with subjective labels, aiming to enhance classification outcomes. We utilized the Deep SVDD algorithm, a one-class classification method, to detect outliers in nine text-based emotion and sentiment analysis datasets. By employing both a small-sized language model (DistilBERT base model with 66 million parameters) and non-deep learning machine learning algorithms (decision tree, KNN, Logistic Regression, and LDA) as the classifier, our findings suggest that the removal of outliers can lead to enhanced results in most cases. Additionally, as outliers in such datasets are not necessarily unlearnable, we experienced utilizing a large language model -- DeBERTa v3 large with 131 million parameters, which can capture very complex patterns in data. We continued to observe performance enhancements across multiple datasets.
AIJan 21, 2022
A System for Image Understanding using Sensemaking and NarrativeZev Battad, Mei Si
Sensemaking and narrative are two inherently interconnected concepts about how people understand the world around them. Sensemaking is the process by which people structure and interconnect the information they encounter in the world with the knowledge and inferences they have made in the past. Narratives are important constructs that people use sensemaking to create; ones that reflect provide a more holistic account of the world than the information within any given narrative is able to alone. Both are important to how human beings parse the world, and both would be valuable for a computational system attempting to do the same. In this paper, we discuss theories of sensemaking and narrative with respect to how people build an understanding of the world based on the information they encounter, as well as the links between the fields of sensemaking and narrative research. We highlight a specific computational task, visual storytelling, whose solutions we believe can be enhanced by employing a sensemaking and narrative component. We then describe our system for visual storytelling using sensemaking and narrative and discuss examples from its current implementation.
QUANT-PHAug 15, 2020
Reinforcement Learning with Quantum Variational CircuitsOwen Lockwood, Mei Si
The development of quantum computational techniques has advanced greatly in recent years, parallel to the advancements in techniques for deep reinforcement learning. This work explores the potential for quantum computing to facilitate reinforcement learning problems. Quantum computing approaches offer important potential improvements in time and space complexity over traditional algorithms because of its ability to exploit the quantum phenomena of superposition and entanglement. Specifically, we investigate the use of quantum variational circuits, a form of quantum machine learning. We present our techniques for encoding classical data for a quantum variational circuit, we further explore pure and hybrid quantum algorithms for DQN and Double DQN. Our results indicate both hybrid and pure quantum variational circuit have the ability to solve reinforcement learning tasks with a smaller parameter space. These comparison are conducted with two OpenAI Gym environments: CartPole and Blackjack, The success of this work is indicative of a strong future relationship between quantum machine learning and deep reinforcement learning.