AIJul 8, 2024
Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie WorksheetsHarshit Joshi, Shicheng Liu, James Chen et al. · stanford
Large Language Models can carry out human-like conversations in diverse settings, responding to user requests for tasks and knowledge. However, existing conversational agents implemented with LLMs often struggle with hallucination, following instructions with conditional logic, and integrating knowledge from different sources. These shortcomings compromise the agents' effectiveness, rendering them unsuitable for deployment. To address these challenges, we introduce Genie, a programmable framework for creating knowledge-intensive task-oriented conversational agents. Genie can handle involved interactions and answer complex queries. Unlike LLMs, it delivers reliable, grounded responses through advanced dialogue state management and supports controllable agent policies via its declarative specification -- Genie Worksheet. This is achieved through an algorithmic runtime system that implements the developer-supplied policy, limiting LLMs to (1) parse user input using a succinct conversational history, and (2) generate responses according to supplied context. Agents built with Genie outperform SOTA methods on complex logic dialogue datasets. We conducted a user study with 62 participants on three real-life applications: restaurant reservations with Yelp, as well as ticket submission and course enrollment for university students. Genie agents with GPT-4 Turbo outperformed the GPT-4 Turbo agents with function calling, improving goal completion rates from 21.8% to 82.8% across three real-world tasks.
67.8NAMar 15
The largest 5th pivot may be the root of a 61st degree polynomialJames Chen, Alan Edelman, John Urschel
This paper introduces a number of new techniques in the study of the famous question from numerical linear algebra: what is the largest possible growth factor when performing Gaussian elimination with complete pivoting? This question is highly complex, due to a complicated set of polynomial inequalities that need to be simultaneously satisfied. This paper introduces the JuMP + Groebner basis + discriminant polynomial approach as well as the use of interval arithmetic computations. Thus, we are introducing a marriage of numerical and exact mathematical computations. In 1988, Day and Peterson performed numerical optimization on $n=5$ with NPSOL and obtained a largest seen value of $4.1325...$. This same best value was reproduced by Gould with LANCELOT in 1991. We ran extensive comparable experiments with the modern software tool JuMP and also saw the same value $4.1325...$. While the combinatorial explosion of possibilities prevents us from knowing whether there may not be a larger maximum, we succeed in obtaining the exact mathematical value: the number $4.1325...$ is exactly the root of a 61st degree polynomial provided in this work, and is a maximum given the equality constraints seen by JuMP. In light of the numerics, we pose the conjecture that this lower bound is indeed the maximum. We also apply this technique to $n = 6$, $7$, and $8$. Furthermore, in 1969, an upper bound of $4\frac{17}{18}\approx 4.94$ was produced for the maximum possible growth for $n = 5$. We slightly lower this upper bound to $4.84$.