HC, AI, CL, PL | Oct 2, 2023

Co-audit: tools to help humans double-check AI-generated content

Microsoft
arXiv:2310.01297v1 | 16 citations | h-index: 35
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need for better quality assurance in generative AI applications where errors are consequential, such as spreadsheet computations. The contribution appears incremental, since it builds on existing tool-assisted experiences.

The paper tackles the difficulty users face in auditing complex AI-generated content, such as summaries, tables, or code, for correctness. It introduces co-audit tools that help users double-check such outputs, using spreadsheet computations as a specific example.

Users are increasingly being warned to check AI-generated content for correctness. Still, as LLMs (and other generative models) generate more complex output, such as summaries, tables, or code, it becomes harder for the user to audit or evaluate the output for quality or correctness. Hence, we are seeing the emergence of tool-assisted experiences to help the user double-check a piece of AI-generated content. We refer to these as co-audit tools. Co-audit tools complement prompt engineering techniques: one helps the user construct the input prompt, while the other helps them check the output response. As a specific example, this paper describes recent research on co-audit tools for spreadsheet computations powered by generative models. We explain why co-audit experiences are essential for any application of generative AI where quality is important and errors are consequential (as is common in spreadsheet computations). We propose a preliminary list of principles for co-audit, and outline research challenges.

