Citation: A Key to Building Responsible and Accountable Large Language Models
The paper tackles IP and ethical issues relevant to LLM developers and users, but its contribution is incremental: it builds on parallels with existing web systems and presents no new empirical results.
This position paper identifies citation as a missing component in large language models (LLMs) and proposes incorporating it to improve transparency and verifiability, thereby addressing intellectual property and ethical challenges.
Large Language Models (LLMs) bring transformative benefits alongside unique challenges, including intellectual property (IP) and ethical concerns. This position paper explores a novel angle to mitigate these risks, drawing parallels between LLMs and established web systems. We identify "citation" (the acknowledgement of, or reference to, a source or evidence) as a crucial yet missing component in LLMs. Incorporating citation could enhance content transparency and verifiability, thereby confronting the IP and ethical issues in the deployment of LLMs. We further propose that a comprehensive citation mechanism for LLMs should account for both non-parametric and parametric content. Despite the complexity of implementing such a citation mechanism, along with the potential pitfalls, we advocate for its development. Building on this foundation, we outline several research problems in this area, aiming to guide future explorations towards building more responsible and accountable LLMs.