CL AI · Jul 21, 2022

Language Model Cascades

David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-Dickstein, Kevin Murphy, Charles Sutton

Anthropic, DeepMind

arXiv:2207.10342v2 · 15.4115 citations · h-index: 63 · Has Code

Originality: Synthesis-oriented

AI Analysis

The paper gives researchers and practitioners a theoretical framework for designing and analyzing complex model interactions. The contribution is incremental, since it builds on prior methods.

The paper formalizes language model cascades as probabilistic programs, unifying existing techniques such as chain of thought and tool use. It addresses the challenge of composing models for enhanced capabilities, but does not specify concrete numerical results.

Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate model structures and inference strategies in a unified language. We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use. We refer to the resulting programs as language model cascades.
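To make the "compositions are probabilistic models" framing concrete, here is a minimal sketch of a chain-of-thought cascade with three string-valued random variables: question, thought, and answer. Everything in it is an illustrative assumption: `thought_dist` and `answer_dist` are hypothetical lookup tables standing in for calls to a prompted language model, the probabilities are made up, and none of this is the authors' code or API. It shows ancestral sampling through the cascade and exact marginalization over the latent thought.

```python
import random
from collections import defaultdict

# Hypothetical stand-ins for a prompted language model. Each function
# returns a distribution over strings as a {string: probability} dict.
# The strings and probabilities are invented for illustration.

def thought_dist(question):
    """p(thought | question): a toy distribution over reasoning strings."""
    return {
        "multiply first: 3*4=12, then 2+12=14": 0.7,
        "add first: 2+3=5, then 5*4=20": 0.3,
    }

def answer_dist(question, thought):
    """p(answer | question, thought): the answer depends on the thought."""
    if thought.startswith("multiply"):
        return {"14": 0.9, "20": 0.1}
    return {"14": 0.2, "20": 0.8}

def sample_cascade(question, rng):
    """Ancestral sampling: draw a thought, then an answer given it."""
    thoughts = thought_dist(question)
    thought = rng.choices(list(thoughts), weights=list(thoughts.values()))[0]
    answers = answer_dist(question, thought)
    answer = rng.choices(list(answers), weights=list(answers.values()))[0]
    return thought, answer

def marginal_answer(question):
    """Exact p(answer | question), summing out the latent thought."""
    totals = defaultdict(float)
    for thought, p_thought in thought_dist(question).items():
        for answer, p_answer in answer_dist(question, thought).items():
            totals[answer] += p_thought * p_answer
    return dict(totals)

if __name__ == "__main__":
    question = "What is 2 + 3 * 4?"
    print(sample_cascade(question, random.Random(0)))
    print(marginal_answer(question))
```

With a real model the set of possible thoughts is far too large to enumerate, so the sum in `marginal_answer` would have to be approximated, for example by drawing several samples with `sample_cascade` and tallying the answers.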

