CL · AI · Mar 15, 2023

GPT-4 Technical Report

Berkeley · DeepMind · UW
arXiv:2303.08774v6 · 24999 citations · h-index: 74
Originality: Incremental advance
AI Analysis

This work advances AI capabilities for general-purpose applications, though it builds incrementally on previous large language models.

The researchers developed GPT-4, a large-scale multimodal model that accepts image and text inputs and produces text outputs. It reaches human-level performance on various professional and academic benchmarks, including a simulated bar exam on which it scores around the top 10% of test takers.

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
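
The abstract does not say how performance was predicted from smaller training runs. One common way to make this kind of extrapolation is to fit a power law to results from small-compute models and evaluate it at the target compute. The following is a minimal sketch of that idea under stated assumptions, not the authors' method: the function names `fit_power_law` and `predict_loss`, the functional form `loss = a * compute ** b`, and all data points are hypothetical and chosen only for illustration.

```python
import math


def fit_power_law(compute, loss):
    """Fit loss = a * compute ** b by least squares in log-log space.

    Assumption: a pure power law with no irreducible-loss term.
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(loss) vs log(compute).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** b


# Hypothetical data: compute is expressed as a fraction of the target run,
# so the largest small run uses 1/1,000th of the target compute.
small_compute = [1e-6, 1e-5, 1e-4, 1e-3]
# Synthetic losses generated from an exact power law (made-up constants).
small_loss = [3.0 * c ** -0.05 for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)
predicted = predict_loss(a, b, 1.0)  # extrapolate to the full compute budget
print(f"a={a:.3f}, b={b:.3f}, predicted loss at full compute={predicted:.3f}")
```

Because the synthetic points lie exactly on a power law, the fit recovers the generating constants. Real training results are noisy and may need an additional constant term, so an extrapolation like this is only as good as the assumed functional form.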

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
