CL | AI | Aug 8, 2025

gpt-oss-120b & gpt-oss-20b Model Card

OpenAI
arXiv:2508.10925v1 | 874 citations | h-index: 39
Originality: Synthesis-oriented
AI Analysis

This work releases open-weight models for broad use in AI research and applications, though it appears incremental because it builds on existing transformer and distillation techniques.

The researchers set out to build open-weight reasoning models that combine high accuracy with low inference cost. The resulting models, gpt-oss-120b and gpt-oss-20b, achieve strong results on benchmarks in mathematics, coding, and safety.

We present gpt-oss-120b and gpt-oss-20b, two open-weight reasoning models that push the frontier of accuracy and inference cost. The models use an efficient mixture-of-experts transformer architecture and are trained using large-scale distillation and reinforcement learning. We optimize the models to have strong agentic capabilities (deep research browsing, Python tool use, and support for developer-provided functions), all while using a rendered chat format that enables clear instruction following and role delineation. Both models achieve strong results on benchmarks spanning mathematics, coding, and safety. We release the model weights, inference implementations, tool environments, and tokenizers under an Apache 2.0 license to enable broad use and further research.

