CLDec 19, 2025
OpenAI GPT-5 System CardAaditya Singh, Adam Fry, Adam Perelman et al. · berkeley, mila
This is the system card published alongside the OpenAI GPT-5 launch, August 2025. GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent (for example, if you say 'think hard about this' in the prompt). The router is continuously trained on real signals, including when users switch models, preference rates for responses, and measured correctness, improving over time. Once usage limits are reached, a mini version of each model handles remaining queries. This system card focuses primarily on gpt-5-thinking and gpt-5-main, while evaluations for other models are available in the appendix. The GPT-5 system not only outperforms previous models on benchmarks and answers questions more quickly, but -- more importantly -- is more useful for real-world queries. We've made significant advances in reducing hallucinations, improving instruction following, and minimizing sycophancy, and have leveled up GPT-5's performance in three of ChatGPT's most common uses: writing, coding, and health. All of the GPT-5 models additionally feature safe-completions, our latest approach to safety training to prevent disallowed content. Similarly to ChatGPT agent, we have decided to treat gpt-5-thinking as High capability in the Biological and Chemical domain under our Preparedness Framework, activating the associated safeguards. While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm -- our defined threshold for High capability -- we have chosen to take a precautionary approach.
IVOct 14, 2022
Data-Limited Tissue Segmentation using Inpainting-Based Self-Supervised LearningJeffrey Dominic, Nandita Bhaskhar, Arjun D. Desai et al.
Although supervised learning has enabled high performance for image segmentation, it requires a large amount of labeled training data, which can be difficult to obtain in the medical imaging field. Self-supervised learning (SSL) methods involving pretext tasks have shown promise in overcoming this requirement by first pretraining models using unlabeled data. In this work, we evaluate the efficacy of two SSL methods (inpainting-based pretext tasks of context prediction and context restoration) for CT and MRI image segmentation in label-limited scenarios, and investigate the effect of implementation design choices for SSL on downstream segmentation performance. We demonstrate that optimally trained and easy-to-implement inpainting-based SSL segmentation models can outperform classically supervised methods for MRI and CT tissue segmentation in label-limited scenarios, for both clinically-relevant metrics and the traditional Dice score.
CRSep 13, 2019
Toward Efficient Evaluation of Logic Encryption Schemes: Models and MetricsYinghua Hu, Vivek V. Menon, Andrew Schmidt et al.
Research in logic encryption over the last decade has resulted in various techniques to prevent different security threats such as Trojan insertion, intellectual property leakage, and reverse engineering. However, there is little agreement on a uniform set of metrics and models to efficiently assess the achieved security level and the trade-offs between security and overhead. This paper addresses the above challenges by relying on a general logic encryption model that can encompass all the existing techniques, and a uniform set of metrics that can capture multiple, possibly conflicting, security concerns. We apply our modeling approach to four state-of-the-art encryption techniques, showing that it enables fast and accurate evaluation of design trade-offs, average prediction errors that are at least 2X smaller than previous approaches, and the evaluation of compound encryption methods.