59.7LOMar 23
Determination of the fifth Busy Beaver valueThe bbchallenge Collaboration, Justin Blanchard, Daniel Briggs et al.
The Busy Beaver value $S(n)$ is the maximum number of steps that an $n$-state 2-symbol Turing machine can perform from the all-zero tape before halting. $S$ was historically introduced by Tibor Radó in 1962 as one of the simplest examples of an uncomputable function. We prove that $S(5) = 47,176,870$ using the Coq proof assistant. The proof enumerates $181,385,789$ Turing machines with 5 states and, for each machine, decides whether it halts or not. Our result marks the first determination of a new Busy Beaver value in over 40 years and the first Busy Beaver value ever to be formally verified, attesting to the effectiveness of massively collaborative online research (bbchallenge$.$org).
LGJul 1, 2018
Accurate Uncertainties for Deep Learning Using Calibrated RegressionVolodymyr Kuleshov, Nathan Fenner, Stefano Ermon
Methods for reasoning under uncertainty are a key building block of accurate and reliable machine learning systems. Bayesian methods provide a general framework to quantify uncertainty. However, because of model misspecification and the use of approximate inference, Bayesian uncertainty estimates are often inaccurate -- for example, a 90% credible interval may not contain the true outcome 90% of the time. Here, we propose a simple procedure for calibrating any regression algorithm; when applied to Bayesian and probabilistic models, it is guaranteed to produce calibrated uncertainty estimates given enough data. Our procedure is inspired by Platt scaling and extends previous work on classification. We evaluate this approach on Bayesian linear regression, feedforward, and recurrent neural networks, and find that it consistently outputs well-calibrated credible intervals while improving performance on time series forecasting and model-based reinforcement learning tasks.