78.1AIMay 26Code
Laguna M.1/XS.2 Technical ReportJulien Abadji, Marah Abdin, Connor Adams et al.
We present Laguna M.1 and Laguna XS.2, two Mixture-of-Experts foundation models built for long-horizon, agentic coding: M.1 has $225.8$B total parameters ($23.4$B activated per token) and XS.2 has $33.4$B total ($3$B activated). Both models were trained from scratch end-to-end inside the same internal system that we refer to as our Model Factory: a tightly-integrated stack of versioned data, training, evaluation, and inference components that turn model development into an industrial process. We describe the principles and design choices of the Model Factory and also detail the end-to-end training process of our models, throughout pre-training data and architecture, post-training stages, evaluation, and quantization. On agentic software engineering and terminal benchmarks (SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro, and Terminal-Bench 2.0) M.1 and XS.2 are competitive with state-of-the-art open models in their respective weight classes. Laguna XS.2 weights are released under Apache~2.0 at https://huggingface.co/collections/poolside/laguna-xs2.
SEFeb 17, 2020
The Probabilistic Model Checker StormChristian Hensel, Sebastian Junges, Joost-Pieter Katoen et al.
We present the probabilistic model checker Storm. Storm supports the analysis of discrete- and continuous-time variants of both Markov chains and Markov decision processes. Storm has three major distinguishing features. It supports multiple input languages for Markov models, including the JANI and PRISM modeling languages, dynamic fault trees, generalized stochastic Petri nets, and the probabilistic guarded command language. It has a modular set-up in which solvers and symbolic engines can easily be exchanged. Its Python API allows for rapid prototyping by encapsulating Storm's fast and scalable algorithms. This paper reports on the main features of Storm and explains how to effectively use them. A description is provided of the main distinguishing functionalities of Storm. Finally, an empirical evaluation of different configurations of Storm on the QComp 2019 benchmark set is presented.
SEApr 28, 2019
Counterexample-Driven Synthesis for Probabilistic Program SketchesMilan Češka, Christian Hensel, Sebastian Junges et al.
Probabilistic programs are key to deal with uncertainty in e.g. controller synthesis. They are typically small but intricate. Their development is complex and error prone requiring quantitative reasoning over a myriad of alternative designs. To mitigate this complexity, we adopt counterexample-guided inductive synthesis (CEGIS) to automatically synthesise finite-state probabilistic programs. Our approach leverages efficient model checking, modern SMT solving, and counterexample generation at program level. Experiments on practically relevant case studies show that design spaces with millions of candidate designs can be fully explored using a few thousand verification queries.