63.3LGMay 20
torchtune: PyTorch native post-training libraryMark Obozov, Maxime Griot, Joseph Cummings et al.
Modern LLMs typically require multistage training pipelines to achieve strong downstream performance, with post-training serving as the main interface for adapting open-weight models. We introduce torchtune, a PyTorch-native library designed to streamline the post-training lifecycle of LLMs, enabling efficient fine-tuning, experimentation, and deployment-oriented workflows. Unlike many existing fine-tuning frameworks, which often optimize for ease of use, specialized recipes, or hardware efficiency at the cost of transparency and extensibility, torchtune emphasizes modularity, hackability, and direct access to the underlying PyTorch components. In this paper, we present the design principles behind torchtune, describe how they are reflected in its model builders, training recipes, and distributed training stack, and evaluate the library across representative post-training settings. We compare against popular fine-tuning frameworks, including Axolotl and Unsloth, and show that torchtune provides strong performance and memory efficiency across many settings while remaining flexible enough for rapid research iteration. These results position torchtune as a practical foundation for reproducible LLMs post-training research.
IVJun 4, 2025
Conformal coronary calcification volume estimation with conditional coverage via histogram clusteringOlivier Jaubert, Salman Mohammadi, Keith A. Goatman et al.
Incidental detection and quantification of coronary calcium in CT scans could lead to the early introduction of lifesaving clinical interventions. However, over-reporting could negatively affect patient wellbeing and unnecessarily burden the medical system. Therefore, careful considerations should be taken when automatically reporting coronary calcium scores. A cluster-based conditional conformal prediction framework is proposed to provide score intervals with calibrated coverage from trained segmentation networks without retraining. The proposed method was tuned and used to calibrate predictive intervals for 3D UNet models (deterministic, MCDropout and deep ensemble) reaching similar coverage with better triage metrics compared to conventional conformal prediction. Meaningful predictive intervals of calcium scores could help triage patients according to the confidence of their risk category prediction.
LGDec 14, 2020
Odd-One-Out Representation LearningSalman Mohammadi, Anders Kirk Uhrenholt, Bjørn Sand Jensen
The effective application of representation learning to real-world problems requires both techniques for learning useful representations, and also robust ways to evaluate properties of representations. Recent work in disentangled representation learning has shown that unsupervised representation learning approaches rely on fully supervised disentanglement metrics, which assume access to labels for ground-truth factors of variation. In many real-world cases ground-truth factors are expensive to collect, or difficult to model, such as for perception. Here we empirically show that a weakly-supervised downstream task based on odd-one-out observations is suitable for model selection by observing high correlation on a difficult downstream abstract visual reasoning task. We also show that a bespoke metric-learning VAE model which performs highly on this task also out-performs other standard unsupervised and a weakly-supervised disentanglement model across several metrics.