Daniel Tamayo

EP
h-index: 25
7 papers
69 citations
Novelty: 54%
AI Score: 47

7 Papers

CL · Feb 24 · Code
MrBERT: Modern Multilingual Encoders via Vocabulary, Domain, and Dimensional Adaptation

Daniel Tamayo, Iñaki Lacunza, Paula Rivera-Hidalgo et al.

We introduce MrBERT, a family of 150M-300M parameter encoders built on the ModernBERT architecture and pre-trained on 35 languages and code. Through targeted adaptation, this model family achieves state-of-the-art results on Catalan- and Spanish-specific tasks, while establishing robust performance across specialized biomedical and legal domains. To bridge the gap between research and production, we incorporate Matryoshka Representation Learning (MRL), enabling flexible vector sizing that significantly reduces inference and storage costs. Ultimately, the MrBERT family demonstrates that modern encoder architectures can be optimized for both localized linguistic excellence and efficient, high-stakes domain specialization. We open-source the complete model family on Hugging Face.
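To make the "flexible vector sizing" idea concrete, here is a minimal sketch of how Matryoshka-style embeddings are typically consumed: keep only the first few dimensions and re-normalize before comparing. The helper names `truncate_embedding` and `cosine` are hypothetical, and the vectors are made-up toy numbers rather than MrBERT output; how well truncation works for a given model depends on how it was trained.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 8-dimensional "embeddings" (illustrative values, not model output).
doc = [0.9, 0.1, 0.3, -0.2, 0.05, 0.02, -0.01, 0.03]
query = [0.8, 0.2, 0.25, -0.1, 0.01, -0.04, 0.02, 0.0]

# Similarity with all 8 dimensions versus only the first 4.
full = cosine(doc, query)
small = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))
print(f"full={full:.4f} truncated={small:.4f}")
```

Storing the 4-dimensional prefix halves the index size in this toy case; the trade-off is whatever similarity accuracy is lost by discarding the tail dimensions.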

EP · Aug 16, 2024
Accelerating Giant Impact Simulations with Machine Learning

Caleb Lammers, Miles Cranmer, Sam Hadden et al. · cambridge

Constraining planet formation models based on the observed exoplanet population requires generating large samples of synthetic planetary systems, which can be computationally prohibitive. A significant bottleneck is simulating the giant impact phase, during which planetary embryos evolve gravitationally and combine to form planets, which may themselves experience later collisions. To accelerate giant impact simulations, we present a machine learning (ML) approach to predicting collisional outcomes in multiplanet systems. We develop an ML model, trained on more than 500,000 $N$-body simulations of three-planet systems, that can accurately predict which two planets will experience a collision, along with the state of the post-collision planets, from a short integration of the system's initial conditions. Our model greatly improves on non-ML baselines that rely on metrics from dynamics theory, which struggle to accurately predict which pair of planets will experience a collision. By combining it with a model for predicting long-term stability, we create an ML-based giant impact emulator, which can predict the outcomes of giant impact simulations with reasonable accuracy and a speedup of up to four orders of magnitude. We expect our model to enable analyses that would not otherwise be computationally feasible. As such, we release our training code, along with an easy-to-use API for our collision outcome model and giant impact emulator.

CL · Feb 4, 2025 · Code
Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge

Daniel Tamayo, Aitor Gonzalez-Agirre, Javier Hernando et al.

Recent research has explored methods for updating and modifying factual knowledge in large language models, often focusing on specific multi-layer perceptron blocks. This study expands on this work by examining the effectiveness of existing knowledge editing methods across languages and delving into the role of attention mechanisms in this process. Drawing from the insights gained, we propose Mass-Editing Memory with Attention in Transformers (MEMAT), a method that achieves significant improvements in all metrics while requiring minimal parameter modifications. MEMAT delivers a remarkable 10% increase in magnitude metrics, benefits languages not included in the training data and also demonstrates a high degree of portability. Our code and data are at https://github.com/dtamayo-nlp/MEMAT.

CL · Feb 12, 2025 · Code
Salamandra Technical Report

Aitor Gonzalez-Agirre, Marc Pàmies, Joan Llop et al.

This work introduces Salamandra, a suite of open-source decoder-only large language models available in three different sizes: 2, 7, and 40 billion parameters. The models were trained from scratch on highly multilingual data that comprises text in 35 European languages and code. Our carefully curated corpus is made exclusively from open-access data compiled from a wide variety of sources. Along with the base models, supplementary checkpoints that were fine-tuned on public-domain instruction data are also released for chat applications. We also share our preliminary experiments on multimodality, which serve as proof-of-concept to showcase potential applications for the Salamandra family. Our extensive evaluations on multilingual benchmarks reveal that Salamandra has strong capabilities, achieving competitive performance when compared to similarly sized open-source models. We provide comprehensive evaluation results both on standard downstream tasks and on key aspects related to bias and safety. With this technical report, we intend to promote open science by sharing all the details behind our design choices, data curation strategy and evaluation methodology. In addition, we deviate from the usual practice by making our training and evaluation scripts publicly accessible. We release all models under a permissive Apache 2.0 license in order to foster future research and facilitate commercial use, thereby contributing to the open-source ecosystem of large language models.

EP · Jan 11, 2021 · Code
A Bayesian neural network predicts the dissolution of compact planetary systems

Miles Cranmer, Daniel Tamayo, Hanno Rein et al.

Despite over three hundred years of effort, no solutions exist for predicting when a general planetary configuration will become unstable. We introduce a deep learning architecture to push forward this problem for compact systems. While current machine learning algorithms in this area rely on scientist-derived instability metrics, our new technique learns its own metrics from scratch, enabled by a novel internal structure inspired by dynamics theory. Our Bayesian neural network model can accurately predict not only if, but also when a compact planetary system with three or more planets will go unstable. Our model, trained directly from short N-body time series of raw orbital elements, is more than two orders of magnitude more accurate at predicting instability times than analytical estimators, while also reducing the bias of existing machine learning algorithms by nearly a factor of three. Despite being trained on compact resonant and near-resonant three-planet configurations, the model demonstrates robust generalization to both non-resonant and higher multiplicity configurations, in the latter case outperforming models fit to that specific set of integrations. The model computes instability estimates up to five orders of magnitude faster than a numerical integrator, and unlike previous efforts provides confidence intervals on its predictions. Our inference model is publicly available in the SPOCK package, with training code open-sourced.

EP · Jan 25, 2025
SPOCK 2.0: Update to the FeatureClassifier in the Stability of Planetary Orbital Configurations Klassifier

Elio Thadhani, Yolanda Ba, Hanno Rein et al.

The Stability of Planetary Orbital Configurations Klassifier (SPOCK) package collects machine learning models for predicting the stability and collisional evolution of compact planetary systems. In this paper we explore improvements to SPOCK's binary stability classifier (FeatureClassifier), which predicts orbital stability by collecting data over a short N-body integration of a system. We find that by using a system-specific timescale (rather than a fixed $10^4$ orbits) for the integration, and by using this timescale as an additional feature, we modestly improve the model's AUC metric from 0.943 to 0.950 (AUC=1 for a perfect model). We additionally discovered that $\approx 10\%$ of N-body integrations in SPOCK's original training dataset were duplicated by accident, and that $<1\%$ were misclassified as stable when they in fact led to ejections. We provide a cleaned dataset of 100,000+ unique integrations, release a newly trained stability classification model, and make minor updates to the API.
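For readers unfamiliar with the AUC metric quoted above, the following is a minimal sketch of one standard way to compute it: as the fraction of (stable, unstable) pairs that the classifier ranks correctly, with ties counted as half. The function name `roc_auc` and the labels and scores are illustrative assumptions, not SPOCK's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative labels (1 = stable, 0 = unstable) and predicted probabilities.
labels = [1, 1, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.2, 0.1])    # every stable system ranked first
imperfect = roc_auc(labels, [0.9, 0.4, 0.6, 0.1])  # one pair ranked the wrong way
print(perfect, imperfect)
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; rank-based implementations are preferable for datasets of the size described here.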

EP · Jun 2, 2015
WHFast: A fast and unbiased implementation of a symplectic Wisdom-Holman integrator for long term gravitational simulations

Hanno Rein, Daniel Tamayo

We present WHFast, a fast and accurate implementation of a Wisdom-Holman symplectic integrator for long-term orbit integrations of planetary systems. WHFast is significantly faster and conserves energy better than all other Wisdom-Holman integrators tested. We achieve this by significantly improving the Kepler-solver and ensuring numerical stability of coordinate transformations to and from Jacobi coordinates. These refinements allow us to remove the linear secular trend in the energy error that is present in other implementations. For small enough timesteps we achieve Brouwer's law, i.e. the energy error is dominated by an unbiased random walk due to floating-point round-off errors. We implement symplectic correctors up to order eleven that significantly reduce the energy error. We also implement a symplectic tangent map for the variational equations. This allows us to efficiently calculate two widely used chaos indicators: the Lyapunov characteristic number (LCN) and the Mean Exponential Growth factor of Nearby Orbits (MEGNO). WHFast is freely available as a flexible C package, as a shared library, and as an easy-to-use Python module.
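As a rough illustration of why symplectic schemes are favored for long-term orbit integrations, the sketch below integrates a single test particle around a unit-mass star (units with G = 1) using a kick-drift-kick leapfrog and reports the relative energy error. This is a much simpler scheme than a Wisdom-Holman integrator and is not WHFast's algorithm; the function names, initial conditions, and timestep are assumptions chosen for the example.

```python
import math

def acceleration(x, y):
    """Acceleration of a test particle due to a unit point mass at the origin (G = 1)."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def energy(x, y, vx, vy):
    """Specific orbital energy: kinetic plus potential."""
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)

def leapfrog(x, y, vx, vy, dt, steps):
    """Kick-drift-kick leapfrog, a second-order symplectic scheme."""
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax  # half kick
        vy += 0.5 * dt * ay
        x += dt * vx         # full drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax  # half kick
        vy += 0.5 * dt * ay
    return x, y, vx, vy

# Mildly eccentric orbit: start at distance 1 with slightly less than circular speed.
state = (1.0, 0.0, 0.0, 0.9)
e0 = energy(*state)
final = leapfrog(*state, dt=0.01, steps=50_000)
rel_err = abs((energy(*final) - e0) / e0)
print(f"relative energy error after 50,000 steps: {rel_err:.2e}")
```

With a scheme like this the energy error oscillates rather than growing steadily with the number of steps, which is the qualitative behavior that makes symplectic methods attractive for long integrations.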