LG AI Nov 6, 2025

seqme: a Python library for evaluating biological sequence design

Rasmus Møller-Larsen, Adam Izdebski, Jan Olszewski, Pankhil Gawade, Michal Kmicikiewicz, Wojciech Zarzecki, Ewa Szczurek

arXiv:2511.04239v14.11 citations h-index: 3 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses a gap for researchers in computational biology by providing a modular tool for evaluating sequence design methods, though it is incremental as it consolidates existing metrics into a library.

The authors address the lack of a unified software library for evaluating biological sequence design methods by introducing seqme, a Python library of model-agnostic metrics that assess how faithfully designed sequences match a target distribution and whether they attain desired properties. It applies to various biological sequences, such as proteins and DNA.

Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance in terms of the fidelity of the designed sequences to a target distribution and their attainment of desired properties. However, a single software library implementing these metrics was lacking. In this work, we introduce seqme, a modular and highly extendable open-source Python library containing model-agnostic metrics for evaluating computational methods for biological sequence design. seqme considers three groups of metrics: sequence-based, embedding-based, and property-based, and is applicable to a wide range of biological sequences: small molecules, DNA, ncRNA, mRNA, peptides and proteins. The library offers a number of embedding and property models for biological sequences, as well as diagnostics and visualization functions to inspect the results. seqme can be used to evaluate both one-shot and iterative computational design methods.
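To make the metric groups concrete, the following is a minimal sketch in plain Python of two sequence-based metrics (uniqueness and novelty) and one property-based summary. It does not use or reproduce seqme's API: the function names `uniqueness`, `novelty`, and `mean_property`, and the toy sequences, are assumptions introduced here purely for illustration. Embedding-based metrics are omitted because they require an embedding model.

```python
from typing import Callable, Sequence


def uniqueness(designed: Sequence[str]) -> float:
    """Sequence-based: fraction of designed sequences that are distinct."""
    if not designed:
        raise ValueError("designed must be non-empty")
    return len(set(designed)) / len(designed)


def novelty(designed: Sequence[str], reference: Sequence[str]) -> float:
    """Sequence-based: fraction of designed sequences absent from the reference set."""
    if not designed:
        raise ValueError("designed must be non-empty")
    reference_set = set(reference)
    return sum(seq not in reference_set for seq in designed) / len(designed)


def mean_property(designed: Sequence[str], prop: Callable[[str], float]) -> float:
    """Property-based: average of a user-supplied property over the designed set."""
    if not designed:
        raise ValueError("designed must be non-empty")
    return sum(prop(seq) for seq in designed) / len(designed)


if __name__ == "__main__":
    # Hypothetical toy peptides, not taken from the paper.
    designed = ["ACDE", "ACDE", "KLMN", "WWYY"]
    reference = ["KLMN", "QRST"]

    print("uniqueness:", uniqueness(designed))
    print("novelty:", novelty(designed, reference))
    # Sequence length stands in for a real property model here.
    print("mean length:", mean_property(designed, len))
```

Each function takes only lists of strings, which mirrors the model-agnostic idea in the abstract: the metrics look at the designed sequences, not at the method that produced them.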
