LG | AI | SD | Nov 7, 2025

Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models

arXiv:2511.05171v2 | 2 citations | h-index: 18
Originality: Incremental advance
AI Analysis

This paper addresses a specific problem in bioacoustics for researchers who need flexible AI models, but the advance is incremental: it builds on existing models using a simple merging strategy.

The paper tackles the trade-off between domain-specific fine-tuning and instruction-following flexibility in bioacoustic foundation models. The resulting merged model achieves over a 200% relative improvement in closed-set zero-shot classification of unseen species and sets a new state of the art.

Foundation models capable of generalizing across species and tasks represent a promising new frontier in bioacoustics, with NatureLM being one of the most prominent examples. While its domain-specific fine-tuning yields strong performance on bioacoustic benchmarks, we observe that it also introduces trade-offs in instruction-following flexibility. For instance, NatureLM achieves high accuracy when prompted for either the common or scientific name individually, but its accuracy drops significantly when both are requested in a single prompt. We address this by applying a simple model merging strategy that interpolates NatureLM with its base language model, recovering instruction-following capabilities with minimal loss of domain expertise. Finally, we show that the merged model exhibits markedly stronger zero-shot generalization, achieving over a 200% relative improvement and setting a new state-of-the-art in closed-set zero-shot classification of unseen species.
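The merging strategy described in the abstract can be illustrated with a minimal sketch of linear weight interpolation between a base model and its fine-tuned counterpart. The abstract gives neither the interpolation coefficient nor which parameters are merged, so everything below is an assumption: the function name `interpolate_weights`, the coefficient `alpha`, and the representation of a model as a dictionary of flat parameter lists are hypothetical and not taken from the paper.

```python
def interpolate_weights(base, finetuned, alpha):
    """Linearly interpolate two sets of model weights.

    Each model is a dict mapping a parameter name to a flat list of floats
    (a stand-in for real tensors; this layout is an assumption).
    alpha = 0.0 returns the base weights, alpha = 1.0 the fine-tuned ones.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != finetuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged = {}
    for name, base_values in base.items():
        tuned_values = finetuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy example with made-up weights (not real model parameters).
base_model = {"layer.weight": [0.0, 2.0]}
finetuned_model = {"layer.weight": [1.0, 4.0]}

print(interpolate_weights(base_model, finetuned_model, alpha=0.5))
# {'layer.weight': [0.5, 3.0]}
```

In this sketch, a smaller `alpha` stays closer to the base language model (favoring instruction following), while a larger `alpha` stays closer to the fine-tuned model (favoring domain expertise). Any particular value would have to be chosen empirically.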

