GN · LG · Mar 12, 2025

Terrier: A Deep Learning Repeat Classifier

arXiv:2503.09312v2 · 5 citations · h-index: 18 · Brief Bioinform
Originality: Incremental advance
AI Analysis

This work addresses the limited classification accuracy and reproducibility of repeat annotation, which hampers studies of repeat evolution and function. The advance is incremental, since it builds on existing deep learning approaches.

The authors developed Terrier, a deep learning model for classifying repetitive DNA sequences. It maps 97.1% of Repbase sequences to RepeatMasker categories and achieved higher accuracy than DeepTE, TERL, and TEclass2 in model organisms, with further validation in non-model species.

Repetitive DNA sequences underpin genome architecture and evolutionary processes, yet they remain challenging to classify accurately. Poor representation of taxa within repeat databases often limits the classification accuracy and reproducibility of current repeat annotation methods, which in turn limits our understanding of repeat evolution and function. Terrier is a deep learning model designed to overcome these challenges: it classifies repetitive DNA sequences under the RepeatMasker schema, using a publicly available, curated repeat sequence library for training. Trained on Repbase, which includes over 100,000 repeat families (four times more than Dfam), Terrier maps 97.1% of Repbase sequences to RepeatMasker categories, offering the most comprehensive classification system available. When benchmarked against DeepTE, TERL, and TEclass2 in model organisms (rice, fruit flies, humans, and mice), Terrier achieved superior accuracy while classifying a broader range of sequences. Further validation in non-model amphibian, flatworm, and Northern krill genomes highlights its effectiveness in improving classification in non-model species, facilitating research on repeat-driven evolution, genomic instability, and phenotypic variation.

Code Implementations: 1 repo
