Samuel Chaffron

h-index: 2
2 papers

2 Papers

CL, Jun 14, 2023
Building a Corpus for Biomedical Relation Extraction of Species Mentions

Oumaima El Khettari, Solen Quiniou, Samuel Chaffron

We present Species-Species Interaction, a manually annotated corpus for extracting meaningful binary relations between species from biomedical texts at the sentence level, with a focus on the gut microbiota. The corpus leverages PubTator to annotate species in full-text articles, following an evaluation of several Named Entity Recognition (NER) species taggers. Our first results are promising for extracting relations between species using BERT and its biomedical variants.

CL, Jun 10, 2025
Summarization for Generative Relation Extraction in the Microbiome Domain

Oumaima El Khettari, Solen Quiniou, Samuel Chaffron

We explore a generative relation extraction (RE) pipeline tailored to the study of interactions in the intestinal microbiome, a complex and low-resource biomedical domain. Our method leverages summarization with large language models (LLMs) to refine context before extracting relations via instruction-tuned generation. Preliminary results on a dedicated corpus show that summarization improves generative RE performance by reducing noise and guiding the model. However, BERT-based RE approaches still outperform generative models. This ongoing work demonstrates the potential of generative methods to support the study of specialized domains in low-resource settings.