CL · Apr 1, 2025

Leveraging Large Language Models for Automated Definition Extraction with TaxoMatic: A Case Study on Media Bias

Timo Spinde, Luyang Lin, Smi Hinterreiter, Isao Echizen

arXiv:2504.00343v1 · 2.72 citations · h-index: 15 · ICWSM

Originality: Synthesis-oriented

AI Analysis

This work automates definition extraction for media bias researchers, but its contribution is incremental: it applies existing LLM methods to a new domain.

The paper tackles automated definition extraction from academic literature by introducing TaxoMatic, a framework using large language models, and demonstrates its effectiveness with Claude-3-sonnet achieving the best results on a dataset of 2,398 manually rated articles in the media bias domain.

This paper introduces TaxoMatic, a framework that leverages large language models to automate definition extraction from academic literature. Focusing on the media bias domain, the framework encompasses data collection, LLM-based relevance classification, and extraction of conceptual definitions. Evaluated on a dataset of 2,398 manually rated articles, the study demonstrates the framework's effectiveness, with Claude-3-sonnet achieving the best results in both relevance classification and definition extraction. Future directions include expanding datasets and applying TaxoMatic to additional domains.
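
The stages named in the abstract (collected articles, LLM-based relevance classification, definition extraction) can be pictured as a simple filter-then-extract loop. The Python sketch below is only an illustration under stated assumptions and is not the authors' implementation: the `Article` type, the prompts, the function names, and the `keyword_stub` stand-in for a real model are all hypothetical, chosen so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

# Hypothetical interface: any function that maps a prompt to a model reply.
LLM = Callable[[str], str]


@dataclass
class Article:
    title: str
    text: str


def is_relevant(article: Article, llm: LLM) -> bool:
    """Stage 2 (assumed prompt): ask the model whether the article is on topic."""
    prompt = (
        "Does the following article discuss media bias? Answer YES or NO.\n\n"
        f"{article.title}\n{article.text}"
    )
    return llm(prompt).strip().upper().startswith("YES")


def extract_definition(article: Article, llm: LLM) -> Optional[str]:
    """Stage 3 (assumed prompt): ask the model for a defining sentence, if any."""
    prompt = (
        "Quote the sentence that defines a media bias concept, or answer NONE.\n\n"
        + article.text
    )
    reply = llm(prompt).strip()
    return None if reply.upper() == "NONE" else reply


def run_pipeline(articles: Iterable[Article], llm: LLM) -> Dict[str, str]:
    """Keep relevant articles only, then collect one definition per title."""
    definitions: Dict[str, str] = {}
    for article in articles:
        if not is_relevant(article, llm):
            continue
        definition = extract_definition(article, llm)
        if definition is not None:
            definitions[article.title] = definition
    return definitions


def keyword_stub(prompt: str) -> str:
    """Toy stand-in for a real model: keyword matching instead of an LLM call."""
    body = prompt.split("\n\n", 1)[1]
    if prompt.startswith("Does"):
        return "YES" if "bias" in body.lower() else "NO"
    for sentence in body.split(". "):
        if " is defined as " in sentence:
            return sentence.strip().rstrip(".") + "."
    return "NONE"


# Stage 1 (data collection) is represented here by a hand-written toy list.
ARTICLES = [
    Article(
        "Framing effects",
        "Framing bias is defined as selective emphasis on aspects of a story. "
        "It shapes perception.",
    ),
    Article("Crop yields in 2020", "Wheat output rose sharply."),
    Article("Bias in headlines", "Headlines vary across outlets."),
]

result = run_pipeline(ARTICLES, keyword_stub)
print(result)
```

In practice `keyword_stub` would be replaced by a call to an actual model; keeping the model behind a plain callable makes it straightforward to compare several models on the same articles, as the study does.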
