CL · Apr 1, 2025

Leveraging Large Language Models for Automated Definition Extraction with TaxoMatic: A Case Study on Media Bias

arXiv:2504.00343v1 · 1 citation · h-index: 15 · ICWSM
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of automating definition extraction for media bias researchers, but the contribution is incremental: it applies existing LLM methods to a new domain.

The paper tackles automated definition extraction from academic literature by introducing TaxoMatic, a framework using large language models, and demonstrates its effectiveness with Claude-3-sonnet achieving the best results on a dataset of 2,398 manually rated articles in the media bias domain.

This paper introduces TaxoMatic, a framework that leverages large language models to automate definition extraction from academic literature. Focusing on the media bias domain, the framework encompasses data collection, LLM-based relevance classification, and extraction of conceptual definitions. Evaluated on a dataset of 2,398 manually rated articles, the study demonstrates the framework's effectiveness, with Claude-3-sonnet achieving the best results in both relevance classification and definition extraction. Future directions include expanding datasets and applying TaxoMatic to additional domains.
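
The two LLM-driven stages named above (relevance classification, then definition extraction) could be chained roughly as in the sketch below. This is an illustration only, not the paper's implementation: the prompts, the function names (`is_relevant`, `extract_definition`, `run_pipeline`), and the `Article` type are all hypothetical, and `toy_llm` is a keyword-based stand-in so the example runs without any model or API. A real setup would pass in a function that calls an actual model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Article:
    title: str
    text: str


# Hypothetical interface: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def is_relevant(article: Article, llm: LLM) -> bool:
    """Stage 1 (assumed prompt): ask the model for a YES/NO relevance verdict."""
    prompt = (
        "Does the following article define or discuss media bias concepts? "
        "Answer YES or NO.\n\n" + article.text
    )
    return llm(prompt).strip().upper().startswith("YES")


def extract_definition(article: Article, llm: LLM) -> Optional[str]:
    """Stage 2 (assumed prompt): ask the model to quote a definition, or NONE."""
    prompt = (
        "Quote the conceptual definition given in the following article, "
        "or answer NONE.\n\n" + article.text
    )
    answer = llm(prompt).strip()
    if not answer or answer.upper() == "NONE":
        return None
    return answer


def run_pipeline(articles: List[Article], llm: LLM) -> List[Tuple[str, str]]:
    """Classify each article, then extract a definition from the relevant ones."""
    results = []
    for article in articles:
        if not is_relevant(article, llm):
            continue
        definition = extract_definition(article, llm)
        if definition is not None:
            results.append((article.title, definition))
    return results


def toy_llm(prompt: str) -> str:
    """Keyword-based stand-in for a real model, used only to make the sketch runnable."""
    body = prompt.split("\n\n", 1)[1]
    if prompt.startswith("Does"):
        return "YES" if "bias" in body.lower() else "NO"
    # Extraction stand-in: return the first sentence that contains " is ".
    for sentence in body.split("."):
        if " is " in sentence:
            return sentence.strip() + "."
    return "NONE"
```

Keeping the model behind a plain callable means the same pipeline can be run with different models, which is the kind of comparison the summary reports.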

