CL · Sep 30, 2023

Finding Pragmatic Differences Between Disciplines

arXiv:2310.00204v1726 citations · h-index: 26
Originality: Synthesis-oriented
AI Analysis

This work addresses a gap in scholarly document understanding by focusing on pragmatics. The contribution is incremental: it applies existing methods to new aspects of document analysis.

The study examines how the structure of scholarly documents differs pragmatically across disciplines. Its analysis of 19 disciplines reports within-discipline structural archetypes and variability, and finds that, despite their diversity, disciplines share similar structural patterns.

Scholarly documents have a great degree of variation, both in terms of content (semantics) and structure (pragmatics). Prior work in scholarly document understanding emphasizes semantics through document summarization and corpus topic modeling but tends to omit pragmatics such as document organization and flow. Using a corpus of scholarly documents across 19 disciplines and state-of-the-art language modeling techniques, we learn a fixed set of domain-agnostic descriptors for document sections and "retrofit" the corpus to these descriptors (also referred to as "normalization"). Then, we analyze the position and ordering of these descriptors across documents to understand the relationship between discipline and structure. We report within-discipline structural archetypes, variability, and between-discipline comparisons, supporting the hypothesis that scholarly communities, despite their size, diversity, and breadth, share similar avenues for expressing their work. Our findings lay the foundation for future work in assessing research quality, domain style transfer, and further pragmatic analysis.
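
The normalize-then-analyze pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' method: the paper learns its descriptors with language models, whereas the code below substitutes a hand-written keyword lookup. The descriptor names, the keyword lists, the toy corpus, and the helper names `normalize` and `mean_positions` are all assumptions made for illustration. The sketch maps raw section headings to a fixed descriptor set and then computes each descriptor's mean relative position per discipline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical descriptor set and keyword rules. The paper learns its
# descriptors with language models; this lookup is only a stand-in.
DESCRIPTOR_KEYWORDS = {
    "introduction": ("introduction", "background", "motivation"),
    "methods": ("method", "approach", "materials"),
    "results": ("result", "experiment", "evaluation"),
    "conclusion": ("conclusion", "discussion", "summary"),
}


def normalize(heading):
    """Map a raw section heading to a descriptor, or None if nothing matches."""
    lowered = heading.lower()
    for descriptor, keywords in DESCRIPTOR_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return descriptor
    return None


def mean_positions(corpus):
    """Return {discipline: {descriptor: mean relative position in [0, 1]}}."""
    positions = defaultdict(lambda: defaultdict(list))
    for discipline, headings in corpus:
        last = max(len(headings) - 1, 1)  # avoid dividing by zero
        for index, heading in enumerate(headings):
            descriptor = normalize(heading)
            if descriptor is not None:
                positions[discipline][descriptor].append(index / last)
    return {
        discipline: {d: mean(values) for d, values in by_descriptor.items()}
        for discipline, by_descriptor in positions.items()
    }


# Toy corpus of (discipline, ordered section headings), invented for illustration.
corpus = [
    ("physics", ["Introduction", "Experimental Methods", "Results", "Conclusions"]),
    ("physics", ["Background", "Results", "Methods", "Discussion"]),
    ("linguistics", ["Introduction", "Our Approach", "Evaluation", "Summary"]),
]

profiles = mean_positions(corpus)
```

Here `profiles` holds one positional profile per discipline. Comparing such profiles across disciplines is one simple way to look for shared structural patterns, and the spread of positions within a discipline gives a rough proxy for variability.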

