IR, SI, Mar 2

Why They Link: An Intent Taxonomy for Including Hyperlinks in Social Posts

arXiv:2601.17601
Originality: Incremental advance
AI Analysis

For researchers and practitioners in social media analysis and information retrieval, this taxonomy provides a structured way to interpret hyperlink intent, enabling more accurate retrieval and recommendation.

The authors develop a reader-centered taxonomy of 6 top-level and 26 fine-grained intent categories for hyperlinks in social posts, built from crowdsourced annotations and refined with LLM assistance. They annotate 1,000 posts, find that advertising, arguing, and sharing are the most prevalent intentions, and show that incorporating intent as a feature improves microblog retrieval.

URLs serve as bridges between social media platforms and the broader web, linking user-generated content to external information resources. On Twitter (X), approximately one in five tweets contains at least one URL, underscoring their central role in information dissemination. While prior studies have examined the motivations of authors who share URLs, such author-centered intentions are difficult to observe in practice. To enable broader downstream use, this work investigates reader-centered interpretations, i.e., how users perceive the intentions behind hyperlinks included in posts. We develop an intent taxonomy for including hyperlinks in social posts through a hybrid approach that begins with a bottom-up, data-driven process using large-scale crowdsourced annotations and is then refined with large language model (LLM) assistance to generate descriptive category names and precise definitions. The final taxonomy comprises 6 top-level categories and 26 fine-grained intention classes, capturing diverse communicative purposes. Applying this taxonomy, we annotate and analyze 1,000 user posts, revealing that advertising, arguing, and sharing are the most prevalent intentions. We further compare our taxonomy with existing taxonomies and demonstrate its utility in a microblog retrieval task by incorporating intent as an additional feature. Overall, our taxonomy provides a foundation for intent-aware information retrieval and NLP applications, enabling more accurate retrieval, recommendation, and interpretation of social media content.
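The abstract does not say how intent enters the retrieval model, so the following is only a minimal sketch of one way "intent as an additional feature" could look: a toy lexical score plus a bonus when a post's intent label matches the intent the searcher wants. Everything here is an assumption for illustration, including the `Post` class, the scoring functions, the `intent_weight` parameter, and the use of the three intent names mentioned above as labels; none of it is the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    intent: str  # hypothetical label, e.g. "advertising", "arguing", "sharing"


def lexical_score(query: str, text: str) -> float:
    """Toy relevance signal: fraction of query terms that appear in the post."""
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & t_terms) / len(q_terms)


def score(query: str, post: Post, wanted_intent: str, intent_weight: float = 0.5) -> float:
    """Combine the lexical signal with a binary intent-match feature (assumed form)."""
    intent_match = 1.0 if post.intent == wanted_intent else 0.0
    return lexical_score(query, post.text) + intent_weight * intent_match


def rank(query: str, posts: list[Post], wanted_intent: str, intent_weight: float = 0.5) -> list[Post]:
    """Return posts sorted by combined score, highest first (stable for ties)."""
    return sorted(
        posts,
        key=lambda p: score(query, p, wanted_intent, intent_weight),
        reverse=True,
    )


if __name__ == "__main__":
    posts = [
        Post("new phone sale today", "advertising"),
        Post("new phone review link", "sharing"),
    ]
    for p in rank("new phone", posts, wanted_intent="sharing"):
        print(p.intent, "|", p.text)
```

With `intent_weight=0` the ranking falls back to the lexical score alone, which gives a simple baseline to compare against when checking whether the intent feature changes the ordering.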

