Anirban Sen

AI
h-index: 3
3 papers
Novelty: 20%
AI Score: 33

3 Papers

SI · Apr 22
MediaGraph: A Network Theoretic Framework to Analyze Reporting Preferences in Indian News Media

Aditya Bali, Rupsha, Vidur Kaushik et al.

We present MediaGraph, a network-theoretic framework for analyzing reporting preferences in news media through entity co-occurrence networks. Using articles from four Indian news sources, two mainstream (The Times of India and The Indian Express) and two fringe outlets (dna and firstpost), we construct source-specific co-occurrence networks around the 2020-21 and 2024 Farmers Protests. We analyze these networks along three network-theoretic axes: centrality, community structure, and co-occurrence link predictability. Link predictability is a novel metric, proposed here, that quantifies the consistency of entity associations over time using a GraphSAGE-based model. Our results reveal significant differences in reporting preferences across sources for the same event, and a consistent under-representation of farmer leaders across sources. By shifting the focus from textual signals to relational structures, our approach offers a scalable, label-independent perspective on media analysis and introduces link predictability as a complementary measure of reporting behavior.
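
To make the idea of an entity co-occurrence network concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that entities have already been extracted from each article, that two entities are linked whenever they appear in the same article, and that plain degree centrality stands in for the centrality axis. The entity names and the helper functions `build_cooccurrence` and `degree_centrality` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(articles):
    """Count how often each unordered entity pair appears in the same article."""
    edges = Counter()
    for entities in articles:
        # Deduplicate and sort so each pair has one canonical key per article.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Fraction of the other nodes each entity is linked to (edge weights ignored)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy input: one list of extracted entity names per article.
articles = [
    ["Government", "Farmer Union", "Supreme Court"],
    ["Government", "Farmer Union"],
    ["Government", "Opposition"],
]

edges = build_cooccurrence(articles)
centrality = degree_centrality(edges)
```

Building one such network per news source and comparing the resulting centrality rankings is one simple way to see which entities a source foregrounds. The community-structure and GraphSAGE-based link-predictability analyses described in the abstract are not covered by this sketch.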

CL · Feb 25
Small Wins Big: Comparing Large Language Models and Domain Fine-Tuned Models for Sarcasm Detection in Code-Mixed Hinglish Text

Bitan Majumder, Anirban Sen

Sarcasm detection in multilingual and code-mixed environments remains a challenging task for natural language processing models due to structural variations, informal expressions, and low-resource linguistic availability. This study compares four large language models, Llama 3.1, Mistral, Gemma 3, and Phi-4, with a fine-tuned DistilBERT model for sarcasm detection in code-mixed Hinglish text. The results indicate that the smaller, sequentially fine-tuned DistilBERT model achieved the highest overall accuracy of 84%, outperforming all of the LLMs in zero-shot and few-shot setups, while using only a minimal amount of LLM-generated code-mixed data for fine-tuning. These findings indicate that domain-adaptive fine-tuning of smaller transformer-based models may significantly improve sarcasm detection over general LLM inference in low-resource and data-scarce settings.

AI · Aug 22, 2025
Extending FKG.in: Towards a Food Claim Traceability Network

Saransh Kumar Gupta, Rizwan Gulzar Mir, Lipika Dey et al.

The global food landscape is rife with scientific, cultural, and commercial claims about what foods are, what they do, and what they should not do. These range from rigorously studied health benefits (probiotics improve gut health) and misrepresentations (soaked almonds make one smarter) to vague promises (superfoods boost immunity) and culturally rooted beliefs (cold foods cause coughs). Despite their widespread influence, the infrastructure for tracing, verifying, and contextualizing these claims remains fragmented and underdeveloped. In this paper, we propose a Food Claim-Traceability Network (FCN) as an extension of FKG[.]in, a knowledge graph of Indian food that we have been incrementally building. We also present the ontology design and the semi-automated knowledge curation workflow that we used to develop a proof of concept of FKG[.]in-FCN using Reddit data and Large Language Models. FCN integrates curated data inputs, structured schemas, and provenance-aware pipelines for food-related claim extraction and validation. While directly linked to the Indian food knowledge graph as an application, our methodology remains application-agnostic and adaptable to other geographic, culinary, or regulatory settings. By modeling food claims and their traceability in a structured, verifiable, and explainable way, we aim to contribute to more transparent and accountable food knowledge ecosystems, supporting researchers, policymakers, and most importantly, everyday consumers in navigating a world saturated with dietary assertions.