SI May 20
Reddit's Appetite: Predicting User Engagement with Nutritional Content
Gabriela Ozegovic, Thorsten Ruprechter, Denis Helic
Food communities on online platforms enjoy great popularity among social media users. Because food-related content can have far-reaching consequences for users' eating behavior, recent research has studied the factors that drive users' online engagement with food. While most of these studies have focused on the visual aspects of food content on social media, only a few initial studies have explored the impact of nutritional content on user engagement. In this paper, we set out to close this gap and analyze food-related posts on Reddit, focusing on the association between the calories and macronutrients of a meal and engagement levels, particularly the number of comments. To that end, we collect and analyze almost half a million food-related posts and uncover differences in nutritional content between engaging and non-engaging posts. Moreover, we train a series of XGBoost models and evaluate the importance of nutritional content when predicting user engagement and how posts will resonate with the community. We find that nutritional features improve the baseline model's accuracy by almost 5%, with calorie density contributing positively to the prediction of engagement, suggesting that higher nutritional content is associated with higher levels of user engagement in food-related posts. Our results provide valuable insights for the design of more engaging online initiatives aimed at, for example, encouraging healthy eating habits.
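The abstract does not say how calorie density is derived. As a minimal sketch, assuming the general Atwater factors (4 kcal/g for protein and carbohydrates, 9 kcal/g for fat) and a hypothetical function name and signature, such a feature could be computed from a meal's macronutrients like this:

```python
# Sketch only: the paper's actual feature pipeline is not described in the abstract.
# Assumption: energy is approximated with the general Atwater factors.
KCAL_PER_G = {"protein": 4.0, "carbs": 4.0, "fat": 9.0}


def calorie_density(protein_g: float, carbs_g: float, fat_g: float,
                    serving_g: float) -> float:
    """Return approximate kcal per gram of food for one serving."""
    if serving_g <= 0:
        raise ValueError("serving_g must be positive")
    kcal = (KCAL_PER_G["protein"] * protein_g
            + KCAL_PER_G["carbs"] * carbs_g
            + KCAL_PER_G["fat"] * fat_g)
    return kcal / serving_g


if __name__ == "__main__":
    # 10 g protein, 20 g carbs, 5 g fat in a 100 g serving
    print(calorie_density(10, 20, 5, 100))
```

A value like this could then be passed to a classifier alongside the baseline features to check whether it changes predictive accuracy.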
IR Oct 17, 2024
Large Language Models as Narrative-Driven Recommenders
Lukas Eberhard, Thorsten Ruprechter, Denis Helic
Narrative-driven recommenders aim to provide personalized suggestions for user requests expressed in free-form text such as "I want to watch a thriller with a mind-bending story, like Shutter Island." Although large language models (LLMs) have been shown to excel at processing general natural language queries, their effectiveness in handling such recommendation requests remains relatively unexplored. To close this gap, we compare the performance of 38 open- and closed-source LLMs of various sizes, such as Llama 3.2 and GPT-4o, in a movie recommendation setting. To this end, we use a gold-standard, crowdworker-annotated dataset of posts from Reddit's movie suggestion community and employ various prompting strategies, including zero-shot, identity, and few-shot prompting. Our findings demonstrate the ability of LLMs to generate contextually relevant movie recommendations, significantly outperforming other state-of-the-art approaches, such as doc2vec. While we find that closed-source models and models with large parameter counts generally perform best, medium-sized open-source models remain competitive, being only slightly outperformed by their more computationally expensive counterparts. Furthermore, we observe no significant differences across prompting strategies for most models, underscoring the effectiveness of simple approaches such as zero-shot prompting for narrative-driven recommendations. Overall, this work offers valuable insights for recommender system researchers as well as practitioners aiming to integrate LLMs into real-world recommendation tools.
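To illustrate how the three prompting strategies named above differ, here is a minimal sketch of a prompt builder. The wording of the templates, the function name, and the example format are assumptions made for illustration; the abstract does not give the prompts actually used.

```python
from typing import Sequence, Tuple

# Hypothetical persona line for the "identity" strategy (not from the paper).
IDENTITY_PREFIX = "You are a movie expert who gives personalized recommendations."


def build_prompt(request: str,
                 strategy: str = "zero-shot",
                 examples: Sequence[Tuple[str, Sequence[str]]] = (),
                 n: int = 10) -> str:
    """Assemble a recommendation prompt for one free-form user request."""
    if strategy not in {"zero-shot", "identity", "few-shot"}:
        raise ValueError(f"unknown strategy: {strategy}")

    parts = []
    if strategy == "identity":
        # Identity prompting: prepend a persona before the task description.
        parts.append(IDENTITY_PREFIX)
    parts.append(f"Recommend {n} movies for the request below, one title per line.")
    if strategy == "few-shot":
        # Few-shot prompting: show solved request/answer pairs first.
        for ex_request, ex_titles in examples:
            parts.append(f"Request: {ex_request}\nRecommendations:\n"
                         + "\n".join(ex_titles))
    parts.append(f"Request: {request}\nRecommendations:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    req = ("I want to watch a thriller with a mind-bending story, "
           "like Shutter Island.")
    print(build_prompt(req, strategy="identity"))
```

The returned string would be sent to whichever LLM is being evaluated; zero-shot prompting is simply the task description followed by the request.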
CY Feb 9, 2025
NutriTransform: Estimating Nutritional Information From Online Food Posts
Thorsten Ruprechter, Marion Garaus, Ivo Ponocny et al.
Deriving nutritional information from online food posts is challenging, particularly when users do not explicitly log the macro-nutrients of a shared meal. In this work, we present an efficient and straightforward approach to approximating macro-nutrients based solely on the titles of food posts. Our method combines a public food database from the U.S. Department of Agriculture with advanced text embedding techniques. We evaluate the approach on a labeled food dataset, demonstrating its effectiveness, and apply it to over 500,000 real-world posts from Reddit's popular /r/food subreddit to uncover trends in food-sharing behavior based on the estimated macro-nutrient content. Altogether, this work lays a foundation for researchers and practitioners aiming to estimate caloric and nutritional content using only text data.
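The general idea of matching a post title to its most similar food database entry can be sketched as follows. This is not the authors' implementation: a simple bag-of-words vector stands in for the text embedding model, and the small `FOOD_DB` table holds made-up placeholder values rather than real USDA data.

```python
import math
import re
from collections import Counter
from typing import Dict, Optional, Tuple

# Placeholder entries: descriptions and numbers are illustrative, NOT USDA values.
FOOD_DB: Dict[str, Dict[str, float]] = {
    "grilled chicken breast": {"protein_g": 30.0, "carbs_g": 0.0, "fat_g": 4.0},
    "cheese pizza": {"protein_g": 11.0, "carbs_g": 33.0, "fat_g": 10.0},
    "chocolate cake": {"protein_g": 5.0, "carbs_g": 50.0, "fat_g": 15.0},
}


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real text encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def estimate_macros(title: str) -> Optional[Tuple[str, Dict[str, float]]]:
    """Return the closest database entry and its macros, or None if nothing matches."""
    query = embed(title)
    scores = {desc: cosine(query, embed(desc)) for desc in FOOD_DB}
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None
    return best, FOOD_DB[best]


if __name__ == "__main__":
    print(estimate_macros("Homemade cheese pizza with basil"))
```

In a realistic setup, `embed` would be replaced by a learned sentence embedding so that titles without exact word overlap can still be matched.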