Profiling vs. Time vs. Content: What does Matter for Top-k Publication Recommendation based on Twitter Profiles? - An Extended Technical Report
This work addresses the problem of optimizing publication recommendations for users based on sparse social media data. Its contribution is incremental: it examines how specific design factors affect recommendation quality.
The study investigated how profiling method, temporal decay, and content richness affect the performance of a top-k publication recommender system based on Twitter profiles. The best strategy combined CF-IDF, sliding window, and full-texts, while strategies using the novel HCF-IDF method achieved similar results using only titles.
It is so far unclear how the different factors of a scientific publication recommender system based on users' tweets influence recommendation performance. We examine three factors: profiling method, temporal decay, and richness of content. Regarding profiling, we compare CF-IDF, which replaces the terms in TF-IDF with semantic concepts; HCF-IDF, a novel hierarchical variant of CF-IDF; and topic modeling. As temporal decay functions, we apply sliding window and exponential decay. In terms of the richness of content, we compare recommendations using both full-texts and titles of publications with recommendations using only titles. Overall, the three factors yield twelve recommendation strategies (3 profiling methods × 2 decay functions × 2 content settings). We conducted an online experiment with 123 participants and compared the strategies in a within-group design. The best recommendations are achieved by the strategy combining CF-IDF, sliding window, and full-texts. However, the strategies using the novel HCF-IDF profiling method achieve similar results using just the titles of the publications. Therefore, HCF-IDF can make recommendations when only short and sparse data are available.
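
To make the experimental design concrete, the following is a minimal Python sketch of how the twelve strategies arise and how a temporal decay function could weight a CF-IDF-style user profile. It is an illustration under stated assumptions, not the report's implementation: the function names, the 30-day window and half-life, the input layout (tweet age in days plus a list of already-extracted concepts), and the precomputed IDF table are all hypothetical.

```python
from collections import defaultdict
from itertools import product

# The three factors named in the abstract.
PROFILING = ["CF-IDF", "HCF-IDF", "topic modeling"]
DECAY = ["sliding window", "exponential decay"]
CONTENT = ["full-texts and titles", "titles only"]

# Every combination of the three factors is one recommendation strategy.
STRATEGIES = list(product(PROFILING, DECAY, CONTENT))


def sliding_window_weight(age_days, window_days=30.0):
    """Weight 1 inside the window, 0 outside (window length is an assumption)."""
    return 1.0 if age_days <= window_days else 0.0


def exponential_decay_weight(age_days, half_life_days=30.0):
    """Weight halves every half_life_days (half-life is an assumption)."""
    return 0.5 ** (age_days / half_life_days)


def cf_idf_profile(tweets, idf, weight_fn):
    """Build a concept profile: decayed concept frequency times IDF.

    tweets: list of (age_days, [concept, ...]) pairs (hypothetical layout).
    idf: dict mapping concept -> inverse document frequency (assumed given).
    """
    frequency = defaultdict(float)
    for age_days, concepts in tweets:
        weight = weight_fn(age_days)
        for concept in concepts:
            frequency[concept] += weight
    # Concepts without a known IDF value are scored 0 in this sketch.
    return {c: f * idf.get(c, 0.0) for c, f in frequency.items()}


if __name__ == "__main__":
    tweets = [(1, ["recommender system", "twitter"]), (45, ["twitter"])]
    idf = {"recommender system": 2.0, "twitter": 1.0}
    print(len(STRATEGIES))
    print(cf_idf_profile(tweets, idf, sliding_window_weight))
    print(cf_idf_profile(tweets, idf, exponential_decay_weight))
```

In this sketch the sliding window discards the 45-day-old tweet entirely, whereas exponential decay still lets it contribute with a reduced weight. HCF-IDF and topic modeling are listed only as strategy labels here; their scoring is not sketched.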