Science Concierge: A fast content-based recommendation system for scientific publications
This work addresses the challenge scientists face in efficiently exploring an exponentially growing body of scholarly material, though its contribution is incremental because it applies existing recommendation techniques to a new domain.
The authors tackled the problem of finding relevant scientific publications by developing a content-based recommendation algorithm and Python library, which significantly outperformed keyword-based suggestions when tested on 15K neuroscience posters.
Finding relevant publications is important for scientists, who have to cope with an exponentially increasing volume of scholarly material. Algorithms can help with this task, as they do for music, movie, and product recommendations. However, we know little about how these algorithms perform on scholarly material. Here, we develop an algorithm, and an accompanying Python library, that implements a recommendation system based on the content of articles. The design principles are to adapt to new content, provide near-real-time suggestions, and be open source. We tested the library on 15K posters from the Society for Neuroscience Conference 2015. Human-curated topics were used to cross-validate the parameters of the algorithm and to produce a similarity metric that maximally correlates with human judgments. We show that our algorithm significantly outperforms suggestions based on keywords. The work presented here promises to make the exploration of scholarly material faster and more accurate.
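To make the idea of content-based recommendation concrete, the following is a minimal sketch, not the library's actual API. The abstract does not state which text representation or similarity measure is used, so this sketch assumes TF-IDF term weights and cosine similarity. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`) and the four example poster titles are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency (an assumed weighting choice).
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query_index, vectors, top_k=3):
    """Rank all other documents by similarity to the query document."""
    query = vectors[query_index]
    scored = [
        (i, cosine(query, vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical poster titles, used only to exercise the sketch.
posters = [
    "Dopamine neurons encode reward prediction error in the striatum",
    "Reward prediction error signals in dopamine neurons of the midbrain",
    "Retinal ganglion cells and visual motion processing",
    "Motion processing in primate visual cortex",
]
vectors = tfidf_vectors(posters)
for index, score in recommend(0, vectors, top_k=2):
    print(f"{score:.3f}  {posters[index]}")
```

In this toy setting the second dopamine poster ranks first for the first poster because the two share most of their terms. A production system would likely add stop-word removal, dimensionality reduction, and an approximate nearest-neighbor index to meet the near-real-time goal, but those choices are outside what the abstract specifies.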