CL, Dec 31, 2022

Logic Mill -- A Knowledge Navigation System

arXiv:2301.00200v2, 3 citations, h-index: 49
Originality: Synthesis-oriented
AI Analysis

This system gives researchers in the social sciences and other domains a general-purpose tool for navigating knowledge. The contribution is incremental, however, because it applies existing methods to new data.

The authors address the problem of identifying semantically similar documents across large corpora. They developed Logic Mill, a scalable and openly accessible system that uses NLP and a large pre-trained language model to generate document representations. The system currently contains more than 200 million documents, mainly scientific publications and patents.

Logic Mill is a scalable and openly accessible software system that identifies semantically similar documents within either one domain-specific corpus or multi-domain corpora. It uses advanced Natural Language Processing (NLP) techniques to generate numerical representations of documents. Currently it leverages a large pre-trained language model to generate these document representations. The system focuses on scientific publications and patent documents and contains more than 200 million documents. It is easily accessible via a simple Application Programming Interface (API) or via a web interface. Moreover, it is continuously being updated and can be extended to text corpora from other domains. We see this system as a general-purpose tool for future research applications in the social sciences and other domains.
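The abstract describes finding semantically similar documents by comparing numerical representations. A minimal sketch of that general idea follows, assuming cosine similarity as the comparison measure (the abstract does not name one). The vectors, document identifiers, and function names are hypothetical toy values for illustration; this is not Logic Mill's API or its actual model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_vec, corpus, k=2):
    """Return the k (doc_id, score) pairs with the highest similarity to query_vec."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 4-dimensional "document representations"; real language-model
# embeddings would have hundreds of dimensions.
corpus = {
    "paper_A": [1.0, 0.0, 0.0, 0.0],
    "patent_B": [0.9, 0.1, 0.0, 0.0],
    "paper_C": [0.0, 1.0, 0.0, 0.0],
    "patent_D": [0.0, 0.0, 1.0, 1.0],
}
query = [0.8, 0.2, 0.0, 0.0]

results = most_similar(query, corpus, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Because scientific papers and patents share one vector space in this sketch, a single query can rank documents from both corpora together, which mirrors the multi-domain search the abstract mentions. At the scale of hundreds of millions of documents, an exhaustive scan like this would typically be replaced by an approximate nearest-neighbor index.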

