GPTKB v1.5: A Massive Knowledge Base for Exploring Factual LLM Knowledge
GPTKB v1.5 gives researchers a tool for systematically analyzing LLM knowledge, though the contribution is incremental because it builds on the existing GPTKB methodology.
The paper tackles the problem of understanding and accessing factual knowledge in language models by introducing GPTKB v1.5, a 100-million-triple knowledge base built from GPT-4.1 for $14,000, enabling exploration, querying, and comparative analysis of LLM knowledge.
Language models are powerful tools, yet their factual knowledge is still poorly understood and inaccessible to ad-hoc browsing and scalable statistical analysis. This demonstration introduces GPTKB v1.5, a densely interlinked 100-million-triple knowledge base (KB) built for $14,000 from GPT-4.1, using the GPTKB methodology for massive-recursive LLM knowledge materialization (Hu et al., ACL 2025). The demonstration experience focuses on three use cases: (1) link-traversal-based LLM knowledge exploration, (2) SPARQL-based structured LLM knowledge querying, and (3) comparative exploration of the strengths and weaknesses of LLM knowledge. Massive-recursive LLM knowledge materialization is a groundbreaking opportunity both for the research area of systematic analysis of LLM knowledge and for automated KB construction. The GPTKB demonstrator is accessible at https://gptkb.org.
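To make the idea of recursive knowledge materialization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a breadth-first expansion in which each elicited object becomes a new subject to query. The function `elicit_triples`, the `TOY_MODEL` dictionary, and the `max_entities` cap are hypothetical stand-ins for a real LLM call and its budget; the actual pipeline (Hu et al., ACL 2025) presumably adds steps such as entity recognition and deduplication that are omitted here.

```python
from collections import deque

# Hypothetical stand-in for an LLM: maps a subject to the triples
# the "model" would return when asked about it.
TOY_MODEL = {
    "Ada Lovelace": [
        ("Ada Lovelace", "fieldOfWork", "Mathematics"),
        ("Ada Lovelace", "collaboratedWith", "Charles Babbage"),
    ],
    "Charles Babbage": [
        ("Charles Babbage", "designed", "Analytical Engine"),
    ],
}


def elicit_triples(subject):
    """Return (subject, predicate, object) triples for a subject (toy lookup)."""
    return TOY_MODEL.get(subject, [])


def materialize(seed, max_entities=1000):
    """Breadth-first expansion from a seed entity.

    Simplifying assumption: every object is treated as an entity that can
    be expanded in turn; a real system would filter literals first.
    """
    triples = []
    seen = {seed}
    queue = deque([seed])
    expanded = 0
    while queue and expanded < max_entities:
        subject = queue.popleft()
        expanded += 1
        for s, p, o in elicit_triples(subject):
            triples.append((s, p, o))
            if o not in seen:  # enqueue each newly discovered entity once
                seen.add(o)
                queue.append(o)
    return triples


kb = materialize("Ada Lovelace")
for triple in kb:
    print(triple)
```

The `seen` set keeps the crawl from revisiting entities, which matters in a densely interlinked graph, and the cap on expanded entities bounds the number of (here simulated) model calls.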