CL, AI | Nov 3, 2023

Towards Concept-Aware Large Language Models

arXiv:2311.01866v1 | 135 citations | h-index: 26
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of enabling machines to form and reason with concepts, which is crucial for advancing AI cognition. The advance is incremental, however, because the approach builds on existing LLMs.

The paper tackles the problem that large language models (LLMs) operate at the token level rather than capturing human concepts, and it shows that a proof-of-concept method improves alignment with human intuition and prediction robustness.

Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human concepts and their structure. We then discuss ways to develop concept-aware LLMs, taking place at different stages of the pipeline. We sketch a method for pretraining LLMs using concepts, and also explore the simpler approach that uses the output of existing LLMs. Despite its simplicity, our proof-of-concept is shown to better match human intuition, as well as improve the robustness of predictions. These preliminary results underscore the promise of concept-aware LLMs.
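
To make the token-versus-concept distinction concrete, here is a minimal sketch of one way the "output of existing LLMs" could be post-processed: probabilities of tokens that express the same concept are summed, and concepts are then ranked. This is an illustration under stated assumptions, not the paper's actual method. The function name `aggregate_by_concept`, the `token_to_concept` lookup, and all probabilities are hypothetical and made up for the example.

```python
from collections import defaultdict


def aggregate_by_concept(token_probs, token_to_concept):
    """Sum token-level probabilities into concept-level scores.

    token_probs: dict mapping a predicted token to its probability.
    token_to_concept: hypothetical lookup from a normalized token to a
        concept label; tokens without an entry form their own concept.
    """
    concept_probs = defaultdict(float)
    for token, prob in token_probs.items():
        key = token.strip().lower()  # treat "Dog" and "dog" as one surface form
        concept = token_to_concept.get(key, key)
        concept_probs[concept] += prob
    # Rank concepts from most to least probable.
    return sorted(concept_probs.items(), key=lambda kv: kv[1], reverse=True)


# Made-up next-token distribution (illustrative numbers, not model output).
token_probs = {"dog": 0.20, "puppy": 0.15, "Dog": 0.10, "cat": 0.30, "kitten": 0.05}

# Hypothetical concept assignment; in practice this grouping would have to
# come from some external source such as clustering or a lexical resource.
token_to_concept = {"dog": "DOG", "puppy": "DOG", "cat": "CAT", "kitten": "CAT"}

ranked = aggregate_by_concept(token_probs, token_to_concept)
top_token = max(token_probs, key=token_probs.get)
print("top token:", top_token)
print("top concept:", ranked[0][0])
```

In this toy distribution the single most probable token is "cat", yet the most probable concept is DOG, because its probability mass is spread over several surface forms. That gap is the kind of mismatch between token-level and concept-level predictions the abstract points to.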

Code Implementations: 1 repo
